2016-03-03
|
10:21 |
<_joe_> |
rolling restart of strongswan on eqiad failing servers |
[production] |
10:17 |
<_joe_> |
restarted strongswan on mc1011 |
[production] |
10:07 |
<gehel> |
elastic1022.eqiad.wmnet: upgrading to 1.7.5, shipping logs to logstash (T122697, T109101) |
[production] |
09:57 |
<volans> |
Added es2014,es2016,es2018 to tendril [ T127330 ] |
[production] |
09:46 |
<jynus> |
schema change finished on all hosts (except delayed slaves) |
[production] |
09:21 |
<_joe_> |
puppet re-enabled everywhere, now troubleshooting ipsec issues |
[production] |
08:59 |
<moritzm> |
repooled scb1002 |
[production] |
08:49 |
<moritzm> |
repooled scb1001, depooling scb1002 for nodejs upgrade |
[production] |
08:35 |
<_joe_> |
disabled puppet across the main redises fleet in order to merge https://gerrit.wikimedia.org/r/271261 safely |
[production] |
08:33 |
<moritzm> |
depooling scb1001 for nodejs upgrade |
[production] |
08:27 |
<jynus> |
altering heartbeat table on all production servers |
[production] |
06:39 |
<ebernhardson> |
upgrade elastic1021.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
05:47 |
<ebernhardson> |
upgrade elastic1020.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
05:02 |
<ebernhardson> |
upgrade elastic1019.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
04:05 |
<bblack> |
disabling puppet on caches for a bit, JIC |
[production] |
03:51 |
<ebernhardson> |
upgrade elastic1018.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
03:13 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Thu Mar 3 03:13:23 UTC 2016 (duration 8m 38s) |
[production] |
03:04 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.15) (duration: 18m 46s) |
[production] |
03:03 |
<ebernhardson> |
upgrade elastic1017.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
02:28 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.14) (duration: 13m 44s) |
[production] |
02:18 |
<ebernhardson> |
upgrade elastic1016.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
02:03 |
<bd808> |
Events flowing into logstash elasticsearch cluster again after forcing allocation of missing shard replica |
[production] |
01:59 |
<twentyafterfour> |
puppet ran on iridium, no errors. :) |
[production] |
01:54 |
<bd808> |
Deleted logstash-2016.02.03 index to free disk space |
[production] |
01:51 |
<bd808> |
New index not being created due to low disk watermark exceeded on logstash1006 |
[production] |
01:49 |
<bd808> |
Logstash elasticsearch cluster not responsive; investigating |
[production] |
01:48 |
<ebernhardson> |
upgrade elastic1015.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
01:44 |
<twentyafterfour> |
phabricator is back online |
[production] |
01:21 |
<twentyafterfour> |
manually installed scap package on iridium, will fix in puppet immediately after maintenance is finished |
[production] |
01:18 |
<mutante> |
elastic1013 "dpkg reports broken packages" |
[production] |
01:17 |
<twentyafterfour> |
puppet says "Provider scap3 is not functional on this host" |
[production] |
01:16 |
<twentyafterfour> |
testing puppet on iridium |
[production] |
01:07 |
<mutante> |
iridium - stop apache |
[production] |
01:00 |
<ebernhardson> |
upgrade elastic1014.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
00:50 |
<twentyafterfour> |
Phabricator will be going down for maintenance around 01:00 UTC (Approximately 10 minutes from now) |
[production] |
00:36 |
<hoo@tin> |
Synchronized php-1.27.0-wmf.15/extensions/TemplateData/: Change default format to null instead of 'inline' (duration: 01m 02s) |
[production] |
00:30 |
<bd808> |
Ran sync-common on mw1025 |
[production] |
00:23 |
<hoo> |
Ran sync-common on mw1025, because it apparently didn't pick up recent changes |
[production] |
00:20 |
<ebernhardson> |
upgrade elastic1013.eqiad.wmnet to elasticsearch 1.7.5 |
[production] |
00:19 |
<hoo> |
Restarted hhvm on mw1025 because of "Cannot access property on non-object in /srv/mediawiki/php-1.27.0-wmf.14/includes/filerepo/LocalRepo.php" |
[production] |
00:14 |
<mutante> |
wikitech: delete /a/backup/public/foo and ./bar cruft |
[production] |
00:10 |
<hoo@tin> |
Synchronized wmf-config/Wikibase-production.php: Set $wgWikimediaBadgesCommonsCategoryProperty to null on commons (T128661) (duration: 01m 09s) |
[production] |
00:00 |
<andrewbogott> |
restarting pdns on labservices1001 |
[production] |
2016-03-02
|
23:51 |
<chasemp> |
ran puppet manually on elastic1012, which restarted an elasticsearch that had mysteriously stopped (crashed?) |
[production] |
22:49 |
<krinkle@tin> |
Synchronized php-1.27.0-wmf.15/includes/api/ApiMain.php: Fix PHP Notice (duration: 01m 17s) |
[production] |
22:28 |
<urandom> |
enabling brotli compression on local_group_wikipedia_T_parsoid_html.data in staging, and forcing rewrite of corresponding tables on xenon : T125906 |
[production] |
21:12 |
<urandom> |
forcing a major compaction on {local_group_wikipedia_T_parsoid_dataW4ULtxs1oMqJ,local_group_wikipedia_T_parsoid_html}.data, xenon.eqiad.wmnet : T125906 |
[production] |
20:53 |
<bblack> |
repooling cp1048, seems unlikely to recrash (rare kernel bug) |
[production] |
20:45 |
<bblack> |
cp1048: depooled in confd, too |
[production] |
20:45 |
<bblack> |
cp1048: unresponsive console, powercycled |
[production] |