2020-02-05
ยง
|
20:09 <twentyafterfour> Preparing to deploy wmf/1.35.0-wmf.18 to group1 wikis refs T233866 [production]
20:09 <moritzm> installing git security updates for jessie [production]
20:00 <moritzm> installing unzip security updates [production]
19:44 <mutante> LDAP - added spramduya to wmf group (T243802) [production]
19:38 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Clean up VisualEditor settings (duration: 01m 07s) [production]
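(The "Synchronized ..." lines in this log are emitted by scap when a deployer syncs a file from the deployment host. A minimal sketch of the command behind such an entry is below; the exact invocation used for this particular sync is an assumption.)

    # Illustrative only: sync one config file to the cluster and log the given message
    scap sync-file wmf-config/InitialiseSettings.php 'Clean up VisualEditor settings'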
19:38 <ebernhardson> restart mjolnir-kafka-bulk-daemon across eqiad, daemons appear stuck and not reading new messages [production]
19:19 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T238029 Enable InukaPageView logging on production Wikipedias (duration: 01m 07s) [production]
19:15 <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: Sync back revert of 975b4bbb9 (duration: 01m 06s) [production]
19:10 <jforrester@deploy1001> scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) [production]
18:35 <vgutierrez> pooling cp5012 - T242093 [production]
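(Pool/depool entries like the one above are conftool state changes for the cache host. A minimal illustrative sketch follows; the object selector is an assumption about how the host is addressed.)

    # Illustrative only: flip the pooled state of a cache host in conftool/etcd
    sudo confctl select 'name=cp5012.eqsin.wmnet' set/pooled=no    # depool before maintenance
    sudo confctl select 'name=cp5012.eqsin.wmnet' set/pooled=yes   # pool again afterwards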
18:23 <vgutierrez> rebooting cp5012 - T242093 [production]
18:21 <elukey> restart memcached on mc1025 with 8 threads (rollback - revert https://gerrit.wikimedia.org/r/#/c/570370/, run puppet, restart memcached) [production]
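(The rollback above re-applies the previous thread-count setting through puppet and restarts the daemon. An illustrative sequence on mc1025 is sketched below, assuming the Debian-packaged memcached unit and its -t worker-thread option.)

    # Illustrative only: after reverting the Gerrit change, re-apply puppet and restart
    sudo puppet agent --test                  # restores the previous "-t 8" thread setting
    sudo systemctl restart memcached.service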
17:51 <mutante> ganeti1017 - rebooting (not in use yet) [production]
17:34 <reedy@deploy1001> Synchronized php-1.35.0-wmf.18/languages/: T244300 (duration: 01m 13s) [production]
17:33 <reedy@deploy1001> Synchronized php-1.35.0-wmf.18/includes/: T244300 (duration: 01m 14s) [production]
16:53 <urandom> Sessionstore deployment (mediawiki-config) is done [production]
16:37 <ppchelko@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:569678]] Config: Enable sessionstore on group0 and 1 T243106 (duration: 01m 08s) [production]
16:25 <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: T232140 Restore wgLogoHD to wikis without a MinervaCustomLogos defined (duration: 01m 09s) [production]
16:07 <elukey> update puppet compiler's facts [production]
15:54 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:52 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:29 <effie> restart php-fpm on canaries - T236800 [production]
15:24 <effie> Rollout php-apcu_5.1.17+4.0.11-1+0~20190217111312.9+stretch~1.gbp192528+wmf2 to api, app and jobrunner canaries - T236800 [production]
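(A rollout like the one above amounts to pin-installing the packaged extension on the canary hosts and then restarting php-fpm, as logged at 15:29. An illustrative per-host sketch, with the FPM unit name assumed, is:)

    # Illustrative only: install the specific php-apcu build, then restart FPM
    sudo apt-get install -y php-apcu=5.1.17+4.0.11-1+0~20190217111312.9+stretch~1.gbp192528+wmf2
    sudo systemctl restart php7.2-fpm.service    # unit name assumed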
15:15 <vgutierrez> depooling & reimaging cp5012 as buster - T242093 [production]
15:12 <ema> cp: unset Accept-Encoding from ats-be requests to applayer T242478 [production]
14:34 <vgutierrez> updating acme-chief to version 0.24 - T244236 [production]
14:32 <_joe_> restarting mcrouter at nice -19 on mw1331 to test the effects of that change [production]
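(Running mcrouter at nice -19 gives it the highest CPU scheduling priority. The actual change was done by restarting the service; the sketch below shows an equivalent, illustrative way to apply the same priority to an already-running process, assuming a single mcrouter process on the host.)

    # Illustrative only: raise the scheduling priority of the running mcrouter
    sudo renice -n -19 -p "$(pgrep -x mcrouter)"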
14:30 <vgutierrez> upload acme-chief 0.24 to apt.wm.o (buster) - T244236 [production]
14:26 <XioNoX> push initial flowspec config to all routers [production]
14:23 <vgutierrez> pooling cp5006 - T242093 [production]
14:13 <ema> cp1075: back to leaving Accept-Encoding as it is due to unrelated applayer issues T242478 [production]
13:46 <marostegui> Decrease buffer pool size on db1107 for testing - T242702 [production]
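(On MariaDB/MySQL versions that support online buffer pool resizing, a change like the one above can be applied at runtime. The sketch below is illustrative only; the target size is a made-up value, not the one used on db1107.)

    # Illustrative only: shrink the InnoDB buffer pool online (size value is hypothetical)
    sudo mysql -e "SET GLOBAL innodb_buffer_pool_size = 96 * 1024 * 1024 * 1024;"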
13:45 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:43 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:42 <akosiaris> undo the manually set 10.2.1.42 eventgate-analytics.discovery.wmnet entry in /etc/hosts for mw1331, mw1348; verify the hypothesis that this should cause increased latency. Restart php-fpm [production]
13:41 <ema> cp1075: unset Accept-Encoding on origin server requests T242478 [production]
13:39 <Amir1> EU SWAT is done [production]
13:38 <ema> cp: disable puppet and merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/570311/ T242478 [production]
13:35 <XioNoX> rollback traffic steering off cr2-eqord [production]
13:29 <akosiaris> manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348 to verify the hypothesis that this should cause increased latency [production]
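(The override above simply pins the discovery DNS name to one backend IP in /etc/hosts. An illustrative sketch of applying and then reverting it is below; the php-fpm unit name is assumed.)

    # Illustrative only: pin the discovery name to a fixed IP, restart FPM to drop cached connections
    echo '10.2.1.42 eventgate-analytics.discovery.wmnet' | sudo tee -a /etc/hosts
    sudo systemctl restart php7.2-fpm.service    # unit name assumed
    # To undo (as logged at 13:42): remove the line and restart FPM again
    sudo sed -i '/eventgate-analytics\.discovery\.wmnet/d' /etc/hosts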
13:25 <XioNoX> reboot cr2-eqord for software upgrade - yaaaaa [production]
13:24 <ladsgroup@deploy1001> Synchronized php-1.35.0-wmf.18/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: [[gerrit:570301|Cache PropertyInfoLookup internally]] (T243955) (duration: 01m 07s) [production]
13:17 <XioNoX> increase ospf cost for cr2-eqord links [production]
13:16 <vgutierrez> upload acme-chief 0.23 to apt.wm.o (buster) - T244236 [production]
13:15 <XioNoX> disable transit/peering BGP sessions on cr2-eqord [production]
13:15 <ladsgroup@deploy1001> Synchronized php-1.35.0-wmf.16/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: [[gerrit:570301|Cache PropertyInfoLookup internally]] (T243955) (duration: 01m 07s) [production]
13:10 <XioNoX> rollback: disable transit/peering BGP sessions on cr2-eqdfw [production]
13:08 <vgutierrez> depooling & reimaging cp5006 as buster - T242093 [production]
13:03 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: 5cc2b70: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 06s) [production]
13:01 <XioNoX> reboot cr2-eqdfw for software upgrade [production]