2020-02-05
|
14:32 |
<_joe_> |
restarting mcrouter at nice -19 on mw1331 for testing effects of that change |
[production] |
14:30 |
<vgutierrez> |
upload acme-chief 0.24 to apt.wm.o (buster) - T244236 |
[production] |
14:26 |
<XioNoX> |
push initial flowspec config to all routers |
[production] |
14:23 |
<vgutierrez> |
pooling cp5006 - T242093 |
[production] |
14:13 |
<ema> |
cp1075: back to leaving Accept-Encoding as it is due to unrelated applayer issues T242478 |
[production] |
13:46 |
<marostegui> |
Decrease buffer pool size on db1107 for testing - T242702 |
[production] |
13:45 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
13:43 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:42 |
<akosiaris> |
undo the manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348. Verify hypothesis that this should cause increased latency. Restart php-fpm |
[production] |
13:41 |
<ema> |
cp1075: unset Accept-Encoding on origin server requests T242478 |
[production] |
13:39 |
<Amir1> |
EU SWAT is done |
[production] |
13:38 |
<ema> |
cp: disable puppet and merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/570311/ T242478 |
[production] |
13:35 |
<XioNoX> |
rollback traffic steering off cr2-eqord |
[production] |
13:29 |
<akosiaris> |
manually set 10.2.1.42 eventgate-analytics.discovery.wmnet in /etc/hosts for mw1331, mw1348. Verify hypothesis that this should cause increased latency |
[production] |
13:25 |
<XioNoX> |
reboot cr2-eqord for software upgrade - yaaaaa |
[production] |
13:24 |
<ladsgroup@deploy1001> |
Synchronized php-1.35.0-wmf.18/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: [[gerrit:570301|Cache PropertyInfoLookup internally]] (T243955) (duration: 01m 07s) |
[production] |
13:17 |
<XioNoX> |
increase ospf cost for cr2-eqord links |
[production] |
13:16 |
<vgutierrez> |
upload acme-chief 0.23 to apt.wm.o (buster) - T244236 |
[production] |
13:15 |
<XioNoX> |
disable transit/peering BGP sessions on cr2-eqord |
[production] |
13:15 |
<ladsgroup@deploy1001> |
Synchronized php-1.35.0-wmf.16/extensions/Wikibase/lib/includes/Store/CachingPropertyInfoLookup.php: SWAT: [[gerrit:570301|Cache PropertyInfoLookup internally]] (T243955) (duration: 01m 07s) |
[production] |
13:10 |
<XioNoX> |
rollback: disable transit/peering BGP sessions on cr2-eqdfw |
[production] |
13:08 |
<vgutierrez> |
depooling & reimaging cp5006 as buster - T242093 |
[production] |
13:03 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: 5cc2b70: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 06s) |
[production] |
13:01 |
<XioNoX> |
reboot cr2-eqdfw for software upgrade |
[production] |
13:00 |
<Amir1> |
SWAT needs more time |
[production] |
12:55 |
<XioNoX> |
disable transit/peering BGP sessions on cr2-eqdfw |
[production] |
12:50 |
<urbanecm@deploy1001> |
Synchronized wmf-config/CommonSettings.php: SWAT: d450288: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 07s) |
[production] |
12:48 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: 5cc2b70: wgLogoHD and $wgVectorPrintLogo is replaced with wgLogos (T232140) (duration: 01m 07s) |
[production] |
12:32 |
<awight@deploy1001> |
Synchronized php-1.35.0-wmf.18/extensions/Cite: SWAT: [[gerrit:570285|Revert follow standardization (T240858)]] (duration: 01m 13s) |
[production] |
10:53 |
<akosiaris> |
rolling restart of all pods on kubernetes staging cluster to make sure everything is fine after the upgrade |
[production] |
10:50 |
<akosiaris> |
T244335 upgrade kubernetes-node on kubestage1002.eqiad.wmnet to 1.13.12 |
[production] |
10:43 |
<ema> |
cp4028: varnish-frontend-restart T243634 |
[production] |
10:24 |
<akosiaris> |
T244335 upgrade kubernetes-master on neon.eqiad.wmnet (staging) |
[production] |
10:24 |
<effie> |
Upload php-apcu_5.1.17+4.0.11-1+0~20190217111312.9+stretch~1.gbp192528+wmf2 - T236800 |
[production] |
10:10 |
<Urbanecm> |
Run mwscript deleteEqualMessages.php --delete to delete GrowthExperiments' message overrides (cswiki, viwiki, arwiki, kowiki) |
[production] |
09:57 |
<akosiaris> |
upload kubernetes 1.13.12 to apt.wikimedia.org stretch-wikimedia/main T244335 |
[production] |
09:51 |
<effie> |
install libmemcached-tools on mc-gp* servers - T240684 |
[production] |
09:05 |
<ema> |
add individual FortiGate IPs hitting ulsfo (currently cp4028) to vcl blocked_nets -- trying to identify problematic traffic T243634 |
[production] |
07:02 |
<marostegui> |
Replay s1 traffic on db1107 (10.4) T242702 |
[production] |
06:32 |
<elukey> |
force a puppet run on ores* hosts |
[production] |
06:12 |
<marostegui> |
Remove partitions from revision table db1098:3317 - T239453 |
[production] |
06:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1098:3317 - T239453', diff saved to https://phabricator.wikimedia.org/P10312 and previous config saved to /var/cache/conftool/dbconfig/20200205-060942-marostegui.json |
[production] |
06:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2085:3311, db2086:3317 - T239453', diff saved to https://phabricator.wikimedia.org/P10311 and previous config saved to /var/cache/conftool/dbconfig/20200205-060911-marostegui.json |
[production] |
02:38 |
<cdanis> |
T243634 ✔️ cdanis@cp4030.ulsfo.wmnet ~ 🕤🍺 sudo varnish-frontend-restart |
[production] |