2020-09-02
§
|
10:18 |
<XioNoX> |
move VRRP master from cr1 to cr2 |
[production] |
10:16 |
<XioNoX> |
drain cr1-eqiad transit/transport/IX |
[production] |
10:13 |
<XioNoX> |
drain cr1-eqiad-pfw3-eqiad link |
[production] |
10:04 |
<XioNoX> |
repool cr2-eqiad |
[production] |
09:55 |
<XioNoX> |
cr2-eqiad:request chassis routing-engine master switch - T259621 |
[production] |
09:46 |
<XioNoX> |
reboot cr2-eqiad:re0 (backup) - T259621 |
[production] |
09:28 |
<XioNoX> |
cr2-eqiad:request chassis routing-engine master switch - T259621 |
[production] |
09:19 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
09:18 |
<XioNoX> |
reboot cr2-eqiad:re1 (backup) - T259621 |
[production] |
09:16 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
09:13 |
<aborrero@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
09:13 |
<ayounsi@cumin1001> |
END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1) |
[production] |
09:12 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.prepare-upgrade |
[production] |
09:11 |
<aborrero@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
09:08 |
<ayounsi@cumin1001> |
END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1) |
[production] |
09:07 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.prepare-upgrade |
[production] |
09:06 |
<ayounsi@cumin1001> |
END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97) |
[production] |
09:01 |
<elukey> |
reimage kafka-jumbo1004 to Buster |
[production] |
08:58 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.prepare-upgrade |
[production] |
08:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove db1128 from s10 - T260324', diff saved to https://phabricator.wikimedia.org/P12432 and previous config saved to /var/cache/conftool/dbconfig/20200902-085705-marostegui.json |
[production] |
08:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db1128 into s10 (wikitech) with weight 0 - T260324', diff saved to https://phabricator.wikimedia.org/P12431 and previous config saved to /var/cache/conftool/dbconfig/20200902-085455-marostegui.json |
[production] |
08:52 |
<XioNoX> |
deactivate cr2-eqiad transit/IX - T259621 |
[production] |
08:50 |
<XioNoX> |
drain cr2-eqiad transport links - T259621 |
[production] |
08:20 |
<XioNoX> |
activate Telia BGP in eqiad |
[production] |
07:58 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
07:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
07:38 |
<elukey> |
reimage kafka-jumbo1003 to Buster |
[production] |
07:28 |
<marostegui> |
Reboot dbstore1003 for kernel upgrade - T261389 |
[production] |
07:27 |
<marostegui> |
Reboot dbstore1003 for kernel upgrade - T261389 |
[production] |
07:12 |
<XioNoX> |
configure cr2-eqiad:ae5 as single LACP link to Telia |
[production] |
07:05 |
<marostegui> |
Drop unused grants on m5 T261152 |
[production] |
07:02 |
<elukey> |
reboot kafka-jumbo1002 to pick up new kernel settings |
[production] |
07:00 |
<XioNoX> |
deactivate Telia BGP in eqiad |
[production] |
06:38 |
<elukey> |
powercycle analytics1059 - cpu soft locks on multiple CPUs |
[production] |
06:30 |
<elukey> |
reboot kafka-jumbo1001 to pick up new kernel settings |
[production] |
06:30 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' . |
[production] |
06:29 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'test' . |
[production] |
06:29 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'staging' . |
[production] |
06:21 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' . |
[production] |
06:21 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' . |
[production] |
06:21 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'production' . |
[production] |
2020-09-01
§
|
22:39 |
<Urbanecm> |
[urbanecm@mwmaint2001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=sysop_itwiki Pierpao (T261722) |
[production] |
17:51 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
17:50 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
17:36 |
<ryankemper> |
wdqs [canary] rollback complete, tests passing now. Will need to dig into source of failure |
[production] |
17:35 |
<ryankemper@deploy1001> |
Finished deploy [wdqs/wdqs@7920fbe]: 0.3.46 (duration: 03m 43s) |
[production] |
17:35 |
<ryankemper> |
`wdqs1003` (the canary instance) is failing tests now, going to rollback |
[production] |
17:32 |
<ryankemper@deploy1001> |
Started deploy [wdqs/wdqs@7920fbe]: 0.3.46 |
[production] |
17:30 |
<ryankemper> |
Starting wdqs deploy |
[production] |
15:56 |
<chasemp> |
labsdb* puppet agent --test; sudo /usr/local/sbin/maintain-views --all-databases --table user --replace-all; sudo /usr/local/sbin/maintain-views --all-databases --table user_old --replace-all |
[production] |