2020-09-02
|
09:06 |
<ayounsi@cumin1001> |
END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97) |
[production] |
09:01 |
<elukey> |
reimage kafka-jumbo1004 to Buster |
[production] |
08:58 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.prepare-upgrade |
[production] |
08:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove db1128 from s10 - T260324', diff saved to https://phabricator.wikimedia.org/P12432 and previous config saved to /var/cache/conftool/dbconfig/20200902-085705-marostegui.json |
[production] |
08:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Pool db1128 into s10 (wikitech) with weight 0 - T260324', diff saved to https://phabricator.wikimedia.org/P12431 and previous config saved to /var/cache/conftool/dbconfig/20200902-085455-marostegui.json |
[production] |
08:52 |
<XioNoX> |
deactivate cr2-eqiad transit/IX - T259621 |
[production] |
08:50 |
<XioNoX> |
drain cr2-eqiad transport links - T259621 |
[production] |
08:20 |
<XioNoX> |
activate Telia BGP in eqiad |
[production] |
07:58 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
07:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
07:38 |
<elukey> |
reimage kafka-jumbo1003 to buster |
[production] |
07:28 |
<marostegui> |
Reboot dbstore1003 for kernel upgrade - T261389 |
[production] |
07:27 |
<marostegui> |
Reboot dbstore1003 for kernel upgrade - T261389 |
[production] |
07:12 |
<XioNoX> |
configure cr2-eqiad:ae5 as single LACP link to Telia |
[production] |
07:05 |
<marostegui> |
Drop unused grants on m5 T261152 |
[production] |
07:02 |
<elukey> |
reboot kafka-jumbo1002 to pick up new kernel settings |
[production] |
07:00 |
<XioNoX> |
deactivate Telia BGP in eqiad |
[production] |
06:38 |
<elukey> |
powercycle analytics1059 - cpu soft locks on multiple CPUs |
[production] |
06:30 |
<elukey> |
reboot kafka-jumbo1001 to pick up new kernel settings |
[production] |
06:30 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' . |
[production] |
06:29 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'test' . |
[production] |
06:29 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'staging' . |
[production] |
06:21 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' . |
[production] |
06:21 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' . |
[production] |
06:21 |
<oblivian@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'production' . |
[production] |
2020-09-01
|
22:39 |
<Urbanecm> |
[urbanecm@mwmaint2001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=sysop_itwiki Pierpao (T261722) |
[production] |
17:51 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
17:50 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
17:36 |
<ryankemper> |
wdqs [canary] rollback complete, tests passing now. Will need to dig into source of failure |
[production] |
17:35 |
<ryankemper@deploy1001> |
Finished deploy [wdqs/wdqs@7920fbe]: 0.3.46 (duration: 03m 43s) |
[production] |
17:35 |
<ryankemper> |
`wdqs1003` (the canary instance) is failing tests now, going to rollback |
[production] |
17:32 |
<ryankemper@deploy1001> |
Started deploy [wdqs/wdqs@7920fbe]: 0.3.46 |
[production] |
17:30 |
<ryankemper> |
Starting wdqs deploy |
[production] |
15:56 |
<chasemp> |
labsdb* puppet agent --test; sudo /usr/local/sbin/maintain-views --all-databases --table user --replace-all; sudo /usr/local/sbin/maintain-views --all-databases --table user_old --replace-all |
[production] |
15:25 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:15 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
14:28 |
<_joe_> |
restarting envoy on all eqiad jobrunners |
[production] |
14:22 |
<_joe_> |
restarted confd on mwmaint1002 |
[production] |
14:18 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) |
[production] |
14:18 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-update-tendril |
[production] |
14:17 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) |
[production] |
14:15 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-start-maintenance |
[production] |
14:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Reduce db2083 weight', diff saved to https://phabricator.wikimedia.org/P12429 and previous config saved to /var/cache/conftool/dbconfig/20200901-141521-marostegui.json |
[production] |
14:15 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) |
[production] |
14:14 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restore-ttl |
[production] |
14:07 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) |
[production] |
14:07 |
<rzl@cumin1001> |
MediaWiki read-only period ends at: 2020-09-01 14:07:36.305500 |
[production] |
14:07 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.07-set-readwrite |
[production] |
14:04 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=99) |
[production] |
14:04 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.07-set-readwrite |
[production] |