2020-09-02 §
09:46 <XioNoX> reboot cr2-eqiad:re0 (backup) - T259621 [production]
09:28 <XioNoX> cr2-eqiad: request chassis routing-engine master switch - T259621 [production]
09:19 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:18 <XioNoX> reboot cr2-eqiad:re1 (backup) - T259621 [production]
09:16 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:13 <aborrero@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:13 <ayounsi@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1) [production]
09:12 <ayounsi@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
09:11 <aborrero@cumin2001> START - Cookbook sre.hosts.downtime [production]
09:08 <ayounsi@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=1) [production]
09:07 <ayounsi@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
09:06 <ayounsi@cumin1001> END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97) [production]
09:01 <elukey> reimage kafka-jumbo1004 to Buster [production]
08:58 <ayounsi@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
08:57 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1128 from s10 - T260324', diff saved to https://phabricator.wikimedia.org/P12432 and previous config saved to /var/cache/conftool/dbconfig/20200902-085705-marostegui.json [production]
08:55 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1128 into s10 (wikitech) with weight 0 - T260324', diff saved to https://phabricator.wikimedia.org/P12431 and previous config saved to /var/cache/conftool/dbconfig/20200902-085455-marostegui.json [production]
08:52 <XioNoX> deactivate cr2-eqiad transit/IX - T259621 [production]
08:50 <XioNoX> drain cr2-eqiad transport links - T259621 [production]
08:20 <XioNoX> activate Telia BGP in eqiad [production]
07:58 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
07:56 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
07:38 <elukey> reimage kafka-jumbo1003 to Buster [production]
07:28 <marostegui> Reboot dbstore1003 for kernel upgrade - T261389 [production]
07:27 <marostegui> Reboot dbstore1003 for kernel upgrade - T261389 [production]
07:12 <XioNoX> configure cr2-eqiad:ae5 as single LACP link to Telia [production]
07:05 <marostegui> Drop unused grants on m5 T261152 [production]
07:02 <elukey> reboot kafka-jumbo1002 to pick up new kernel settings [production]
07:00 <XioNoX> deactivate Telia BGP in eqiad [production]
06:38 <elukey> powercycle analytics1059 - cpu soft locks on multiple CPUs [production]
06:30 <elukey> reboot kafka-jumbo1001 to pick up new kernel settings [production]
06:30 <oblivian@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' . [production]
06:29 <oblivian@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'test' . [production]
06:29 <oblivian@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'staging' . [production]
06:21 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' . [production]
06:21 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' . [production]
06:21 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'production' . [production]
2020-09-01 §
22:39 <Urbanecm> [urbanecm@mwmaint2001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=sysop_itwiki Pierpao (T261722) [production]
17:51 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
17:50 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
17:36 <ryankemper> wdqs [canary] rollback complete, tests passing now. Will need to dig into source of failure [production]
17:35 <ryankemper@deploy1001> Finished deploy [wdqs/wdqs@7920fbe]: 0.3.46 (duration: 03m 43s) [production]
17:35 <ryankemper> `wdqs1003` (the canary instance) is failing tests now, going to rollback [production]
17:32 <ryankemper@deploy1001> Started deploy [wdqs/wdqs@7920fbe]: 0.3.46 [production]
17:30 <ryankemper> Starting wdqs deploy [production]
15:56 <chasemp> labsdb* puppet agent --test; sudo /usr/local/sbin/maintain-views --all-databases --table user --replace-all; sudo /usr/local/sbin/maintain-views --all-databases --table user_old --replace-all [production]
15:25 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:15 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
14:28 <_joe_> restarting envoy on all eqiad jobrunners [production]
14:22 <_joe_> restarted confd on mwmaint1002 [production]
14:18 <rzl@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) [production]