2020-09-08 §
10:11 <jmm@cumin2001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
10:11 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single [production]
10:08 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12519 and previous config saved to /var/cache/conftool/dbconfig/20200908-100852-kormat.json [production]
09:52 <akosiaris> enable puppet, run it on all k8s eqiad nodes and double check that calico-node is fine T239835 [production]
09:43 <akosiaris> stopped calico-node and kube-apiserver on k8s nodes/masters T239835 [production]
09:43 <marostegui> Stop mysql on es2014 to clone es2026 T261717 [production]
09:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es2014 - T261717', diff saved to https://phabricator.wikimedia.org/P12517 and previous config saved to /var/cache/conftool/dbconfig/20200908-093957-marostegui.json [production]
09:37 <volans> running homer 'cr*eqiad*' commit "Update debmonitor IPs (#2), T261489" [production]
09:33 <jayme@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:33 <jayme@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:28 <kormat@cumin1001> dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12515 and previous config saved to /var/cache/conftool/dbconfig/20200908-092755-kormat.json [production]
09:27 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:27 <kormat@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:20 <jayme> disabling puppet on argon.eqiad.wmnet,chlorine.eqiad.wmnet,kubernetes[1001-1016].eqiad.wmnet - Reinitialize eqiad k8s cluster with new etcd - T239835 [production]
08:55 <marostegui> Deploy schema change on s7 eqiad master - T253276 [production]
08:48 <marostegui@cumin1001> dbctl commit (dc=all): 'Reduce db2127's weight', diff saved to https://phabricator.wikimedia.org/P12514 and previous config saved to /var/cache/conftool/dbconfig/20200908-084834-marostegui.json [production]
08:45 <volans> running homer 'cr*eqiad*' commit "Update debmonitor IPs, T261489" [production]
08:23 <akosiaris@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=blubberoid,name=eqiad [production]
08:22 <oblivian@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=restbase-async,name=eqiad [production]
08:21 <oblivian@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=codfw [production]
08:20 <oblivian@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=eventgate-main,name=eqiad [production]
08:16 <moritzm> installing 4.19.132 kernel on buster systems (only installing the deb, reboots separately) [production]
07:44 <urbanecm@deploy1001> Synchronized private/PrivateSettings.php: Revert "Update T250887 mitigations" (T250887; T262242) (duration: 00m 59s) [production]
07:44 <elukey> roll restart kafka daemons on kafka-jumbo100[7-9] to pick up openjdk upgrades [production]
07:40 <XioNoX> move HE from ix to transit BGP group on cr3-eqsin [production]
07:00 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
06:58 <marostegui> Deploy schema change on s2 eqiad master - T253276 [production]
06:58 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
06:56 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
06:50 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1106 for PDU maintenance', diff saved to https://phabricator.wikimedia.org/P12513 and previous config saved to /var/cache/conftool/dbconfig/20200908-065022-marostegui.json [production]
06:47 <oblivian@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
06:31 <marostegui> Deploy schema change on s5 eqiad master - T253276 [production]
06:23 <elukey> roll restart of Hadoop master daemons on an-master100[1,2] to pick up new openjdk settings [production]
06:14 <marostegui> Stop MySQL on db1106 for PDU maintenance T261452 [production]
05:34 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
05:32 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
2020-09-07 §
23:35 <Reedy> Deployed patch for T262213 [production]
21:19 <reedy@deploy1001> Synchronized private/PrivateSettings.php: Remove old mitigation (duration: 00m 55s) [production]
18:04 <urbanecm@deploy1001> Synchronized private/PrivateSettings.php: Update T250887 mitigations (duration: 00m 56s) [production]
16:12 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
16:10 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:38 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12511 and previous config saved to /var/cache/conftool/dbconfig/20200907-153857-kormat.json [production]
15:32 <kormat@cumin1001> dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12510 and previous config saved to /var/cache/conftool/dbconfig/20200907-153206-kormat.json [production]
15:32 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:32 <kormat@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:21 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after reboot. T261389', diff saved to https://phabricator.wikimedia.org/P12509 and previous config saved to /var/cache/conftool/dbconfig/20200907-152117-kormat.json [production]
15:17 <kormat@cumin1001> dbctl commit (dc=all): 'Rebooting for T261389', diff saved to https://phabricator.wikimedia.org/P12508 and previous config saved to /var/cache/conftool/dbconfig/20200907-151718-kormat.json [production]
15:17 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:17 <kormat@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:14 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]