2021-02-04
14:37 <jiji@cumin1001> START - Cookbook sre.hosts.reboot-single for host mc2019.codfw.wmnet [production]
14:30 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5002.eqsin.wmnet [production]
14:28 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet [production]
14:22 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet [production]
14:21 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host ganeti5002.eqsin.wmnet [production]
14:21 <godog> roll-restart rsync/swift-object-replicator in codfw to apply memory limits [production]
14:21 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4001.ulsfo.wmnet [production]
14:18 <effie> start rolling reboots of mc[2019-2027,2029-2037].codfw.wmnet T273278 [production]
14:16 <mbsantos@deploy1001> Finished deploy [kartotherian/deploy@47fc426]: (no justification provided) (duration: 00m 12s) [production]
14:16 <mbsantos@deploy1001> Started deploy [kartotherian/deploy@47fc426]: (no justification provided) [production]
14:15 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir4001.ulsfo.wmnet [production]
14:14 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4002.ulsfo.wmnet [production]
14:14 <moritzm> installing ffmpeg security updates on stretch [production]
14:11 <mbsantos@deploy1001> Finished deploy [kartotherian/deploy@0a38bc5]: (no justification provided) (duration: 00m 03s) [production]
14:11 <mbsantos@deploy1001> Started deploy [kartotherian/deploy@0a38bc5]: (no justification provided) [production]
14:10 <mbsantos@deploy1001> Finished deploy [tilerator/deploy@46a2eaf]: (no justification provided) (duration: 00m 13s) [production]
14:10 <mbsantos@deploy1001> Started deploy [tilerator/deploy@46a2eaf]: (no justification provided) [production]
14:07 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir4002.ulsfo.wmnet [production]
14:05 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir5001.eqsin.wmnet [production]
13:58 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: NO-OP: 7c67b2f03cbc27cf9e5f214a6f0ea0856d8c1ae4: bnwiki: wgGEHelpPanelLinks: Remove text in brackets (T266020) (duration: 01m 12s) [production]
13:51 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir5001.eqsin.wmnet [production]
13:50 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir5002.eqsin.wmnet [production]
13:44 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host ncredir5002.eqsin.wmnet [production]
13:44 <vgutierrez> rolling restart of ncredir instances (kernel upgrade) [production]
13:36 <moritzm> installing openldap security updates on buster (client-side tools/libs only, slapd instance already updated) [production]
13:31 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1157.eqiad.wmnet with reason: REIMAGE [production]
13:31 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwdebug1003.eqiad.wmnet [production]
13:31 <godog> reboot logstash2005.codfw.wmnet, no ssh / stuck [production]
13:29 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1157.eqiad.wmnet with reason: REIMAGE [production]
13:29 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host mwdebug1003.eqiad.wmnet [production]
13:10 <jbond42> upload cas_6.2.7 to downgrade cas T273867 [production]
13:04 <ariel@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1010.eqiad.wmnet with reason: REIMAGE [production]
13:02 <ariel@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1010.eqiad.wmnet with reason: REIMAGE [production]
12:27 <moritzm> installing libdatetime-timezone-perl updates on Buster [production]
12:17 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: reboot [production]
12:17 <jmm@cumin2001> START - Cookbook sre.hosts.downtime for 4:00:00 on 17 hosts with reason: reboot [production]
12:17 <moritzm> rebooting mw[1264-1268,1276-1277,1337-1338,1404-1409,1411,1413].eqiad.wmnet for kernel update [production]
12:08 <godog> bounce rsyslog on centrallog1001 [production]
11:47 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: dc=eqiad,cluster=maps,service=kartotherian,name=maps1009.eqiad.wmnet [production]
11:47 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: dc=eqiad,cluster=maps,service=kartotherian-ssl,name=maps1009.eqiad.wmnet [production]
11:30 <elukey@cumin1001> END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) [production]
11:26 <elukey@cumin1001> START - Cookbook sre.aqs.roll-restart [production]
11:07 <elukey@puppetmaster1001> conftool action : set/pooled=true; selector: dnsdisc=eventstreams-internal [production]
10:35 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 93 hosts with reason: reboot [production]
10:35 <moritzm> rebooting mw[2261-2262,2268-2271,2273-2277,2283-2288,2290-2335,2337-2339,2350-2376].codfw.wmnet [production]
10:34 <jmm@cumin2001> START - Cookbook sre.hosts.downtime for 4:00:00 on 93 hosts with reason: reboot [production]
10:23 <marostegui@cumin1001> dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Slowly pooling db1173 for the first time in s6', diff saved to https://phabricator.wikimedia.org/P14204 and previous config saved to /var/cache/conftool/dbconfig/20210204-102312-root.json [production]
10:15 <elukey> restart pybal on lvs1015 (low-traffic active) to pick up new changes for eventstreams-internal (new VIP) - T269160 [production]
10:13 <elukey> restart pybal on lvs2009 (low-traffic active) to pick up new changes for eventstreams-internal (new VIP) - T269160 [production]
10:08 <elukey> restart pybal on lvs1016 (low-traffic standby) to pick up new changes for eventstreams-internal (new VIP) - T269160 [production]