2022-12-15
11:11 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ping3002.esams.wmnet [production]
11:04 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2002.codfw.wmnet [production]
11:00 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ping2002.codfw.wmnet [production]
10:53 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1002.eqiad.wmnet [production]
10:50 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ping1002.eqiad.wmnet [production]
10:49 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2001.codfw.wmnet [production]
10:47 <XioNoX> disable ping offload in eqiad [production]
10:43 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host krb2001.codfw.wmnet [production]
10:42 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2001.wikimedia.org [production]
10:38 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host urldownloader2001.wikimedia.org [production]
10:36 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1002.wikimedia.org [production]
10:34 <jayme> restarted istiod pods in aux-k8s because of T303184 [production]
10:32 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host urldownloader1002.wikimedia.org [production]
09:56 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1001.eqiad.wmnet [production]
09:54 <effie> stopping and masking nutcracker on mw servers - T277183 [production]
09:53 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host acmechief1001.eqiad.wmnet [production]
09:51 <vgutierrez@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host acmechief2001.codfw.wmnet [production]
09:42 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1001.wikimedia.org [production]
09:41 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host acmechief2001.codfw.wmnet [production]
09:40 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet [production]
09:38 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host apt1001.wikimedia.org [production]
09:38 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet [production]
09:37 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet [production]
09:31 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet [production]
09:30 <vgutierrez@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host acmechief-test1001.eqiad.wmnet [production]
09:30 <vgutierrez@cumin1001> START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet [production]
09:27 <elukey@cumin1001> START - Cookbook sre.kafka.reboot-workers for Kafka test-eqiad cluster: Reboot kafka nodes [production]
09:21 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2002.wikimedia.org [production]
09:17 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host urldownloader2002.wikimedia.org [production]
09:15 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1001.wikimedia.org [production]
09:12 <akosiaris> reboot rdb2007 for kernel upgrades [production]
09:10 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host urldownloader1001.wikimedia.org [production]
09:08 <hashar@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.40.0-wmf.14 refs T320519 [production]
08:53 <akosiaris> reboot rdb2009 for kernel upgrades [production]
08:52 <akosiaris> correction: reboot rdb1011 for kernel upgrades [production]
08:51 <akosiaris> reboot rdb1007 for kernel upgrades [production]
08:51 <akosiaris> nothing noticed with rdb1007 reboot for mw, jobqueue, api-gateway. changeprop had a minor backlog increase, but everything appears fine now. [production]
08:28 <akosiaris> reboot rdb1009 for kernel upgrades. possibly (but probably not) affected applications: changeprop, cpjobqueue, api-gateway, redisLockManager [production]
08:13 <kartik@deploy1002> Finished scap: Backport for [[gerrit:868215|Enable Section Translation on 6 WPs (T319177)]] (duration: 10m 55s) [production]
08:08 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host puppetdb2003.codfw.wmnet [production]
08:04 <kartik@deploy1002> kartik and kartik: Backport for [[gerrit:868215|Enable Section Translation on 6 WPs (T319177)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet [production]
08:03 <kartik@deploy1002> Started scap: Backport for [[gerrit:868215|Enable Section Translation on 6 WPs (T319177)]] [production]
07:57 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host puppetdb2003.codfw.wmnet [production]
01:46 <cwhite@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2026.codfw.wmnet with OS bullseye [production]
00:58 <cwhite@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2026.codfw.wmnet with reason: host reimage [production]
00:55 <cwhite@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2026.codfw.wmnet with reason: host reimage [production]
00:32 <mutante> releases1002 - rebooting [production]
00:30 <mutante> releases2002 - rebooting [production]
00:19 <cwhite@cumin2002> START - Cookbook sre.hosts.reimage for host logstash2026.codfw.wmnet with OS bullseye [production]
00:19 <cwhite@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host logstash2026.codfw.wmnet with OS bullseye [production]