2024-07-22
18:12 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts2001.codfw.wmnet [production]
18:12 <aokoth@cumin1002> END (ERROR) - Cookbook sre.vrts.upgrade (exit_code=97) on VRTS host vrts2001.codfw.wmnet [production]
18:12 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts2001.codfw.wmnet [production]
17:42 <cmooney@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:42 <cmooney@cumin1002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for new cloudceph nodes - cmooney@cumin1002" [production]
17:41 <cmooney@cumin1002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for new cloudceph nodes - cmooney@cumin1002" [production]
17:33 <cmooney@cumin1002> START - Cookbook sre.dns.netbox [production]
17:32 <cmooney@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephmon1004.eqiad.wmnet with OS bullseye [production]
17:11 <cmooney@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephmon1004.eqiad.wmnet with OS bullseye [production]
17:09 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on netbox2003.codfw.wmnet with reason: netbox upgrade prep work [production]
17:09 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on netbox2003.codfw.wmnet with reason: netbox upgrade prep work [production]
17:09 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on netbox1003.eqiad.wmnet with reason: netbox upgrade prep work [production]
17:09 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on netbox1003.eqiad.wmnet with reason: netbox upgrade prep work [production]
17:09 <ayounsi@cumin1002> END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on netbox1003.eqiad.wmnet with reason: netbox upgrade prep work [production]
17:08 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on netbox1003.eqiad.wmnet with reason: netbox upgrade prep work [production]
16:37 <sukhe> [doh1001] upgrade anycast-healthchecker to 0.9.8-1+wmf12u1: T370068 [production]
16:32 <cgoubert@cumin1002> conftool action : set/weight=10:pooled=yes; selector: name=(wikikube-worker2035.codfw.wmnet|wikikube-worker2036.codfw.wmnet|wikikube-worker2037.codfw.wmnet|wikikube-worker2038.codfw.wmnet),cluster=kubernetes,service=kubesvc [production]
16:31 <claime> Pooling and uncordoning wikikube-worker2035.codfw.wmnet wikikube-worker2036.codfw.wmnet wikikube-worker2037.codfw.wmnet wikikube-worker2038.codfw.wmnet - T351074 [production]
16:31 <sukhe> restart anycast-hc on durum1001 [production]
16:13 <pt1979@cumin1002> END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cloudcephmon1004.eqiad.wmnet [production]
16:08 <pt1979@cumin1002> START - Cookbook sre.hosts.dhcp for host cloudcephmon1004.eqiad.wmnet [production]
16:02 <elukey> remove /srv/kafka/data/eqiad.resource-purge-3 on kafka-main2001 to force a refetch of data from good replicas and circumvent data corruption - T370574 [production]
15:58 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2001.codfw.wmnet with reason: attempt to remove a data dir on disk [production]
15:57 <elukey@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2001.codfw.wmnet with reason: attempt to remove a data dir on disk [production]
15:49 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on kafka-test1006.eqiad.wmnet with reason: attempt to remove a data dir on disk [production]
15:49 <elukey@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on kafka-test1006.eqiad.wmnet with reason: attempt to remove a data dir on disk [production]
15:08 <dancy@deploy1002> Finished scap: Backport for [[gerrit:1053752|MWMultiVersion.php: Allow MW_FORCE_VERSION to pin the mw version (T369115)]] (duration: 09m 10s) [production]
15:03 <dancy@deploy1002> dancy: Continuing with sync [production]
15:01 <dancy@deploy1002> dancy: Backport for [[gerrit:1053752|MWMultiVersion.php: Allow MW_FORCE_VERSION to pin the mw version (T369115)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
14:59 <dancy@deploy1002> Started scap sync-world: Backport for [[gerrit:1053752|MWMultiVersion.php: Allow MW_FORCE_VERSION to pin the mw version (T369115)]] [production]
14:26 <zabe@deploy1002> Finished scap: Backport for [[gerrit:1055614|Revert^2 "Set some site names for new-ish wikis" (T363270 T360303 T360310 T363263)]] (duration: 10m 54s) [production]
14:21 <zabe@deploy1002> zabe: Continuing with sync [production]
14:17 <zabe@deploy1002> zabe: Backport for [[gerrit:1055614|Revert^2 "Set some site names for new-ish wikis" (T363270 T360303 T360310 T363263)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
14:15 <zabe@deploy1002> Started scap sync-world: Backport for [[gerrit:1055614|Revert^2 "Set some site names for new-ish wikis" (T363270 T360303 T360310 T363263)]] [production]
14:08 <tchanders@deploy1002> Finished scap: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]], [[gerrit:1055937|Fix logic for handling enabling temporary accounts (T348895)]] (duration: 07m 11s) [production]
14:03 <tchanders@deploy1002> tchanders: Continuing with sync [production]
14:03 <tchanders@deploy1002> tchanders: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]], [[gerrit:1055937|Fix logic for handling enabling temporary accounts (T348895)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
14:01 <tchanders@deploy1002> Started scap sync-world: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]], [[gerrit:1055937|Fix logic for handling enabling temporary accounts (T348895)]] [production]
13:45 <tchanders@deploy1002> tchanders: Continuing with sync [production]
13:42 <tchanders@deploy1002> tchanders: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]], [[gerrit:1055937|Fix logic for handling enabling temporary accounts (T348895)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:39 <tchanders@deploy1002> Started scap sync-world: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]], [[gerrit:1055937|Fix logic for handling enabling temporary accounts (T348895)]] [production]
13:29 <tchanders@deploy1002> Sync cancelled. [production]
13:25 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on rdb1014.eqiad.wmnet with reason: Hardware issue [production]
13:25 <cgoubert@cumin1002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on rdb1014.eqiad.wmnet with reason: Hardware issue [production]
13:21 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on netbox1002.eqiad.wmnet with reason: Netbox 3 silencing [production]
13:20 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on netbox1002.eqiad.wmnet with reason: Netbox 3 silencing [production]
13:20 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on netbox2002.codfw.wmnet with reason: Netbox 3 silencing [production]
13:20 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on netbox2002.codfw.wmnet with reason: Netbox 3 silencing [production]
13:13 <tchanders@deploy1002> tchanders: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:11 <tchanders@deploy1002> Started scap sync-world: Backport for [[gerrit:1054921|Set Flow to read only on testwiki (T370322)]], [[gerrit:1054625|Enable temporary accounts on testwiki and loginwiki (T348895)]] [production]