2023-04-27
13:45 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache1003.eqiad.wmnet with reason: host reimage [production]
13:35 <akosiaris@deploy1002> helmfile [staging] DONE helmfile.d/services/machinetranslation: apply [production]
13:33 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-cache1003.eqiad.wmnet with OS bullseye [production]
13:31 <akosiaris@deploy1002> helmfile [staging] START helmfile.d/services/machinetranslation: apply [production]
13:30 <akosiaris@deploy1002> helmfile [staging] DONE helmfile.d/services/machinetranslation: apply [production]
13:28 <samtar@deploy1002> samtar and cmelo: Backport for [[gerrit:910056|Enable $wgCampaignEventsEnableMultipleOrganizers in production (T334088)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
13:26 <samtar@deploy1002> Started scap: Backport for [[gerrit:910056|Enable $wgCampaignEventsEnableMultipleOrganizers in production (T334088)]] [production]
13:20 <samtar@deploy1002> Finished scap: Backport for [[gerrit:910055|metawiki: Give campaignevents-organize-events to campaignevents-beta-tester only (T334088)]] (duration: 15m 07s) [production]
13:20 <akosiaris@deploy1002> helmfile [staging] START helmfile.d/services/machinetranslation: apply [production]
13:06 <samtar@deploy1002> samtar and cmelo: Backport for [[gerrit:910055|metawiki: Give campaignevents-organize-events to campaignevents-beta-tester only (T334088)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet [production]
13:05 <samtar@deploy1002> Started scap: Backport for [[gerrit:910055|metawiki: Give campaignevents-organize-events to campaignevents-beta-tester only (T334088)]] [production]
13:04 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache1002.eqiad.wmnet with OS bullseye [production]
12:56 <vgutierrez> restarting varnish on cp5021 and cp5029 to drop port 80 - T322774 [production]
12:43 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache1002.eqiad.wmnet with reason: host reimage [production]
12:40 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache1002.eqiad.wmnet with reason: host reimage [production]
12:29 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-cache1002.eqiad.wmnet with OS bullseye [production]
12:27 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache1001.eqiad.wmnet with OS bullseye [production]
12:12 <moritzm> imported puppet 5.5.22-2+deb13u3 to bookworm-wikimedia T330495 [production]
11:56 <jbond> upload python3-pypuppetdb_3.1.0-1_all.deb to bookworm [production]
11:47 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 23951 [production]
11:44 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'configure' for AS: 23951 [production]
11:44 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 54994 [production]
11:41 <krinkle@deploy1002> Synchronized wmf-config/: I195978cbd61d80 (duration: 06m 29s) [production]
11:14 <hnowlan@puppetmaster1001> conftool action : set/weight=6; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet [production]
11:13 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'configure' for AS: 54994 [production]
11:09 <hnowlan@deploy2002> helmfile [eqiad] DONE helmfile.d/services/thumbor: apply [production]
11:09 <vgutierrez> restarting varnish on cp5022 and cp5030 to drop port 80 - T322774 [production]
11:07 <hnowlan@deploy2002> helmfile [eqiad] START helmfile.d/services/thumbor: apply [production]
11:03 <hnowlan@deploy2002> helmfile [codfw] DONE helmfile.d/services/thumbor: apply [production]
11:00 <hnowlan@deploy2002> helmfile [codfw] START helmfile.d/services/thumbor: apply [production]
10:59 <hnowlan@deploy2002> helmfile [staging] DONE helmfile.d/services/thumbor: apply [production]
10:59 <hnowlan@deploy2002> helmfile [staging] START helmfile.d/services/thumbor: apply [production]
10:33 <vgutierrez> restarting varnish on cp5023 and cp5031 to drop port 80 - T322774 [production]
10:24 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache1001.eqiad.wmnet with reason: host reimage [production]
10:20 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache1001.eqiad.wmnet with reason: host reimage [production]
10:09 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-cache1001.eqiad.wmnet with OS bullseye [production]
10:05 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1002.wikimedia.org [production]
10:04 <elukey@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ml-cache1001.eqiad.wmnet with OS bullseye [production]
10:01 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host idp-test1002.wikimedia.org [production]
10:00 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet [production]
09:55 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-cache1001.eqiad.wmnet with OS bullseye [production]
09:54 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet [production]
09:54 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-cache1001.eqiad.wmnet with OS bullseye [production]
09:43 <cgoubert@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
09:42 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
09:42 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure [production]
09:42 <cgoubert@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure [production]
09:41 <cgoubert@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure [production]
09:41 <cgoubert@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure [production]
09:41 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2330.codfw.wmnet with reason: PSU failure [production]