2023-04-24
08:25 <btullis@cumin1001> START - Cookbook sre.hosts.dhcp for host an-worker1110.eqiad.wmnet [production]
08:21 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware [production]
08:21 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware [production]
08:20 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 10 hosts with reason: Enabling replication T335266 [production]
08:20 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 0:15:00 on 10 hosts with reason: Enabling replication T335266 [production]
08:20 <marostegui> Enable replication eqiad -> codfw on x1 dbmaint eqiad T335266 [production]
08:18 <cgoubert@deploy2002> Started scap: testing T329857 [production]
08:17 <marostegui> Enable replication eqiad -> codfw on es5 dbmaint eqiad T335266 [production]
08:14 <claime> Deploying 909302 on deploy2002 for T329857 [production]
08:10 <claime> Disabling puppet on deploy2002 - T329857 [production]
08:09 <claime> Deploying 909302 on deploy1002 for T329857 [production]
08:08 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 6 hosts with reason: Enabling replication T335266 [production]
08:08 <marostegui> Enable replication eqiad -> codfw on es4 dbmaint eqiad T335266 [production]
08:08 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 0:15:00 on 6 hosts with reason: Enabling replication T335266 [production]
08:07 <marostegui> Enable replication eqiad -> codfw on pc3 dbmaint eqiad T335266 [production]
08:06 <marostegui> Enable replication eqiad -> codfw on pc2 dbmaint eqiad T335266 [production]
08:05 <marostegui> Enable replication eqiad -> codfw on pc1 dbmaint eqiad T335266 [production]
07:53 <mvernon@cumin2002> END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.41 in codfw [production]
07:51 <mvernon@cumin2002> START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.41 in codfw [production]
07:45 <jelto@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1004.wikimedia.org with OS bullseye [production]
07:44 <mvernon@cumin2002> END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.59 in codfw [production]
07:42 <mvernon@cumin2002> START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.59 in codfw [production]
07:39 <dcausse> restarting blazegraph on wdqs1005 (stuck for 3+days) [production]
07:38 <mvernon@cumin2002> END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.4a in codfw [production]
07:36 <mvernon@cumin2002> START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.4a in codfw [production]
07:24 <jelto@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage [production]
07:21 <jelto@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage [production]
07:06 <jelto@cumin2002> START - Cookbook sre.hosts.reimage for host gitlab1004.wikimedia.org with OS bullseye [production]
2023-04-22
05:41 <joe> <thumbor/codfw>$ helmfile --state-values-set roll_restart=1 -e codfw sync [production]
05:40 <oblivian@deploy2002> helmfile [codfw] DONE helmfile.d/services/thumbor: sync [production]
05:39 <oblivian@deploy2002> helmfile [codfw] START helmfile.d/services/thumbor: sync [production]
05:39 <oblivian@deploy2002> helmfile [codfw] DONE helmfile.d/services/thumbor: apply [production]
05:39 <oblivian@deploy2002> helmfile [codfw] START helmfile.d/services/thumbor: apply [production]
05:15 <hashar@deploy2002> Finished deploy [integration/docroot@b816911]: Update Grafana URL (duration: 00m 11s) [production]
05:15 <hashar@deploy2002> Started deploy [integration/docroot@b816911]: Update Grafana URL [production]
05:10 <joe> sudo cumin -b 1 -s 20 'A:swift-fe-codfw' 'systemctl restart swift-proxy.service' [production]
04:33 <vgutierrez> restart haproxy on cp1087 - T334448 [production]
2023-04-21
18:27 <mvernon@cumin2002> END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw [production]
18:25 <mvernon@cumin2002> START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.a8 in codfw [production]
15:57 <lucaswerkmeister-wmde@deploy2002> Finished scap: Backport for [[gerrit:910780|Set wmgUseGraphWithJsonNamespace = true for mediawikiwiki (T124748 T335130)]] (duration: 10m 01s) [production]
15:48 <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde: Backport for [[gerrit:910780|Set wmgUseGraphWithJsonNamespace = true for mediawikiwiki (T124748 T335130)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
15:47 <lucaswerkmeister-wmde@deploy2002> Started scap: Backport for [[gerrit:910780|Set wmgUseGraphWithJsonNamespace = true for mediawikiwiki (T124748 T335130)]] [production]
12:18 <duesen> reverted monky-patch, mwdebug2001 and deploy2002 are back to wmf/1.41.0-wmf.5 (T335183) [production]
11:56 <duesen> monky-patching Ib11a871ff on mwdebug2001 to investigate T335183 [production]
09:03 <Amir1> finish of the wikibase populate sites table [production]
08:35 <Amir1> start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https [production]
03:19 <eileen> civicrm upgraded from 5b63c2b2 to 0fad720a [production]
03:11 <eileen> civicrm upgraded from a2e7c079 to 5b63c2b2 [production]
01:41 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2011.codfw.wmnet with OS bullseye [production]
01:41 <pt1979@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002" [production]