2026-02-24
14:43 <arnaudb@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage [production]
14:38 <btullis> failing over HDFS namenode services to an-master1003 for T414948 [analytics]
14:37 <arnaudb@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage [production]
14:34 <slyngshede@cumin1003> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2045.codfw.wmnet with OS trixie [production]
14:29 <slyngshede@cumin1003> START - Cookbook sre.hosts.reimage for host cp2045.codfw.wmnet with OS trixie [production]
14:17 <arnaudb@cumin1003> START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bookworm [production]
14:14 <arnaudb@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bookworm [production]
14:11 <awight@deploy2002> Finished scap sync-world: Backport for [[gerrit:1243047|Subreferencing pilot wikis, phase 2 (T418209)]] (duration: 08m 16s) [production]
14:07 <awight@deploy2002> awight: Continuing with sync [production]
14:05 <awight@deploy2002> awight: Backport for [[gerrit:1243047|Subreferencing pilot wikis, phase 2 (T418209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
14:03 <awight@deploy2002> Started scap sync-world: Backport for [[gerrit:1243047|Subreferencing pilot wikis, phase 2 (T418209)]] [production]
13:54 <arnaudb@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage [production]
13:53 <slyngshede@cumin1003> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2045.codfw.wmnet with OS trixie [production]
13:50 <arnaudb@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage [production]
13:45 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . [production]
13:44 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . [production]
13:38 <slyngshede@cumin1003> START - Cookbook sre.hosts.reimage for host cp2045.codfw.wmnet with OS trixie [production]
13:30 <arnaudb@cumin1003> START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bookworm [production]
13:29 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-fe[2001-2002].codfw.wmnet [production]
13:29 <mvernon@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
13:29 <mvernon@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-fe[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002" [production]
13:29 <mvernon@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-fe[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002" [production]
13:27 <arnaudb@cumin1003> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gerrit-replica.discovery.wmnet gerrit-spare.discovery.wmnet on all recursors [production]
13:27 <arnaudb@cumin1003> START - Cookbook sre.dns.wipe-cache gerrit-replica.discovery.wmnet gerrit-spare.discovery.wmnet on all recursors [production]
13:27 <arnaudb@dns1004> END - running authdns-update [production]
13:27 <fceratto@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
13:26 <fceratto@dns1004> END - running authdns-update [production]
13:26 <arnaudb@dns1004> START - running authdns-update [production]
13:25 <fceratto@dns1004> START - running authdns-update [production]
13:24 <fceratto@cumin1003> START - Cookbook sre.dns.netbox [production]
13:24 <fceratto@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
13:24 <fceratto@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Deploy manual changes from netbox - fceratto@cumin1003" [production]
13:21 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
13:20 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
13:20 <arnaudb@cumin1003> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gerrit-replica.discovery.wmnet gerrit-spare.discovery.wmnet on all recursors [production]
13:20 <arnaudb@cumin1003> START - Cookbook sre.dns.wipe-cache gerrit-replica.discovery.wmnet gerrit-spare.discovery.wmnet on all recursors [production]
13:15 <fceratto@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Deploy manual changes from netbox - fceratto@cumin1003" [production]
13:14 <brouberol@deploy2002> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. [production]
13:14 <brouberol@deploy2002> helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. [production]
13:11 <fceratto@cumin1003> START - Cookbook sre.dns.netbox [production]
13:07 <fceratto@dns1004> START - running authdns-update [production]
13:04 <fceratto@dns1004> START - running authdns-update [production]
13:02 <arnaudb@dns1004> START - running authdns-update [production]
13:01 <fceratto@dns1004> START - running authdns-update [production]
12:55 <slyngshede@cumin1003> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp2045.codfw.wmnet with OS trixie [production]
12:52 <fceratto@dns1004> START - running authdns-update [production]
12:40 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' . [production]
12:38 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . [production]
12:38 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . [production]
12:38 <dpogorzelski@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]