2025-09-08
11:05 <brouberol@deploy1003> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
11:04 <brouberol@deploy1003> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
11:03 <brouberol@deploy1003> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
11:01 <brouberol@deploy1003> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
10:58 <brouberol@deploy1003> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
10:57 <brouberol@deploy1003> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
10:53 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install3004.wikimedia.org with reason: host reimage [production]
10:52 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doh3006.wikimedia.org [production]
10:48 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host doh3006.wikimedia.org [production]
10:48 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on install3004.wikimedia.org with reason: host reimage [production]
10:46 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Depooling db1169 (T402925)', diff saved to https://phabricator.wikimedia.org/P82690 and previous config saved to /var/cache/conftool/dbconfig/20250908-104652-ladsgroup.json [production]
10:46 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance [production]
10:46 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1163 (T402925)', diff saved to https://phabricator.wikimedia.org/P82689 and previous config saved to /var/cache/conftool/dbconfig/20250908-104629-ladsgroup.json [production]
10:44 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1185 (T401906)', diff saved to https://phabricator.wikimedia.org/P82688 and previous config saved to /var/cache/conftool/dbconfig/20250908-104413-fceratto.json [production]
10:44 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1185.eqiad.wmnet with reason: Maintenance [production]
10:43 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T401906)', diff saved to https://phabricator.wikimedia.org/P82687 and previous config saved to /var/cache/conftool/dbconfig/20250908-104350-fceratto.json [production]
10:38 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet [production]
10:36 <dcaro@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers (exit_code=0) for tools-k8s-worker-nfs-1, tools-k8s-worker-nfs-10, tools-k8s-worker-nfs-11, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-13, tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-16, tools-k8s-worker-nfs-17, tools-k8s-worker-nfs-19, tools-k8s-worker-nfs-2, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-22, tools-k8s-wor [tools]
10:33 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet [production]
10:31 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P82686 and previous config saved to /var/cache/conftool/dbconfig/20250908-103122-ladsgroup.json [production]
10:28 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P82685 and previous config saved to /var/cache/conftool/dbconfig/20250908-102842-fceratto.json [production]
10:26 <dcaro@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers (exit_code=0) for tools-k8s-worker-nfs-1, tools-k8s-worker-nfs-10, tools-k8s-worker-nfs-11, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-13, tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-16, tools-k8s-worker-nfs-17, tools-k8s-worker-nfs-19, tools-k8s-worker-nfs-2, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-22, tools-k8s-wor [tools]
10:20 <dcaro@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.run_tests (exit_code=99) [tools]
10:17 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1003.eqiad.wmnet [production]
10:16 <cmooney@cumin1003> START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin[1002-1003].eqiad.wmnet with reason: Update wmf-plugin IBGP output - cmooney@cumin1003 [production]
10:16 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P82684 and previous config saved to /var/cache/conftool/dbconfig/20250908-101614-ladsgroup.json [production]
10:16 <dcaro@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers for tools-k8s-worker-nfs-1, tools-k8s-worker-nfs-10, tools-k8s-worker-nfs-11, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-13, tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-16, tools-k8s-worker-nfs-17, tools-k8s-worker-nfs-19, tools-k8s-worker-nfs-2, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-22, tools-k8s-worker-nfs-23, tools-k [tools]
10:13 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P82683 and previous config saved to /var/cache/conftool/dbconfig/20250908-101334-fceratto.json [production]
10:11 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host install3004.wikimedia.org with OS bookworm [production]
10:10 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1003.eqiad.wmnet [production]
10:08 <ayounsi@cumin1003> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow3003.esams.wmnet [production]
10:08 <ayounsi@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
10:08 <ayounsi@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow3003.esams.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003" [production]
10:07 <ayounsi@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow3003.esams.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003" [production]
10:06 <dcaro@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers for tools-k8s-worker-nfs-1, tools-k8s-worker-nfs-10, tools-k8s-worker-nfs-11, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-13, tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-16, tools-k8s-worker-nfs-17, tools-k8s-worker-nfs-19, tools-k8s-worker-nfs-2, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-22, tools-k8s-worker-nfs-23, tools-k [tools]
10:06 <dcaro@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers (exit_code=99) for tools-k8s-worker-nfs-1, tools-k8s-worker-nfs-10, tools-k8s-worker-nfs-11, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-13, tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-16, tools-k8s-worker-nfs-17, tools-k8s-worker-nfs-19, tools-k8s-worker-nfs-2, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-22, tools-k8s-wo [tools]
10:06 <dcaro@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.worker.upgrade_workers for tools-k8s-worker-nfs-1, tools-k8s-worker-nfs-10, tools-k8s-worker-nfs-11, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-13, tools-k8s-worker-nfs-14, tools-k8s-worker-nfs-16, tools-k8s-worker-nfs-17, tools-k8s-worker-nfs-19, tools-k8s-worker-nfs-2, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-22, tools-k8s-worker-nfs-23, tools-k [tools]
10:05 <wmbot~dcaro@acme> END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-68 (T402378) [tools]
10:04 <ayounsi@cumin1003> START - Cookbook sre.dns.netbox [production]
10:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1012.eqiad.wmnet [production]
10:01 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1163 (T402925)', diff saved to https://phabricator.wikimedia.org/P82682 and previous config saved to /var/cache/conftool/dbconfig/20250908-100107-ladsgroup.json [production]
10:01 <dcaro@cloudcumin1001> START - Cookbook wmcs.toolforge.run_tests [tools]
10:00 <dcaro@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.run_tests (exit_code=0) [tools]
09:59 <ayounsi@cumin1003> START - Cookbook sre.hosts.decommission for hosts netflow3003.esams.wmnet [production]
09:59 <wmbot~dcaro@acme> START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-68 (T402378) [tools]
09:58 <wmbot~dcaro@acme> END (ERROR) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=97) for tools-k8s-worker-nfs-68: (T402378) [tools]
09:58 <wmbot~dcaro@acme> START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-68: (T402378) [tools]
09:58 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T401906)', diff saved to https://phabricator.wikimedia.org/P82681 and previous config saved to /var/cache/conftool/dbconfig/20250908-095826-fceratto.json [production]
09:56 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1161 (T401906)', diff saved to https://phabricator.wikimedia.org/P82680 and previous config saved to /var/cache/conftool/dbconfig/20250908-095602-fceratto.json [production]
09:55 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]