2026-01-19
12:27 <ayounsi@cumin1003> START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
12:23 <ayounsi@cumin1003> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
12:18 <cgoubert@cumin1003> START - Cookbook sre.hosts.decommission for hosts wikikube-worker[2019-2032].codfw.wmnet [production]
12:16 <ayounsi@cumin1003> START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
12:16 <cgoubert@cumin1003> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wikikube-worker[2003-2004,2007-2010,2040,2043,2045,2048].codfw.wmnet [production]
12:16 <cgoubert@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
12:16 <cgoubert@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2003-2004,2007-2010,2040,2043,2045,2048].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1003" [production]
12:16 <cgoubert@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[2003-2004,2007-2010,2040,2043,2045,2048].codfw.wmnet decommissioned, removing all IPs except the asset tag one - cgoubert@cumin1003" [production]
12:12 <cgoubert@cumin1003> START - Cookbook sre.dns.netbox [production]
12:08 <ayounsi@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2003.codfw.wmnet [production]
11:59 <ayounsi@cumin1003> START - Cookbook sre.hosts.reboot-single for host sretest2003.codfw.wmnet [production]
11:51 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet [production]
11:50 <cgoubert@cumin1003> START - Cookbook sre.hosts.decommission for hosts wikikube-worker[2003-2004,2007-2010,2040,2043,2045,2048].codfw.wmnet [production]
11:46 <moritzm> installing openjpeg2 security updates [production]
11:44 <btullis@cumin1003> START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet [production]
11:43 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet [production]
11:43 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-search: apply [production]
11:43 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-search: apply [production]
11:38 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance [production]
11:37 <marostegui@cumin1003> dbctl commit (dc=all): 'Depool db1160 T414542', diff saved to https://phabricator.wikimedia.org/P87748 and previous config saved to /var/cache/conftool/dbconfig/20260119-113722-marostegui.json [production]
11:35 <btullis@cumin1003> START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet [production]
11:35 <marostegui@cumin1003> dbctl commit (dc=all): 'Promote db1244 to s4 primary T414542', diff saved to https://phabricator.wikimedia.org/P87747 and previous config saved to /var/cache/conftool/dbconfig/20260119-113518-marostegui.json [production]
11:34 <marostegui> Starting s4 eqiad failover from db1160 to db1244 - T414542 [production]
11:34 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply [production]
11:34 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply [production]
11:33 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet [production]
11:31 <vgutierrez@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp7004.magru.wmnet [production]
11:31 <vgutierrez@cumin1003> START - Cookbook sre.hosts.remove-downtime for cp7004.magru.wmnet [production]
11:30 <vgutierrez@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 111 hosts [production]
11:29 <vgutierrez@cumin1003> START - Cookbook sre.hosts.remove-downtime for 111 hosts [production]
11:28 <marostegui@cumin1003> dbctl commit (dc=all): 'Set db1244 with weight 0 T414542', diff saved to https://phabricator.wikimedia.org/P87746 and previous config saved to /var/cache/conftool/dbconfig/20260119-112825-marostegui.json [production]
11:28 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 T414542 [production]
11:26 <btullis@cumin1003> START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet [production]
10:58 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply [production]
10:56 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply [production]
10:55 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
10:54 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
10:52 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
10:51 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
10:39 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262 (T413525)', diff saved to https://phabricator.wikimedia.org/P87745 and previous config saved to /var/cache/conftool/dbconfig/20260119-103917-marostegui.json [production]
10:29 <Emperor> restart apus rgws in eqiad [production]
10:29 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87744 and previous config saved to /var/cache/conftool/dbconfig/20260119-102909-marostegui.json [production]
10:24 <ayounsi@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2003.codfw.wmnet [production]
10:19 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87743 and previous config saved to /var/cache/conftool/dbconfig/20260119-101901-marostegui.json [production]
10:12 <ayounsi@cumin1003> START - Cookbook sre.hosts.reboot-single for host sretest2003.codfw.wmnet [production]
10:11 <marostegui@cumin1003> dbctl commit (dc=all): 'Depooling db2240 (T413525)', diff saved to https://phabricator.wikimedia.org/P87742 and previous config saved to /var/cache/conftool/dbconfig/20260119-101136-marostegui.json [production]
10:11 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance [production]
10:11 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2248 (T413525)', diff saved to https://phabricator.wikimedia.org/P87741 and previous config saved to /var/cache/conftool/dbconfig/20260119-101111-marostegui.json [production]
10:08 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262 (T413525)', diff saved to https://phabricator.wikimedia.org/P87740 and previous config saved to /var/cache/conftool/dbconfig/20260119-100852-marostegui.json [production]
10:01 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P87739 and previous config saved to /var/cache/conftool/dbconfig/20260119-100103-marostegui.json [production]