2025-01-27
11:34 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P72463 and previous config saved to /var/cache/conftool/dbconfig/20250127-113431-marostegui.json [production]
11:33 <tchin@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply [production]
11:33 <tchin@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply [production]
11:32 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rpki1001.eqiad.wmnet with reason: host reimage [production]
11:27 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on rpki1001.eqiad.wmnet with reason: host reimage [production]
11:25 <aikochou@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' . [production]
11:20 <topranks> installing updated JunOS image on cr1-magru T384774 [production]
11:19 <jmm@cumin2002> START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2003.codfw.wmnet to drbd [production]
11:19 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1200 (T384592)', diff saved to https://phabricator.wikimedia.org/P72462 and previous config saved to /var/cache/conftool/dbconfig/20250127-111924-marostegui.json [production]
11:17 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host rpki1001.eqiad.wmnet with OS bookworm [production]
11:14 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2025.codfw.wmnet [production]
11:14 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet [production]
11:04 <root@cumin1002> END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2185.codfw.wmnet [production]
11:04 <jmm@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host rpki1001.eqiad.wmnet with OS bookworm [production]
10:59 <root@cumin1002> START - Cookbook sre.mysql.upgrade for db2185.codfw.wmnet [production]
10:59 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1200 (T384592)', diff saved to https://phabricator.wikimedia.org/P72461 and previous config saved to /var/cache/conftool/dbconfig/20250127-105944-marostegui.json [production]
10:59 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1200.eqiad.wmnet with reason: Maintenance [production]
10:59 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1185 (T384592)', diff saved to https://phabricator.wikimedia.org/P72460 and previous config saved to /var/cache/conftool/dbconfig/20250127-105922-marostegui.json [production]
10:56 <cmooney@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw1-b[3-4]-magru.mgmt with reason: upgrading JunOS on magru core routers [production]
10:47 <topranks> rebooting cr2-magru to complete upgrade T384774 [production]
10:44 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P72459 and previous config saved to /var/cache/conftool/dbconfig/20250127-104415-marostegui.json [production]
10:43 <vgutierrez> testing pybal 1.15.15 in lvs4010 [production]
10:43 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet [production]
10:43 <root@cumin1002> START - Cookbook sre.hosts.reimage for host db1171.eqiad.wmnet with OS bookworm [production]
10:42 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet [production]
10:42 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1004.wikimedia.org [production]
10:38 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host idp-test1004.wikimedia.org [production]
10:37 <jynus@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1171.eqiad.wmnet with reason: reimage [production]
10:36 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host rpki1001.eqiad.wmnet with OS bookworm [production]
10:36 <jmm@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host rpki1001.eqiad.wmnet with OS bookworm [production]
10:34 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host rpki1001.eqiad.wmnet with OS bookworm [production]
10:34 <fabfur> installing haproxykafka on esams (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1114329) (T378578) [production]
10:33 <jmm@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host rpki1001.eqiad.wmnet with OS bookworm [production]
10:29 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P72458 and previous config saved to /var/cache/conftool/dbconfig/20250127-102908-marostegui.json [production]
10:26 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool es1024 T384820', diff saved to https://phabricator.wikimedia.org/P72457 and previous config saved to /var/cache/conftool/dbconfig/20250127-102657-marostegui.json [production]
10:24 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2004.wikimedia.org [production]
10:20 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host idp-test2004.wikimedia.org [production]
10:20 <topranks> installing updated JunOS image on cr2-magru T384774 [production]
10:16 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore[1007,1009].eqiad.wmnet with reason: Index rebuild + upgrade [production]
10:14 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1185 (T384592)', diff saved to https://phabricator.wikimedia.org/P72456 and previous config saved to /var/cache/conftool/dbconfig/20250127-101401-marostegui.json [production]
10:13 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org [production]
10:11 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore[1007,1009].eqiad.wmnet with reason: Index rebuild + upgrade [production]
10:09 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org [production]
10:07 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2020.codfw.wmnet [production]
10:04 <cmooney@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on cr[1-2]-magru,cr[1-2]-magru IPv6 with reason: upgrading JunOS on magru core routers [production]
09:54 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1185 (T384592)', diff saved to https://phabricator.wikimedia.org/P72455 and previous config saved to /var/cache/conftool/dbconfig/20250127-095416-marostegui.json [production]
09:54 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1185.eqiad.wmnet with reason: Maintenance [production]
09:53 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1161 (T384592)', diff saved to https://phabricator.wikimedia.org/P72454 and previous config saved to /var/cache/conftool/dbconfig/20250127-095354-marostegui.json [production]
09:47 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host rpki1001.eqiad.wmnet with OS bookworm [production]
09:47 <moritzm> reimaging rpki1001 to bookworm [production]