2023-05-05
13:23 <jhathaway@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: New kernel, T335835 [production]
13:20 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P47766 and previous config saved to /var/cache/conftool/dbconfig/20230505-132050-ladsgroup.json [production]
13:14 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host cephosd1004.eqiad.wmnet [production]
13:14 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1003.eqiad.wmnet [production]
13:13 <andrewbogott> rebooting cloudbackup2001.codfw.wmnet, unresponsive [production]
13:05 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P47765 and previous config saved to /var/cache/conftool/dbconfig/20230505-130544-ladsgroup.json [production]
13:05 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host cephosd1003.eqiad.wmnet [production]
13:05 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1002.eqiad.wmnet [production]
12:57 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host cephosd1002.eqiad.wmnet [production]
12:56 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1001.eqiad.wmnet [production]
12:50 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1188 (T335845)', diff saved to https://phabricator.wikimedia.org/P47764 and previous config saved to /var/cache/conftool/dbconfig/20230505-125038-ladsgroup.json [production]
12:46 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host cephosd1001.eqiad.wmnet [production]
12:44 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1188 (T335845)', diff saved to https://phabricator.wikimedia.org/P47763 and previous config saved to /var/cache/conftool/dbconfig/20230505-124412-ladsgroup.json [production]
12:44 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance [production]
12:43 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance [production]
12:43 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182 (T335845)', diff saved to https://phabricator.wikimedia.org/P47762 and previous config saved to /var/cache/conftool/dbconfig/20230505-124349-ladsgroup.json [production]
12:31 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet [production]
12:28 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P47761 and previous config saved to /var/cache/conftool/dbconfig/20230505-122843-ladsgroup.json [production]
12:24 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet [production]
12:13 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P47760 and previous config saved to /var/cache/conftool/dbconfig/20230505-121336-ladsgroup.json [production]
12:06 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1001.eqiad.wmnet [production]
11:59 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host an-mariadb1001.eqiad.wmnet [production]
11:58 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-db1002.eqiad.wmnet [production]
11:58 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182 (T335845)', diff saved to https://phabricator.wikimedia.org/P47759 and previous config saved to /var/cache/conftool/dbconfig/20230505-115830-ladsgroup.json [production]
11:52 <btullis@cumin1001> START - Cookbook sre.hosts.reboot-single for host an-db1002.eqiad.wmnet [production]
11:51 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1182 (T335845)', diff saved to https://phabricator.wikimedia.org/P47758 and previous config saved to /var/cache/conftool/dbconfig/20230505-115126-ladsgroup.json [production]
11:51 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance [production]
11:51 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance [production]
11:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P47757 and previous config saved to /var/cache/conftool/dbconfig/20230505-112649-ladsgroup.json [production]
11:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P47756 and previous config saved to /var/cache/conftool/dbconfig/20230505-112605-ladsgroup.json [production]
11:11 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P47755 and previous config saved to /var/cache/conftool/dbconfig/20230505-111145-ladsgroup.json [production]
11:11 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P47754 and previous config saved to /var/cache/conftool/dbconfig/20230505-111100-ladsgroup.json [production]
10:56 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P47753 and previous config saved to /var/cache/conftool/dbconfig/20230505-105640-ladsgroup.json [production]
10:55 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P47752 and previous config saved to /var/cache/conftool/dbconfig/20230505-105555-ladsgroup.json [production]
10:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P47751 and previous config saved to /var/cache/conftool/dbconfig/20230505-104135-ladsgroup.json [production]
10:41 <moritzm> installing wireshark security updates [production]
10:40 <ladsgroup@cumin1001> dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P47750 and previous config saved to /var/cache/conftool/dbconfig/20230505-104050-ladsgroup.json [production]
09:28 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Host sad (T336033) [production]
09:28 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Host sad (T336033) [production]
09:14 <Amir1> power cycled db1170 [production]
09:10 <marostegui> Failover m2-master from dbproxy1013 to dbproxy1015 [production]
09:08 <hnowlan@deploy1002> Finished deploy [restbase/deploy@8aba801]: deploying to host missing from configs (duration: 01m 22s) [production]
09:06 <hnowlan@deploy1002> Started deploy [restbase/deploy@8aba801]: deploying to host missing from configs [production]
08:58 <XioNoX> deploy CR914772 on all hosts running Bird [production]
08:15 <godog> delete wal and chunks_head from prometheus5002 and prometheus4002 to let prometheus start back up and not crashloop - T309979 [production]
08:07 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host netflow2003.codfw.wmnet with OS bookworm [production]
08:05 <hashar@deploy1002> Finished deploy [integration/docroot@78e6f40]: build: Updating eslint-config-wikimedia to 0.25.0 (duration: 00m 13s) [production]
08:04 <hashar@deploy1002> Started deploy [integration/docroot@78e6f40]: build: Updating eslint-config-wikimedia to 0.25.0 [production]
07:32 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance [production]
07:31 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12 days, 12:00:00 on db1106.eqiad.wmnet with reason: Maintenance [production]