2024-05-02 §
10:35 <jmm@cumin2002> START - Cookbook sre.dns.netbox [production]
10:25 <marostegui@cumin1002> dbctl commit (dc=all): 'db2152 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P61691 and previous config saved to /var/cache/conftool/dbconfig/20240502-102518-root.json [production]
10:22 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db2213.codfw.wmnet [production]
10:20 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61690 and previous config saved to /var/cache/conftool/dbconfig/20240502-102053-marostegui.json [production]
10:20 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet [production]
10:15 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2211.codfw.wmnet [production]
10:11 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet [production]
10:10 <marostegui@cumin1002> dbctl commit (dc=all): 'db2152 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61689 and previous config saved to /var/cache/conftool/dbconfig/20240502-101012-root.json [production]
10:05 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P61688 and previous config saved to /var/cache/conftool/dbconfig/20240502-100546-marostegui.json [production]
10:00 <btullis@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cephosd1005.eqiad.wmnet with OS bookworm [production]
09:58 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db2211.codfw.wmnet [production]
09:58 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2192.codfw.wmnet [production]
09:55 <marostegui@cumin1002> dbctl commit (dc=all): 'db2152 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61687 and previous config saved to /var/cache/conftool/dbconfig/20240502-095506-root.json [production]
09:54 <moritzm> installing util-linux security updates [production]
09:50 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61686 and previous config saved to /var/cache/conftool/dbconfig/20240502-095038-marostegui.json [production]
09:50 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db2192.codfw.wmnet [production]
09:42 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2178.codfw.wmnet [production]
09:40 <marostegui@cumin1002> dbctl commit (dc=all): 'db2152 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61685 and previous config saved to /var/cache/conftool/dbconfig/20240502-094000-root.json [production]
09:38 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues [production]
09:38 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2179 (T361627)', diff saved to https://phabricator.wikimedia.org/P61684 and previous config saved to /var/cache/conftool/dbconfig/20240502-093827-marostegui.json [production]
09:38 <jayme@cumin1002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2382.codfw.wmnet with reason: Degraded RAID/storage controller issues [production]
09:38 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance [production]
09:38 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2179.codfw.wmnet with reason: Maintenance [production]
09:38 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61683 and previous config saved to /var/cache/conftool/dbconfig/20240502-093803-marostegui.json [production]
09:35 <btullis@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage [production]
09:32 <btullis@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on cephosd1005.eqiad.wmnet with reason: host reimage [production]
09:29 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2152.codfw.wmnet with OS bookworm [production]
09:26 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db2178.codfw.wmnet [production]
09:24 <marostegui@cumin1002> dbctl commit (dc=all): 'db2152 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61682 and previous config saved to /var/cache/conftool/dbconfig/20240502-092454-root.json [production]
09:22 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61681 and previous config saved to /var/cache/conftool/dbconfig/20240502-092256-marostegui.json [production]
09:18 <hnowlan> depooling 5 appservers in advance of migrating them to k8s workers [production]
09:18 <stevemunene@deploy1002> helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main [production]
09:13 <stevemunene@deploy1002> helmfile [eqiad] START helmfile.d/services/datahub: sync on main [production]
09:13 <stevemunene@deploy1002> helmfile [codfw] DONE helmfile.d/services/datahub: sync on main [production]
09:12 <btullis@cumin1002> START - Cookbook sre.hosts.reimage for host cephosd1005.eqiad.wmnet with OS bookworm [production]
09:10 <btullis@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cephosd1005.eqiad.wmnet with OS bookworm [production]
09:09 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2152.codfw.wmnet with reason: host reimage [production]
09:08 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2171.codfw.wmnet [production]
09:07 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P61680 and previous config saved to /var/cache/conftool/dbconfig/20240502-090748-marostegui.json [production]
09:06 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2152.codfw.wmnet with reason: host reimage [production]
09:03 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance [production]
09:02 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance [production]
09:02 <stevemunene@deploy1002> helmfile [codfw] START helmfile.d/services/datahub: sync on main [production]
08:59 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db2171.codfw.wmnet [production]
08:52 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61679 and previous config saved to /var/cache/conftool/dbconfig/20240502-085241-marostegui.json [production]
08:50 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2157.codfw.wmnet [production]
08:49 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bookworm [production]
08:40 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2172 (T361627)', diff saved to https://phabricator.wikimedia.org/P61677 and previous config saved to /var/cache/conftool/dbconfig/20240502-084041-marostegui.json [production]
08:40 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance [production]
08:40 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2172.codfw.wmnet with reason: Maintenance [production]