2022-02-18 §
12:08 <cmooney@cumin1001> START - Cookbook sre.dns.netbox [production]
12:08 <cmooney@cumin1001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]
12:05 <cmooney@cumin1001> START - Cookbook sre.dns.netbox [production]
11:56 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21019 and previous config saved to /var/cache/conftool/dbconfig/20220218-115608-kormat.json [production]
11:54 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1017.eqiad.wmnet with OS buster [production]
11:43 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1017.eqiad.wmnet with reason: host reimage [production]
11:41 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1017.eqiad.wmnet with reason: host reimage [production]
11:41 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21018 and previous config saved to /var/cache/conftool/dbconfig/20220218-114103-kormat.json [production]
11:27 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti1017.eqiad.wmnet with OS buster [production]
11:25 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21017 and previous config saved to /var/cache/conftool/dbconfig/20220218-112558-kormat.json [production]
11:05 <kormat@cumin1001> dbctl commit (dc=all): 'Depooling db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21016 and previous config saved to /var/cache/conftool/dbconfig/20220218-110506-kormat.json [production]
11:05 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [production]
11:05 <kormat@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance [production]
11:04 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21015 and previous config saved to /var/cache/conftool/dbconfig/20220218-110459-kormat.json [production]
10:50 <moritzm> installing zsh security updates on stretch [production]
10:49 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21014 and previous config saved to /var/cache/conftool/dbconfig/20220218-104954-kormat.json [production]
10:43 <Emperor> truncate swift/server.log.1 to 10G on thanos-be2001 T301657 [production]
10:37 <Emperor> rsyslog-rotate to clear held-open server.log.1 (ms-be[2028-2030,2032,2037-2038,2040,2046-2047,2050-2051,2053-2054,2057,2060,2063,2065].codfw.wmnet,ms-be[1028-1031,1035-1038,1042,1046,1048-1049,1054,1058-1060,1065,1067].eqiad.wmnet,thanos-be2001.codfw.wmnet) T301657 [production]
10:34 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21013 and previous config saved to /var/cache/conftool/dbconfig/20220218-103449-kormat.json [production]
10:20 <godog> truncate /var/log/swift/server.log.1 to 30G due to full root fs - T301657 [production]
10:19 <kormat@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21012 and previous config saved to /var/cache/conftool/dbconfig/20220218-101945-kormat.json [production]
10:01 <kormat@cumin1001> dbctl commit (dc=all): 'Depooling db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21011 and previous config saved to /var/cache/conftool/dbconfig/20220218-100135-kormat.json [production]
10:01 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance [production]
10:01 <kormat@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance [production]
10:00 <kormat> deploying schema change to s2 T300774 [production]
09:35 <moritzm> draining instances off ganeti1009 [production]
09:33 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1022.eqiad.wmnet with OS buster [production]
09:02 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1022.eqiad.wmnet with reason: host reimage [production]
09:01 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM testvm2001.codfw.wmnet [production]
08:58 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1022.eqiad.wmnet with reason: host reimage [production]
08:57 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2002.codfw.wmnet [production]
08:54 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM testvm2002.codfw.wmnet [production]
08:53 <kart_> Updated cxserver to 2022-02-15-050044-production (T301443) [production]
08:52 <kartik@deploy1002> helmfile [eqiad] DONE helmfile.d/services/cxserver: apply [production]
08:50 <kartik@deploy1002> helmfile [eqiad] START helmfile.d/services/cxserver: apply [production]
08:47 <kartik@deploy1002> helmfile [codfw] DONE helmfile.d/services/cxserver: apply [production]
08:45 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti1022.eqiad.wmnet with OS buster [production]
08:45 <kartik@deploy1002> helmfile [codfw] START helmfile.d/services/cxserver: apply [production]
08:39 <kartik@deploy1002> helmfile [staging] DONE helmfile.d/services/cxserver: apply [production]
08:39 <kartik@deploy1002> helmfile [staging] START helmfile.d/services/cxserver: apply [production]
08:19 <kevinbazira@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
08:19 <kevinbazira@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' . [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
07:57 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
07:42 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
07:42 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
07:41 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
07:41 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]