2022-10-27
§
|
16:35 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P36868 and previous config saved to /var/cache/conftool/dbconfig/20221027-163524-ladsgroup.json |
[production] |
16:33 |
<dancy@deploy1002> |
scap failed: CalledProcessError Command 'sudo -u mwbuilder /usr/bin/make -C /srv/mwbuilder/release/make-container-image -f Makefile build-and-push-all-images http_proxy=http://webproxy.eqiad.wmnet:8080 https_proxy=http://webproxy.eqiad.wmnet:8080 GIT_BASE=https://gerrit.wikimedia.org/r/ MW_CONFIG_BRANCH=master workdir_volume=/srv/mediawiki-staging mv_image_name=docker-registry.discovery.wmnet/restric |
[production] |
16:31 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P36867 and previous config saved to /var/cache/conftool/dbconfig/20221027-163109-ladsgroup.json |
[production] |
16:27 |
<dancy@deploy1002> |
Started scap: testing mw-debug |
[production] |
16:24 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet |
[production] |
16:20 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet |
[production] |
16:20 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P36866 and previous config saved to /var/cache/conftool/dbconfig/20221027-162018-ladsgroup.json |
[production] |
16:18 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet |
[production] |
16:16 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P36865 and previous config saved to /var/cache/conftool/dbconfig/20221027-161602-ladsgroup.json |
[production] |
16:15 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage |
[production] |
16:14 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet |
[production] |
16:11 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage |
[production] |
16:05 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T318605)', diff saved to https://phabricator.wikimedia.org/P36864 and previous config saved to /var/cache/conftool/dbconfig/20221027-160511-ladsgroup.json |
[production] |
16:00 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T318950)', diff saved to https://phabricator.wikimedia.org/P36863 and previous config saved to /var/cache/conftool/dbconfig/20221027-160056-ladsgroup.json |
[production] |
15:59 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2167:3311 (T318950)', diff saved to https://phabricator.wikimedia.org/P36862 and previous config saved to /var/cache/conftool/dbconfig/20221027-155946-ladsgroup.json |
[production] |
15:59 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2167.codfw.wmnet with reason: Maintenance |
[production] |
15:59 |
<aikochou@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
15:59 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2167.codfw.wmnet with reason: Maintenance |
[production] |
15:59 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2153 (T318950)', diff saved to https://phabricator.wikimedia.org/P36861 and previous config saved to /var/cache/conftool/dbconfig/20221027-155902-ladsgroup.json |
[production] |
15:55 |
<aborrero@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2003-dev.codfw.wmnet with OS bullseye |
[production] |
15:47 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
15:46 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
15:45 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS buster |
[production] |
15:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P36860 and previous config saved to /var/cache/conftool/dbconfig/20221027-154356-ladsgroup.json |
[production] |
15:42 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4041.ulsfo.wmnet with OS buster |
[production] |
15:31 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2171:3316 (T318605)', diff saved to https://phabricator.wikimedia.org/P36859 and previous config saved to /var/cache/conftool/dbconfig/20221027-153143-ladsgroup.json |
[production] |
15:31 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance |
[production] |
15:31 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance |
[production] |
15:31 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T318605)', diff saved to https://phabricator.wikimedia.org/P36858 and previous config saved to /var/cache/conftool/dbconfig/20221027-153121-ladsgroup.json |
[production] |
15:28 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P36857 and previous config saved to /var/cache/conftool/dbconfig/20221027-152849-ladsgroup.json |
[production] |
15:26 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wcqs2002.codfw.wmnet |
[production] |
15:26 |
<bking@cumin2002> |
START - Cookbook sre.hosts.remove-downtime for wcqs2002.codfw.wmnet |
[production] |
15:26 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on wcqs2003.codfw.wmnet with reason: data reload |
[production] |
15:26 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on wcqs2003.codfw.wmnet with reason: data reload |
[production] |
15:26 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on wcqs2002.codfw.wmnet with reason: data reload |
[production] |
15:25 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on wcqs2002.codfw.wmnet with reason: data reload |
[production] |
15:23 |
<claime> |
Removed silence ProbeDown instance="mwdebug:4444" |
[production] |
15:23 |
<claime> |
k8s-experimental mwdebug service switched to new deployment mw-debug |
[production] |
15:22 |
<claime> |
Unpausing mwdebug k8s deployments |
[production] |
15:19 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage |
[production] |
15:19 |
<cgoubert@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply |
[production] |
15:18 |
<cgoubert@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-debug: apply |
[production] |
15:18 |
<cgoubert@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mw-debug: apply |
[production] |
15:18 |
<cgoubert@deploy1002> |
helmfile [codfw] START helmfile.d/services/mw-debug: apply |
[production] |
15:16 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P36856 and previous config saved to /var/cache/conftool/dbconfig/20221027-151615-ladsgroup.json |
[production] |
15:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2175 (T321123)', diff saved to https://phabricator.wikimedia.org/P36855 and previous config saved to /var/cache/conftool/dbconfig/20221027-151604-marostegui.json |
[production] |
15:15 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage |
[production] |
15:13 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2153 (T318950)', diff saved to https://phabricator.wikimedia.org/P36854 and previous config saved to /var/cache/conftool/dbconfig/20221027-151343-ladsgroup.json |
[production] |
15:12 |
<claime> |
Silence ProbeDown instance="mwdebug:4444" for 1h |
[production] |
15:11 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2153 (T318950)', diff saved to https://phabricator.wikimedia.org/P36853 and previous config saved to /var/cache/conftool/dbconfig/20221027-151133-ladsgroup.json |
[production] |