2022-03-30
|
00:10 |
<catrope@deploy1002> |
scap failed: RuntimeError Scap failed!: 8/9 canaries failed their endpoint checks(https://en.wikipedia.org). WARNING: canaries have not been rolled back. (duration: 00m 28s) |
[production] |
00:10 |
<catrope@deploy1002> |
Scap failed!: 8/9 canaries failed their endpoint checks(https://en.wikipedia.org). WARNING: canaries have not been rolled back. |
[production] |
00:10 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P23652 and previous config saved to /var/cache/conftool/dbconfig/20220330-001010-ladsgroup.json |
[production] |
00:09 |
<catrope@deploy1002> |
Started scap: Update Kashmiri namespace names (T304790) |
[production] |
00:07 |
<catrope@deploy1002> |
scap failed: RuntimeError Scap failed!: 6/9 canaries failed their endpoint checks(https://en.wikipedia.org). WARNING: canaries have not been rolled back. (duration: 04m 32s) |
[production] |
00:07 |
<catrope@deploy1002> |
Scap failed!: 6/9 canaries failed their endpoint checks(https://en.wikipedia.org). WARNING: canaries have not been rolled back. |
[production] |
00:02 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
00:02 |
<catrope@deploy1002> |
Started scap: Update Kashmiri namespace names (T304790) |
[production] |
00:02 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
00:02 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
00:01 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
00:00 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1129 (T298565)', diff saved to https://phabricator.wikimedia.org/P23651 and previous config saved to /var/cache/conftool/dbconfig/20220330-000019-ladsgroup.json |
[production] |
00:00 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance |
[production] |
00:00 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance |
[production] |
00:00 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23650 and previous config saved to /var/cache/conftool/dbconfig/20220330-000011-ladsgroup.json |
[production] |
2022-03-29
|
23:55 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P23649 and previous config saved to /var/cache/conftool/dbconfig/20220329-235505-ladsgroup.json |
[production] |
23:45 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23648 and previous config saved to /var/cache/conftool/dbconfig/20220329-234506-ladsgroup.json |
[production] |
23:41 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
23:40 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
23:40 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
23:40 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298565)', diff saved to https://phabricator.wikimedia.org/P23647 and previous config saved to /var/cache/conftool/dbconfig/20220329-234000-ladsgroup.json |
[production] |
23:39 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
23:30 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23646 and previous config saved to /var/cache/conftool/dbconfig/20220329-233001-ladsgroup.json |
[production] |
23:14 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23645 and previous config saved to /var/cache/conftool/dbconfig/20220329-231456-ladsgroup.json |
[production] |
23:12 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23644 and previous config saved to /var/cache/conftool/dbconfig/20220329-231248-ladsgroup.json |
[production] |
23:12 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance |
[production] |
23:12 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298565)', diff saved to https://phabricator.wikimedia.org/P23643 and previous config saved to /var/cache/conftool/dbconfig/20220329-231205-ladsgroup.json |
[production] |
22:57 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P23642 and previous config saved to /var/cache/conftool/dbconfig/20220329-225700-ladsgroup.json |
[production] |
22:50 |
<mutante> |
cumin1001 - systemctl start httpbb_hourly_appserver fixed Icinga alert after gerrit:774981 T205361 |
[production] |
22:46 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1101:3317 (T298565)', diff saved to https://phabricator.wikimedia.org/P23641 and previous config saved to /var/cache/conftool/dbconfig/20220329-224652-ladsgroup.json |
[production] |
22:46 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance |
[production] |
22:46 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance |
[production] |
22:46 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298565)', diff saved to https://phabricator.wikimedia.org/P23640 and previous config saved to /var/cache/conftool/dbconfig/20220329-224644-ladsgroup.json |
[production] |
22:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P23639 and previous config saved to /var/cache/conftool/dbconfig/20220329-224155-ladsgroup.json |
[production] |
22:39 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on 8 hosts with reason: Maintenance |
[production] |
22:39 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 16:00:00 on 8 hosts with reason: Maintenance |
[production] |
22:39 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance |
[production] |
22:39 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance |
[production] |
22:38 |
<mutante> |
mwdebug2001 - rebooting |
[production] |
22:36 |
<mutante> |
mwdebug2002 - rebooting |
[production] |
22:31 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P23638 and previous config saved to /var/cache/conftool/dbconfig/20220329-223139-ladsgroup.json |
[production] |