2022-02-10
§
|
22:39 |
<mutante> |
etherpad - successfully switched to etherpad1003 (bullseye) and etherpad 1.8.16 - on second attempt after making it listen on IPv6 to work behind envoy (T300568) - https://gerrit.wikimedia.org/r/c/operations/puppet/+/761727/ |
[production] |
22:34 |
<bblack@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
22:31 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance |
[production] |
22:31 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance |
[production] |
22:28 |
<bblack@cumin1001> |
END (ERROR) - Cookbook sre.dns.netbox (exit_code=97) |
[production] |
22:27 |
<bblack@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1013.eqiad.wmnet with OS buster |
[production] |
22:26 |
<bblack@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
22:24 |
<mutante> |
etherpad - one more short downtime for maintenance - downtimed in alertmanager and icinga |
[production] |
22:04 |
<bblack@cumin1001> |
START - Cookbook sre.hosts.reimage for host lvs1013.eqiad.wmnet with OS buster |
[production] |
21:54 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
21:53 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
21:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20589 and previous config saved to /var/cache/conftool/dbconfig/20220210-215354-ladsgroup.json |
[production] |
21:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20588 and previous config saved to /var/cache/conftool/dbconfig/20220210-213849-ladsgroup.json |
[production] |
21:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20587 and previous config saved to /var/cache/conftool/dbconfig/20220210-212344-ladsgroup.json |
[production] |
21:16 |
<bblack> |
cr1-eqiad - manual config, static fallback for high-traffic1 to lvs1017 |
[production] |
21:08 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20586 and previous config saved to /var/cache/conftool/dbconfig/20220210-210839-ladsgroup.json |
[production] |
21:08 |
<bblack> |
lvs1017 - bringing pybal online with real routing, flips high-traffic (text-cluster) traffic from lvs1020 -> lvs1017 |
[production] |
20:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20585 and previous config saved to /var/cache/conftool/dbconfig/20220210-204831-ladsgroup.json |
[production] |
20:48 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
20:48 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
20:48 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance |
[production] |
20:48 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance |
[production] |
20:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20584 and previous config saved to /var/cache/conftool/dbconfig/20220210-204818-ladsgroup.json |
[production] |
20:33 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20583 and previous config saved to /var/cache/conftool/dbconfig/20220210-203313-ladsgroup.json |
[production] |
20:18 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20582 and previous config saved to /var/cache/conftool/dbconfig/20220210-201808-ladsgroup.json |
[production] |
20:17 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
20:15 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
20:15 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
20:14 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
20:08 |
<jhuneidi@deploy1002> |
rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.21 refs T300197 |
[production] |
20:03 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20581 and previous config saved to /var/cache/conftool/dbconfig/20220210-200304-ladsgroup.json |
[production] |
19:45 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20580 and previous config saved to /var/cache/conftool/dbconfig/20220210-194518-ladsgroup.json |
[production] |
19:45 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance |
[production] |
19:45 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance |
[production] |
19:45 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20579 and previous config saved to /var/cache/conftool/dbconfig/20220210-194510-ladsgroup.json |
[production] |
19:30 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20578 and previous config saved to /var/cache/conftool/dbconfig/20220210-193005-ladsgroup.json |
[production] |
19:25 |
<bblack> |
lvs1017 reboot again for clean network config - T301142 |
[production] |
19:23 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
19:19 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
19:19 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
19:18 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
19:15 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20577 and previous config saved to /var/cache/conftool/dbconfig/20220210-191501-ladsgroup.json |
[production] |
19:13 |
<jgiannelos@deploy1002> |
Finished deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns (duration: 01m 29s) |
[production] |
19:13 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
19:13 |
<urbanecm@deploy1002> |
Synchronized wmf-config/flaggedrevs.php: 72f3b31: Migrate $wmfStandardAutoPromote to $wmgStandardAutoPromote (T45956) (duration: 00m 49s) |
[production] |
19:12 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
19:12 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn |
[production] |
19:12 |
<jgiannelos@deploy1002> |
Started deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns |
[production] |
19:11 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
19:11 |
<bblack> |
lvs1017 rebooting for sanity-check after prod config - T301142 |
[production] |