2022-02-16
15:34 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance |
[production] |
15:34 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance |
[production] |
15:34 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20914 and previous config saved to /var/cache/conftool/dbconfig/20220216-153448-kormat.json |
[production] |
15:25 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20913 and previous config saved to /var/cache/conftool/dbconfig/20220216-152529-marostegui.json |
[production] |
15:19 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20912 and previous config saved to /var/cache/conftool/dbconfig/20220216-151944-kormat.json |
[production] |
15:04 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20911 and previous config saved to /var/cache/conftool/dbconfig/20220216-150439-kormat.json |
[production] |
15:04 |
<jelto@deploy1002> |
helmfile [staging] DONE helmfile.d/services/toolhub: apply |
[production] |
15:03 |
<jelto@deploy1002> |
helmfile [staging] START helmfile.d/services/toolhub: apply |
[production] |
15:02 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:01 |
<jelto@deploy1002> |
helmfile [staging] DONE helmfile.d/services/termbox: apply |
[production] |
15:00 |
<jelto@deploy1002> |
helmfile [staging] START helmfile.d/services/termbox: apply |
[production] |
14:58 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
14:49 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20910 and previous config saved to /var/cache/conftool/dbconfig/20220216-144934-kormat.json |
[production] |
14:47 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Depooling db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20909 and previous config saved to /var/cache/conftool/dbconfig/20220216-144726-kormat.json |
[production] |
14:47 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance |
[production] |
14:47 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance |
[production] |
14:44 |
<hnowlan@cumin1001> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001 |
[production] |
14:35 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance |
[production] |
14:35 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance |
[production] |
14:35 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20908 and previous config saved to /var/cache/conftool/dbconfig/20220216-143535-kormat.json |
[production] |
14:21 |
<moritzm> |
migrate instances off ganeti1017 |
[production] |
14:20 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20907 and previous config saved to /var/cache/conftool/dbconfig/20220216-142030-kormat.json |
[production] |
14:17 |
<sukhe> |
disabled puppet on all doh* hosts except doh3001 |
[production] |
14:17 |
<moritzm> |
failover the ganeti master to ganeti1024 T296721 |
[production] |
14:16 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED |
[production] |
14:16 |
<volans@cumin2002> |
START - Cookbook sre.hosts.provision for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED |
[production] |
14:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20906 and previous config saved to /var/cache/conftool/dbconfig/20220216-141546-marostegui.json |
[production] |
14:15 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance |
[production] |
14:15 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance |
[production] |
14:13 |
<mforns@deploy1002> |
Finished deploy [airflow-dags/analytics@8991326]: (no justification provided) (duration: 00m 07s) |
[production] |
14:13 |
<mforns@deploy1002> |
Started deploy [airflow-dags/analytics@8991326]: (no justification provided) |
[production] |
14:05 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20905 and previous config saved to /var/cache/conftool/dbconfig/20220216-140526-kormat.json |
[production] |
13:50 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20903 and previous config saved to /var/cache/conftool/dbconfig/20220216-135021-kormat.json |
[production] |
13:46 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Depooling db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20902 and previous config saved to /var/cache/conftool/dbconfig/20220216-134612-kormat.json |
[production] |
13:46 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance |
[production] |
13:46 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance |
[production] |
13:46 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance |
[production] |
13:46 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance |
[production] |
13:45 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20901 and previous config saved to /var/cache/conftool/dbconfig/20220216-134559-kormat.json |
[production] |
13:30 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20900 and previous config saved to /var/cache/conftool/dbconfig/20220216-133054-kormat.json |
[production] |
13:29 |
<jayme@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
13:29 |
<jayme@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:29 |
<jayme@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
13:28 |
<jayme@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. |
[production] |
13:27 |
<jayme@deploy1002> |
helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
13:27 |
<jayme@deploy1002> |
helmfile [staging-eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:24 |
<jayme@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
13:23 |
<jayme@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'apply'. |
[production] |
13:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1112 (T300775)', diff saved to https://phabricator.wikimedia.org/P20899 and previous config saved to /var/cache/conftool/dbconfig/20220216-132322-marostegui.json |
[production] |
13:23 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |