2024-10-15
10:52 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance |
[production] |
10:52 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69917 and previous config saved to /var/cache/conftool/dbconfig/20241015-105213-arnaudb.json |
[production] |
10:38 |
<brouberol@cumin1002> |
START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes |
[production] |
10:38 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet |
[production] |
10:37 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P69915 and previous config saved to /var/cache/conftool/dbconfig/20241015-103706-arnaudb.json |
[production] |
10:36 |
<aborrero@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch |
[admin] |
10:36 |
<aborrero@cloudcumin1001> |
START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch |
[admin] |
10:34 |
<brouberol@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet |
[production] |
10:33 |
<aborrero@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch |
[admin] |
10:32 |
<aborrero@cloudcumin1001> |
START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch |
[admin] |
10:30 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet |
[production] |
10:29 |
<aborrero@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch |
[admin] |
10:28 |
<aborrero@cloudcumin1001> |
START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch |
[admin] |
10:26 |
<brouberol@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet |
[production] |
10:25 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet |
[production] |
10:22 |
<aborrero@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch |
[admin] |
10:22 |
<brouberol@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet |
[production] |
10:22 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P69914 and previous config saved to /var/cache/conftool/dbconfig/20241015-102159-arnaudb.json |
[production] |
10:21 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons. |
[production] |
10:17 |
<aborrero@cloudcumin1001> |
START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch |
[admin] |
10:14 |
<brouberol@cumin1002> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons. |
[production] |
10:11 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker |
[production] |
10:06 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69913 and previous config saved to /var/cache/conftool/dbconfig/20241015-100652-arnaudb.json |
[production] |
10:04 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69912 and previous config saved to /var/cache/conftool/dbconfig/20241015-100435-arnaudb.json |
[production] |
10:04 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance |
[production] |
10:04 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance |
[production] |
10:04 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69911 and previous config saved to /var/cache/conftool/dbconfig/20241015-100413-arnaudb.json |
[production] |
09:57 |
<brouberol@cumin1002> |
START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker |
[production] |
09:55 |
<brouberol@cumin1002> |
END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:dse-k8s-worker |
[production] |
09:52 |
<jayme@deploy1003> |
helmfile [codfw] DONE helmfile.d/admin 'apply'. |
[production] |
09:49 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P69910 and previous config saved to /var/cache/conftool/dbconfig/20241015-094906-arnaudb.json |
[production] |
09:33 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P69909 and previous config saved to /var/cache/conftool/dbconfig/20241015-093359-arnaudb.json |
[production] |
09:26 |
<brouberol@cumin1002> |
START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker |
[production] |
09:18 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69908 and previous config saved to /var/cache/conftool/dbconfig/20241015-091852-arnaudb.json |
[production] |
09:16 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69907 and previous config saved to /var/cache/conftool/dbconfig/20241015-091635-arnaudb.json |
[production] |
09:16 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance |
[production] |
09:16 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance |
[production] |
09:16 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance |
[production] |
09:15 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance |
[production] |
09:15 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance |
[production] |
09:15 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance |
[production] |
09:15 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance |
[production] |
09:15 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance |
[production] |
09:15 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367781)', diff saved to https://phabricator.wikimedia.org/P69906 and previous config saved to /var/cache/conftool/dbconfig/20241015-091502-arnaudb.json |
[production] |
09:07 |
<jayme@deploy1003> |
helmfile [codfw] START helmfile.d/admin 'apply'. |
[production] |
08:59 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P69905 and previous config saved to /var/cache/conftool/dbconfig/20241015-085955-arnaudb.json |
[production] |
08:47 |
<oblivian@cumin2002> |
END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: init - oblivian@cumin2002 |
[production] |
08:46 |
<oblivian@cumin2002> |
START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: init - oblivian@cumin2002 |
[production] |
08:45 |
<wmbot~bsadowski1@tools-bastion-13> |
Restarted StewardBot/SULWatcher because of a connection loss |
[tools.stewardbots] |
08:44 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P69903 and previous config saved to /var/cache/conftool/dbconfig/20241015-084448-arnaudb.json |
[production] |