2024-10-15
11:01 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance [production]
10:57 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2145 (T367781)', diff saved to https://phabricator.wikimedia.org/P69919 and previous config saved to /var/cache/conftool/dbconfig/20241015-105719-arnaudb.json [production]
10:53 <tappof> expand LVs on prometheus instances (k8s-mlserve and k8s-staging) T377196 [production]
10:53 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2145 (T367781)', diff saved to https://phabricator.wikimedia.org/P69918 and previous config saved to /var/cache/conftool/dbconfig/20241015-105301-arnaudb.json [production]
10:52 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance [production]
10:52 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2145.codfw.wmnet with reason: Maintenance [production]
10:52 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance [production]
10:52 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2141.codfw.wmnet with reason: Maintenance [production]
10:52 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69917 and previous config saved to /var/cache/conftool/dbconfig/20241015-105213-arnaudb.json [production]
10:38 <brouberol@cumin1002> START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes [production]
10:38 <brouberol@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet [production]
10:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P69915 and previous config saved to /var/cache/conftool/dbconfig/20241015-103706-arnaudb.json [production]
10:34 <brouberol@cumin1002> START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet [production]
10:30 <brouberol@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet [production]
10:26 <brouberol@cumin1002> START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet [production]
10:25 <brouberol@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet [production]
10:22 <brouberol@cumin1002> START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet [production]
10:22 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P69914 and previous config saved to /var/cache/conftool/dbconfig/20241015-102159-arnaudb.json [production]
10:21 <brouberol@cumin1002> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons. [production]
10:14 <brouberol@cumin1002> START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons. [production]
10:11 <brouberol@cumin1002> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker [production]
10:06 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69913 and previous config saved to /var/cache/conftool/dbconfig/20241015-100652-arnaudb.json [production]
10:04 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2130 (T367781)', diff saved to https://phabricator.wikimedia.org/P69912 and previous config saved to /var/cache/conftool/dbconfig/20241015-100435-arnaudb.json [production]
10:04 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance [production]
10:04 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2130.codfw.wmnet with reason: Maintenance [production]
10:04 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69911 and previous config saved to /var/cache/conftool/dbconfig/20241015-100413-arnaudb.json [production]
09:57 <brouberol@cumin1002> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker [production]
09:55 <brouberol@cumin1002> END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:dse-k8s-worker [production]
09:52 <jayme@deploy1003> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
09:49 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P69910 and previous config saved to /var/cache/conftool/dbconfig/20241015-094906-arnaudb.json [production]
09:33 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P69909 and previous config saved to /var/cache/conftool/dbconfig/20241015-093359-arnaudb.json [production]
09:26 <brouberol@cumin1002> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker [production]
09:18 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69908 and previous config saved to /var/cache/conftool/dbconfig/20241015-091852-arnaudb.json [production]
09:16 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depooling db2116 (T367781)', diff saved to https://phabricator.wikimedia.org/P69907 and previous config saved to /var/cache/conftool/dbconfig/20241015-091635-arnaudb.json [production]
09:16 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
09:16 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
09:16 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance [production]
09:15 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance [production]
09:15 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance [production]
09:15 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance [production]
09:15 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance [production]
09:15 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance [production]
09:15 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367781)', diff saved to https://phabricator.wikimedia.org/P69906 and previous config saved to /var/cache/conftool/dbconfig/20241015-091502-arnaudb.json [production]
09:07 <jayme@deploy1003> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
08:59 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P69905 and previous config saved to /var/cache/conftool/dbconfig/20241015-085955-arnaudb.json [production]
08:47 <oblivian@cumin2002> END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: init - oblivian@cumin2002 [production]
08:46 <oblivian@cumin2002> START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: init - oblivian@cumin2002 [production]
08:44 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P69903 and previous config saved to /var/cache/conftool/dbconfig/20241015-084448-arnaudb.json [production]
08:29 <arnaudb@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367781)', diff saved to https://phabricator.wikimedia.org/P69902 and previous config saved to /var/cache/conftool/dbconfig/20241015-082941-arnaudb.json [production]
08:27 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on pc1013.eqiad.wmnet with reason: maintenance [production]