2021-10-13
§
|
16:57 |
<mutante> |
stat1008 - short on disk space, mostly used in /tmp, high CPU usage by R process, sent a message about it to all shell users via wall |
[production] |
16:50 |
<mutante> |
stat1008 - apt-get clean - freed 1.3 GB disk space - was alerting in Icinga because / was 97% full |
[production] |
16:37 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
16:37 |
<volans@cumin2002> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
16:23 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
16:23 |
<volans@cumin2002> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:29 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:28 |
<volans@cumin2002> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:26 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:26 |
<volans@cumin2002> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:16 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet |
[production] |
15:13 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:13 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:12 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:12 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:09 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet |
[production] |
15:04 |
<jgiannelos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . |
[production] |
15:03 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:03 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:01 |
<jgiannelos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . |
[production] |
15:01 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
15:01 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:59 |
<jgiannelos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . |
[production] |
14:59 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:59 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:57 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:56 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:56 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:56 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:54 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:54 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:52 |
<ema> |
repool cp4021, further testing can be performed on sretest1001 T201317 |
[production] |
14:51 |
<volans> |
restarting ircecho.service on alert1001 to get back icinga-wm without the underscore |
[production] |
14:50 |
<elukey> |
restart pybal on lvs1015 (low-traffic primary) to pick up new config for inference.discovery.wmnet - T289835 |
[production] |
14:48 |
<moritzm> |
reverted to clean package state on deneb |
[production] |
14:44 |
<elukey@puppetmaster1001> |
conftool action : ge; selector: cluster=ml_serve,service=inference |
[production] |
14:36 |
<elukey> |
restart pybal on lvs1016 (low-traffic secondary) to pick up new config for inference.discovery.wmnet - T289835 |
[production] |
14:27 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:27 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:25 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:25 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:21 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:21 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:20 |
<moritzm> |
temporarily downgrade sphinx packages on deneb to 1.7.9-1~bpo9+1 to build a Ganeti 2.16 stretch backport with delicate toolchain needs |
[production] |
14:13 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:13 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:10 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:10 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:10 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:10 |
<jbond@cumin1001> |
START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet |
[production] |