2021-01-28
|
16:45 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hadoop.change-distro-from-cdh-clients (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:44 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:44 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh-clients (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:44 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:41 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh-clients (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:24 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:19 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh-clients (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:17 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
16:06 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
15:55 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.change-distro-from-cdh for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 |
[production] |
15:54 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 |
[production] |
15:49 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 |
[production] |
11:30 |
<elukey> |
disable nginx proxy buffering on archiva.wikimedia.org for a perf test - T252767 |
[production] |
07:25 |
<elukey> |
powercycle cp1087 (after depooling it) |
[production] |
07:24 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=cp1087.eqiad.wmnet |
[production] |
2021-01-27
|
19:19 |
<elukey> |
reboot an-launcher1002 for kernel upgrades |
[production] |
16:54 |
<elukey@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' . |
[production] |
16:40 |
<elukey@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' . |
[production] |
15:42 |
<elukey> |
umount /var/hadoop/data/r on an-worker1099 and restart hadoop daemons - T273034 |
[production] |
10:36 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2004.codfw.wmnet |
[production] |
10:23 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host schema2004.codfw.wmnet |
[production] |
10:23 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2003.codfw.wmnet |
[production] |
10:18 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host schema2003.codfw.wmnet |
[production] |
10:17 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1004.eqiad.wmnet |
[production] |
10:15 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host schema1004.eqiad.wmnet |
[production] |
10:14 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1003.eqiad.wmnet |
[production] |
10:12 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host schema1003.eqiad.wmnet |
[production] |
10:05 |
<elukey> |
reboot matomo1002 for kernel upgrades |
[production] |
07:26 |
<elukey> |
powercycle analytics1073 - kernel soft lockup bug registered, OS needs a reboot |
[production] |
2021-01-26
|
11:47 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
11:44 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
09:37 |
<elukey> |
reboot dbstore1005 for kernel upgrades |
[production] |
09:28 |
<elukey> |
reboot dbstore1003 for kernel upgrades |
[production] |
09:14 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
09:14 |
<elukey> |
reboot dbstore1004 for kernel upgrades |
[production] |
09:06 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
08:38 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1119,1131].eqiad.wmnet |
[production] |
08:36 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1119,1131].eqiad.wmnet |
[production] |