2021-02-19
§
|
18:38 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1367.eqiad.wmnet |
[production] |
18:36 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1341.eqiad.wmnet |
[production] |
18:35 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1367.eqiad.wmnet |
[production] |
18:32 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2272.codfw.wmnet |
[production] |
18:30 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1341.eqiad.wmnet |
[production] |
18:30 |
<mutante> |
mw1367 - powercycled - stuck in reboot |
[production] |
18:29 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2272.codfw.wmnet |
[production] |
18:07 |
<Urbanecm> |
Password reset for User:Kolyma (T274737) |
[production] |
17:36 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1341.eqiad.wmnet with reason: REIMAGE |
[production] |
17:34 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1341.eqiad.wmnet with reason: REIMAGE |
[production] |
17:33 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2272.codfw.wmnet with reason: REIMAGE |
[production] |
17:31 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2272.codfw.wmnet with reason: REIMAGE |
[production] |
17:29 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1367.eqiad.wmnet with reason: REIMAGE |
[production] |
17:27 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1367.eqiad.wmnet with reason: REIMAGE |
[production] |
16:57 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1141.eqiad.wmnet with reason: REIMAGE |
[production] |
16:55 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1140.eqiad.wmnet with reason: REIMAGE |
[production] |
16:55 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1141.eqiad.wmnet with reason: REIMAGE |
[production] |
16:53 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1134.eqiad.wmnet with reason: REIMAGE |
[production] |
16:53 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1140.eqiad.wmnet with reason: REIMAGE |
[production] |
16:51 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1134.eqiad.wmnet with reason: REIMAGE |
[production] |
14:29 |
<mbsantos@deploy1001> |
Finished deploy [tilerator/deploy@937deb5]: (no justification provided) (duration: 00m 15s) |
[production] |
14:28 |
<mbsantos@deploy1001> |
Started deploy [tilerator/deploy@937deb5]: (no justification provided) |
[production] |
14:00 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . |
[production] |
14:00 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'staging' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'production' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'production' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'staging' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . |
[production] |
13:43 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . |
[production] |
13:42 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . |
[production] |
13:42 |
<akosiaris@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . |
[production] |
13:41 |
<godog> |
reset-failed ifup@ens13 on prometheus5001 - T273026 |
[production] |
13:39 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5001.eqsin.wmnet |
[production] |
13:31 |
<gehel@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1010.eqiad.wmnet with reason: REIMAGE |
[production] |
13:29 |
<gehel@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1010.eqiad.wmnet with reason: REIMAGE |
[production] |
13:22 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host prometheus5001.eqsin.wmnet |
[production] |
09:27 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop backup cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 |
[production] |
09:16 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.stop-cluster for Hadoop backup cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 |
[production] |
08:40 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1001.eqiad.wmnet |
[production] |
08:34 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-airflow1001.eqiad.wmnet |
[production] |
08:06 |
<godog> |
swift codfw-prod: more weight to ms-be20[58-61] - T269337 |
[production] |
08:04 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1108.eqiad.wmnet |
[production] |
07:47 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1108.eqiad.wmnet |
[production] |
02:26 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1133.eqiad.wmnet with reason: REIMAGE |
[production] |
02:24 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1133.eqiad.wmnet with reason: REIMAGE |
[production] |
01:22 |
<mutante> |
mwmaint2001 back on buster and back in scap dsh groups (if anything pops up you can revert 665175) |
[production] |
01:19 |
<mutante> |
deleting my huge build from puppet-compiler that failed because it made the compiler instance run out of disk to run on * |
[production] |