2021-02-19
18:32 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2272.codfw.wmnet [production]
18:30 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1341.eqiad.wmnet [production]
18:30 <mutante> mw1367 - powercycled - stuck in reboot [production]
18:29 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2272.codfw.wmnet [production]
18:07 <Urbanecm> Password reset for User:Kolyma (T274737) [production]
17:36 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1341.eqiad.wmnet with reason: REIMAGE [production]
17:34 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1341.eqiad.wmnet with reason: REIMAGE [production]
17:33 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2272.codfw.wmnet with reason: REIMAGE [production]
17:31 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2272.codfw.wmnet with reason: REIMAGE [production]
17:29 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1367.eqiad.wmnet with reason: REIMAGE [production]
17:27 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1367.eqiad.wmnet with reason: REIMAGE [production]
16:57 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1141.eqiad.wmnet with reason: REIMAGE [production]
16:55 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1140.eqiad.wmnet with reason: REIMAGE [production]
16:55 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1141.eqiad.wmnet with reason: REIMAGE [production]
16:53 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1134.eqiad.wmnet with reason: REIMAGE [production]
16:53 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1140.eqiad.wmnet with reason: REIMAGE [production]
16:51 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1134.eqiad.wmnet with reason: REIMAGE [production]
14:29 <mbsantos@deploy1001> Finished deploy [tilerator/deploy@937deb5]: (no justification provided) (duration: 00m 15s) [production]
14:28 <mbsantos@deploy1001> Started deploy [tilerator/deploy@937deb5]: (no justification provided) [production]
14:00 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
14:00 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:43 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:43 <akosiaris@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
13:43 <akosiaris@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
13:43 <akosiaris@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:43 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
13:43 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:43 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:43 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
13:42 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'staging' . [production]
13:42 <akosiaris@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'echostore' for release 'production' . [production]
13:41 <godog> reset-failed ifup@ens13 on prometheus5001 - T273026 [production]
13:39 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5001.eqsin.wmnet [production]
13:31 <gehel@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1010.eqiad.wmnet with reason: REIMAGE [production]
13:29 <gehel@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1010.eqiad.wmnet with reason: REIMAGE [production]
13:22 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host prometheus5001.eqsin.wmnet [production]
09:27 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop backup cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 [production]
09:16 <elukey@cumin1001> START - Cookbook sre.hadoop.stop-cluster for Hadoop backup cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 [production]
08:40 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1001.eqiad.wmnet [production]
08:34 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host an-airflow1001.eqiad.wmnet [production]
08:06 <godog> swift codfw-prod: more weight to ms-be20[58-61] - T269337 [production]
08:04 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1108.eqiad.wmnet [production]
07:47 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host an-worker1108.eqiad.wmnet [production]
02:26 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1133.eqiad.wmnet with reason: REIMAGE [production]
02:24 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1133.eqiad.wmnet with reason: REIMAGE [production]
01:22 <mutante> mwmaint2001 back on buster and back in scap dsh groups (if anything pops up you can revert 665175) [production]
01:19 <mutante> deleting my huge build from puppet-compiler that failed because it made the compiler instance run out of disk to run on * [production]
01:03 <urbanecm@deploy1001> Synchronized php-1.36.0-wmf.30/includes/ProtectionForm.php: d305308a5d46a3f86bf0b211e8a733c0a951ddc1: field descriptors in HTMLForm must have keys (T275018; T274980) (duration: 01m 08s) [production]
01:02 <urbanecm@deploy1001> Synchronized php-1.36.0-wmf.31/includes/ProtectionForm.php: 2487c253b090d93daf85adae8ceb9d255cbf4ff2: field descriptors in HTMLForm must have keys (T275018; T274980) (duration: 01m 10s) [production]
00:54 <mutante> mwmaint2001 - back from reimage - scap pull [production]