2021-01-19
16:51 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on mw2314.codfw.wmnet with reason: new install on buster [production]
16:50 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2314.codfw.wmnet with reason: REIMAGE [production]
16:48 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2313.codfw.wmnet with reason: REIMAGE [production]
16:47 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
16:47 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
16:47 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
16:47 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2315.codfw.wmnet with reason: REIMAGE [production]
16:46 <brennen> 1.36.0-wmf.27 was branched at fbb516d8e33924c6cb66c93bb6d42907558c31f3 for T271341 [production]
16:45 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2312.codfw.wmnet with reason: REIMAGE [production]
16:45 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2315.codfw.wmnet with reason: REIMAGE [production]
16:45 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2314.codfw.wmnet with reason: REIMAGE [production]
16:43 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2313.codfw.wmnet with reason: REIMAGE [production]
16:43 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2312.codfw.wmnet with reason: REIMAGE [production]
16:43 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
16:43 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
16:43 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
16:41 <jmm@cumin2001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ms-be1046.eqiad.wmnet [production]
16:39 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
16:39 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
16:39 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
16:36 <marostegui@cumin1001> dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13838 and previous config saved to /var/cache/conftool/dbconfig/20210119-163637-root.json [production]
16:30 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
16:30 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
16:30 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
16:23 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
16:23 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
16:23 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
16:22 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
16:21 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
16:21 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
16:21 <marostegui@cumin1001> dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13837 and previous config saved to /var/cache/conftool/dbconfig/20210119-162134-root.json [production]
16:14 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.change-distro-from-cdh (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 [production]
16:07 <moritzm> powercycling ms-be1046, stuck during boot [production]
16:06 <marostegui@cumin1001> dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13836 and previous config saved to /var/cache/conftool/dbconfig/20210119-160630-root.json [production]
15:58 <elukey@cumin1001> START - Cookbook sre.hadoop.change-distro-from-cdh for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001 [production]
15:58 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 [production]
15:51 <elukey@cumin1001> START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001 [production]
15:51 <marostegui@cumin1001> dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13835 and previous config saved to /var/cache/conftool/dbconfig/20210119-155127-root.json [production]
15:47 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host ms-be1046.eqiad.wmnet [production]
15:46 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1045.eqiad.wmnet [production]
15:45 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]
15:45 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
15:45 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'test' . [production]
15:43 <jmm@cumin2001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host cuminunpriv1001.eqiad.wmnet [production]
15:40 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host ms-be1045.eqiad.wmnet [production]
15:37 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1044.eqiad.wmnet [production]
15:29 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host ms-be1044.eqiad.wmnet [production]
15:28 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1043.eqiad.wmnet [production]
15:26 <jmm@cumin2001> START - Cookbook sre.ganeti.makevm for new host cuminunpriv1001.eqiad.wmnet [production]
15:23 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'canary' . [production]