2021-02-11
16:33 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1368.eqiad.wmnet with reason: REIMAGE [production]
16:32 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1375.eqiad.wmnet with reason: REIMAGE [production]
16:30 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1376.eqiad.wmnet with reason: REIMAGE [production]
16:30 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1375.eqiad.wmnet with reason: REIMAGE [production]
16:28 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1368.eqiad.wmnet with reason: REIMAGE [production]
16:28 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1376.eqiad.wmnet with reason: REIMAGE [production]
16:24 <ejegg> updated payments-wiki from a232fc3438 to 4b7b195c8a [production]
16:13 <kormat@cumin1001> dbctl commit (dc=all): 'Pool db1163 at 1%, again T258361', diff saved to https://phabricator.wikimedia.org/P14323 and previous config saved to /var/cache/conftool/dbconfig/20210211-161308-kormat.json [production]
15:52 <jynus> deploying fixed grants to db1163 [production]
15:50 <gehel> ban elastic2054 from shard allocation - T274555 [production]
15:49 <jynus@cumin1001> dbctl commit (dc=all): 'Depool 1163', diff saved to https://phabricator.wikimedia.org/P14321 and previous config saved to /var/cache/conftool/dbconfig/20210211-154902-jynus.json [production]
15:47 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org [production]
15:46 <gehel> depooling elastic2054 - T274555 [production]
15:45 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org [production]
15:45 <kormat@cumin1001> dbctl commit (dc=all): 'Pool db1163 at 1% T258361', diff saved to https://phabricator.wikimedia.org/P14320 and previous config saved to /var/cache/conftool/dbconfig/20210211-154501-kormat.json [production]
15:39 <gehel> powercycle elastic2054 - T274555 [production]
15:39 <gehel> powercycle elastic2054 [production]
14:44 <kormat@cumin1001> dbctl commit (dc=all): 'Add db1163 to s1 T258361', diff saved to https://phabricator.wikimedia.org/P14318 and previous config saved to /var/cache/conftool/dbconfig/20210211-144445-kormat.json [production]
14:24 <mholloway-shell@deploy1001> Synchronized wmf-config/InitialiseSettings.php: EventStreams: Update sampling config syntax for test.instrumentation.sampled (duration: 01m 08s) [production]
14:11 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2001.wikimedia.org [production]
14:02 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host netmon2001.wikimedia.org [production]
13:53 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1003.eqiad.wmnet [production]
13:48 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe1003.eqiad.wmnet [production]
13:48 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1002.eqiad.wmnet [production]
13:41 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe1002.eqiad.wmnet [production]
13:28 <godog> test grafana 7.4.1 upgrade on grafana2001 - T263747 [production]
13:27 <moritzm> re-adding ganeti5002 to the eqsin Ganeti cluster following mainboard replacement/reinstall T261130 [production]
13:22 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1001.eqiad.wmnet [production]
13:16 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe1001.eqiad.wmnet [production]
13:08 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2003.codfw.wmnet [production]
13:04 <hnowlan@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
13:03 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe2003.codfw.wmnet [production]
13:00 <hnowlan@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
12:57 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
12:53 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2002.codfw.wmnet [production]
12:45 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe2002.codfw.wmnet [production]
12:41 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet [production]
12:40 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2001.codfw.wmnet [production]
12:40 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: d2b1df105afd9f9c9c047ae9c0a434674f43d505: Changing frwiktionary wmgBabelMainCategory (T274137) (duration: 01m 08s) [production]
12:37 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet [production]
12:35 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe2001.codfw.wmnet [production]
12:18 <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:662967|wikidata: post edit constraint jobs on 50% of edits (T204031)]] (up from 40%) (duration: 01m 08s) [production]
12:15 <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:662970|wikidata: add Dagbani to wmgExtraLanguageNames (T272242)]] (duration: 01m 29s) [production]
12:06 <jynus> restart-failed systemd on cumin1001 after s5 eqiad snapshot failed [production]
11:49 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2002.codfw.wmnet [production]
11:45 <mvolz@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
11:41 <jynus@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: REIMAGE [production]
11:40 <mvolz@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
11:39 <jynus@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: REIMAGE [production]
11:39 <jiji@cumin1001> START - Cookbook sre.hosts.reboot-single for host thumbor2002.codfw.wmnet [production]