2021-02-11
15:49 <jynus@cumin1001> dbctl commit (dc=all): 'Depool 1163', diff saved to https://phabricator.wikimedia.org/P14321 and previous config saved to /var/cache/conftool/dbconfig/20210211-154902-jynus.json [production]
15:47 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org [production]
15:46 <gehel> depooling elastic2054 - T274555 [production]
15:45 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org [production]
15:45 <kormat@cumin1001> dbctl commit (dc=all): 'Pool db1163 at 1% T258361', diff saved to https://phabricator.wikimedia.org/P14320 and previous config saved to /var/cache/conftool/dbconfig/20210211-154501-kormat.json [production]
15:39 <gehel> powercycle elastic2054 - T274555 [production]
15:39 <gehel> powercycle elastic2054 [production]
15:22 <bstorm> deleted bstorm-(haproxy|toolforge|nfs)-test [testlabs]
14:44 <kormat@cumin1001> dbctl commit (dc=all): 'Add db1163 to s1 T258361', diff saved to https://phabricator.wikimedia.org/P14318 and previous config saved to /var/cache/conftool/dbconfig/20210211-144445-kormat.json [production]
14:26 <joal> Restart oozie API job after spark sharelib fix (start: 2021-02-10T18:00) [analytics]
14:24 <mholloway-shell@deploy1001> Synchronized wmf-config/InitialiseSettings.php: EventStreams: Update sampling config syntax for test.instrumentation.sampled (duration: 01m 08s) [production]
14:20 <joal> Rerun failed clickstream instance 2021-01 after sharelib fix [analytics]
14:16 <joal> Restart oozie after having fixed the spark-2.4.4 sharelib [analytics]
14:12 <joal> Fix oozie sharelib for spark-2.4.4 by copying oozie-sharelib-spark-4.3.0.jar onto the spark folder [analytics]
14:11 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2001.wikimedia.org [production]
14:02 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host netmon2001.wikimedia.org [production]
13:53 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1003.eqiad.wmnet [production]
13:48 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe1003.eqiad.wmnet [production]
13:48 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1002.eqiad.wmnet [production]
13:41 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe1002.eqiad.wmnet [production]
13:28 <godog> test grafana 7.4.1 upgrade on grafana2001 - T263747 [production]
13:27 <moritzm> re-adding ganeti5002 to the eqsin Ganeti cluster following mainboard replacement/reinstall T261130 [production]
13:22 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1001.eqiad.wmnet [production]
13:16 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe1001.eqiad.wmnet [production]
13:08 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2003.codfw.wmnet [production]
13:04 <hnowlan@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
13:03 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe2003.codfw.wmnet [production]
13:00 <hnowlan@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
12:57 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' . [production]
12:53 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2002.codfw.wmnet [production]
12:45 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe2002.codfw.wmnet [production]
12:41 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet [production]
12:40 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2001.codfw.wmnet [production]
12:40 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: d2b1df105afd9f9c9c047ae9c0a434674f43d505: Changing frwiktionary wmgBabelMainCategory (T274137) (duration: 01m 08s) [production]
12:37 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet [production]
12:35 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host thanos-fe2001.codfw.wmnet [production]
12:18 <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:662967|wikidata: post edit constraint jobs on 50% of edits (T204031)]] (up from 40%) (duration: 01m 08s) [production]
12:15 <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:662970|wikidata: add Dagbani to wmgExtraLanguageNames (T272242)]] (duration: 01m 29s) [production]
12:06 <jynus> restart-failed systemd on cumin1001 after s5 eqiad snapshot failed [production]
12:01 <arturo> [codfw1dev] drop instance `tools-codfw1dev-bastion-1` in `tools-codfw1dev` (was buster, cannot use it yet) [admin]
11:59 <arturo> [codfw1dev] create instance `tools-codfw1dev-bastion-2` (stretch) in `tools-codfw1dev` to test stuff related to T272397 [admin]
11:49 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2002.codfw.wmnet [production]
11:45 <mvolz@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
11:45 <arturo> [codfw1dev] create instance `tools-codfw1dev-bastion-1` in `tools-codfw1dev` to test stuff related to T272397 [admin]
11:42 <arturo> [codfw1dev] drop `tools` project, create `tools-codfw1dev` [admin]
11:41 <jynus@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: REIMAGE [production]
11:40 <mvolz@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
11:39 <jynus@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: REIMAGE [production]
11:39 <jiji@cumin1001> START - Cookbook sre.hosts.reboot-single for host thumbor2002.codfw.wmnet [production]
11:38 <arturo> [codfw1dev] drop `coudinfra` project (we are using `cloudinfra-codfw1dev` there) [admin]