2021-05-06
17:27 <bblack@cumin1001> conftool action : set/pooled=no; selector: name=cp203[34].codfw.wmnet [production]
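The conftool entry above is the message logged when pooling state is changed via confctl; a minimal sketch of the kind of command that produces it, assuming standard confctl usage on a cumin host:

```
# Depool cp2033 and cp2034 (selector copied from the log line above;
# confctl logs the "conftool action" SAL message on success).
sudo confctl select 'name=cp203[34].codfw.wmnet' set/pooled=no

# Check the resulting state before moving on.
sudo confctl select 'name=cp203[34].codfw.wmnet' get
```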
17:20 <jgiannelos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
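The helmfile entries in this log follow the usual deployment-charts workflow on the deploy host; a sketch of a 'sync' like the one above, with the chart path assumed from convention rather than taken from the log:

```
# On deploy1002; the helmfile.d path is the conventional
# deployment-charts layout (an assumption, not logged here).
cd /srv/deployment-charts/helmfile.d/services/proton
helmfile -e codfw sync
```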
17:15 <volans> upgrade spicerack on cumin* to 0.0.52 [production]
17:15 <ryankemper> [Elastic] Set `elastic2043` as the only banned node in Cirrussearch Elasticsearch clusters (`production-search-codfw`, `production-search-omega-codfw`, `production-search-psi-codfw`) [production]
17:13 <papaul> powerdown ms-be2057 for relocation [production]
17:13 <jgiannelos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
17:12 <volans> uploaded spicerack_0.0.52 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia [production]
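Imports to apt.wikimedia.org like the one above are typically done with reprepro on the apt host; a hedged sketch, with the .changes filename illustrative only:

```
# Import the spicerack 0.0.52 build into both suites
# (standard reprepro syntax; the filename is an assumption).
sudo -i reprepro include buster-wikimedia spicerack_0.0.52_amd64.changes
sudo -i reprepro include bullseye-wikimedia spicerack_0.0.52_amd64.changes
```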
17:00 <papaul> powerdown elastic2058 for relocation [production]
16:43 <vgutierrez> Enforce Puppet Internal CA validation on trafficserver@ulsfo - T281673 [production]
16:12 <papaul> powerdown mc-gp2002 for relocation [production]
16:09 <ryankemper> [Elastic] Set `elastic2058` as the only banned node in Cirrussearch Elasticsearch clusters (`production-search-codfw`, `production-search-omega-codfw`, `production-search-psi-codfw`) [production]
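Banning a node from the CirrusSearch clusters (entries above) maps onto Elasticsearch's shard-allocation exclusion; the exact tooling used is not shown in the log, but the underlying API call looks like this:

```
# Exclude elastic2058 from shard allocation (stock Elasticsearch
# cluster-settings API; endpoint and port are illustrative).
curl -XPUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' -d '{
    "transient": {
      "cluster.routing.allocation.exclude._name": "elastic2058*"
    }
  }'
```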
15:58 <Amir1> starting upgrade of public mailing lists in group d and e (T280322) [production]
15:50 <ryankemper@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1012.eqiad.wmnet with reason: REIMAGE [production]
15:47 <ryankemper@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1012.eqiad.wmnet with reason: REIMAGE [production]
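The START/END pairs above are emitted automatically by Spicerack cookbooks; an invocation producing this pair might look like the following (cookbook name and reason are from the log; the flag spelling is an assumption):

```
# Downtime wdqs1012 in monitoring for 2 hours ahead of its reimage.
sudo cookbook sre.hosts.downtime --hours 2 -r REIMAGE wdqs1012.eqiad.wmnet
```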
15:42 <papaul> powerdown logstash2027 for relocation [production]
15:41 <mvolz@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' . [production]
15:40 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563 [production]
15:34 <XioNoX> push cloud-gw-transport-eqiad to asw2-b-eqiad and cloudsw [production]
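Switch/router config pushes like the one above are normally driven by Homer; assuming that is what was used here, the command would be along these lines (device glob and message are illustrative):

```
# Push the pending config diff to the eqiad row B switch stack.
homer 'asw2-b-eqiad*' commit 'push cloud-gw-transport-eqiad'
```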
15:33 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563 [production]
15:32 <ryankemper> T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs1012.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `reimage` [production]
15:32 <ryankemper> T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs2003.codfw.wmnet` on `ryankemper@cumin1001` tmux session `reimage` [production]
15:31 <mvolz@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' . [production]
15:29 <cdanis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin1001.eqiad.wmnet with reason: quiz [production]
15:29 <cdanis@cumin1001> START - Cookbook sre.hosts.downtime for 0:05:00 on cumin1001.eqiad.wmnet with reason: quiz [production]
15:26 <ryankemper> T280382 [WDQS] Pooled `wdqs1007` and `wdqs2004` [production]
15:26 <ryankemper> T280382 `wdqs2004.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2 2.6T 998G 1.5T 40% /srv` [production]
15:26 <ryankemper> T280382 `wdqs1007.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2 2.6T 998G 1.5T 40% /srv` [production]
15:20 <mvolz@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
15:16 <mvolz@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' . [production]
15:14 <papaul> powerdown ms-be2053 for relocation [production]
15:10 <moritzm> imported wmfbackups 0.5+deb11u1 for bullseye-wikimedia to apt.wikimedia.org [production]
15:07 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: T270704 [production]
15:06 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: T270704 [production]
15:06 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 105 hosts with reason: T270704 [production]
15:06 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 105 hosts with reason: T270704 [production]
15:06 <mvolz@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' . [production]
15:05 <moritzm> imported wmfmariadbpy 0.6+deb11u1 for bullseye-wikimedia to apt.wikimedia.org [production]
14:55 <papaul> powerdown kafka-main2002 for relocation [production]
14:30 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1113:3315', diff saved to https://phabricator.wikimedia.org/P15833 and previous config saved to /var/cache/conftool/dbconfig/20210506-143002-marostegui.json [production]
14:09 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1113:3315 for schema change', diff saved to https://phabricator.wikimedia.org/P15829 and previous config saved to /var/cache/conftool/dbconfig/20210506-140916-marostegui.json [production]
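The two db1113:3315 entries above are the plain depool/repool cycle around a schema change; a sketch assuming stock dbctl syntax (dbctl commits are what produce the 'diff saved to ...' messages in the log):

```
# Before the schema change:
sudo dbctl instance db1113:3315 depool
sudo dbctl config commit -m 'Depool db1113:3315 for schema change'
# ...apply the schema change, then:
sudo dbctl instance db1113:3315 pool
sudo dbctl config commit -m 'Repool db1113:3315'
```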
13:37 <marostegui@cumin1001> dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 100%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15828 and previous config saved to /var/cache/conftool/dbconfig/20210506-133738-root.json [production]
13:22 <marostegui@cumin1001> dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 75%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15827 and previous config saved to /var/cache/conftool/dbconfig/20210506-132234-root.json [production]
13:21 <XioNoX> push pfw policies - T281942 [production]
13:07 <marostegui@cumin1001> dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 50%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15826 and previous config saved to /var/cache/conftool/dbconfig/20210506-130730-root.json [production]
12:52 <marostegui@cumin1001> dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 25%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15825 and previous config saved to /var/cache/conftool/dbconfig/20210506-125226-root.json [production]
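The db1144:3315 entries above trace a staged repool, stepping traffic up 25% → 50% → 75% → 100% with a commit at each step; one step would look like this, assuming dbctl's percentage flag (the regular ~15-minute spacing of the commits above suggests an automated loop rather than manual steps):

```
# Raise db1144:3315 to 75% of its configured weight and commit.
sudo dbctl instance db1144:3315 pool -p 75
sudo dbctl config commit -m 'db1144:3315 (re)pooling @ 75%: Repool db1144:3315'
```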
11:44 <hnowlan@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts eventlog1002.eqiad.wmnet [production]
11:35 <mlitn@deploy1002> Synchronized wmf-config: Config: [[gerrit:685752|Enable Extension:MediaSearch on betacommons (T265939)]] (duration: 01m 06s) [production]
11:34 <mlitn@deploy1002> sync-file aborted: Config: [[gerrit:685752|Enable Extension:MediaSearch on betacommons (T265939)]] (duration: 00m 56s) [production]
11:34 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: REIMAGE [production]
11:31 <kormat@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: REIMAGE [production]