__all__ SAL

2021-05-06 §
15:58	<Amir1>	starting upgrade of public mailing lists in group d and e (T280322)	[production]
15:50	<ryankemper@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1012.eqiad.wmnet with reason: REIMAGE	[production]
15:47	<ryankemper@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1012.eqiad.wmnet with reason: REIMAGE	[production]
15:42	<papaul>	powerdown logstash2027 for relocation	[production]
15:41	<mvolz@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .	[production]
15:40	<ryankemper@cumin1001>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563	[production]
15:34	<XioNoX>	push cloud-gw-transport-eqiad to asw2-b-eqiad and cloudsw	[production]
15:33	<ryankemper@cumin1001>	START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563	[production]
15:32	<ryankemper>	T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs1012.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `reimage`	[production]
15:32	<ryankemper>	T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs2003.codfw.wmnet` on `ryankemper@cumin1001` tmux session `reimage`	[production]
15:31	<arturo>	about to migrate CloudVPS network to the cloudgw architecture T270704	[admin]
15:31	<mvolz@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .	[production]
15:29	<cdanis@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin1001.eqiad.wmnet with reason: quiz	[production]
15:29	<cdanis@cumin1001>	START - Cookbook sre.hosts.downtime for 0:05:00 on cumin1001.eqiad.wmnet with reason: quiz	[production]
15:26	<ryankemper>	T280382 [WDQS] Pooled `wdqs1007` and `wdqs2004`	[production]
15:26	<ryankemper>	T280382 `wdqs2004.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2 2.6T 998G 1.5T 40% /srv`	[production]
15:26	<ryankemper>	T280382 `wdqs1007.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2 2.6T 998G 1.5T 40% /srv`	[production]
15:20	<mvolz@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .	[production]
15:16	<mvolz@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .	[production]
15:14	<papaul>	powerdown ms-be2053 for relocation	[production]
15:10	<moritzm>	imported wmfbackups 0.5+deb11u1 for bullseye-wikimedia to apt.wikimedia.org	[production]
15:07	<aborrero@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: T270704	[production]
15:06	<aborrero@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: T270704	[production]
15:06	<aborrero@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 105 hosts with reason: T270704	[production]
15:06	<aborrero@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 105 hosts with reason: T270704	[production]
15:06	<mvolz@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .	[production]
15:05	<moritzm>	imported wmfmariadbpy 0.6+deb11u1 for bullseye-wikimedia to apt.wikimedia.org	[production]
14:55	<papaul>	powerdown kafka-main2002 for relocation	[production]
14:43	<Majavah>	clear error states from all currently erroring exec nodes	[tools]
14:37	<Majavah>	clear error state from tools-sgeexec-0913	[tools]
14:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db1113:3315', diff saved to https://phabricator.wikimedia.org/P15833 and previous config saved to /var/cache/conftool/dbconfig/20210506-143002-marostegui.json	[production]
14:09	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1113:3315 for schema change', diff saved to https://phabricator.wikimedia.org/P15829 and previous config saved to /var/cache/conftool/dbconfig/20210506-140916-marostegui.json	[production]
13:37	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 100%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15828 and previous config saved to /var/cache/conftool/dbconfig/20210506-133738-root.json	[production]
13:29	<elukey>	roll restart of hadoop yarn nodemanagers to pick up TasksMax=26214	[analytics]
13:22	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 75%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15827 and previous config saved to /var/cache/conftool/dbconfig/20210506-132234-root.json	[production]
13:21	<XioNoX>	push pfw policies - T281942	[production]
13:07	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 50%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15826 and previous config saved to /var/cache/conftool/dbconfig/20210506-130730-root.json	[production]
12:52	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 25%: Repool db1144:3315', diff saved to https://phabricator.wikimedia.org/P15825 and previous config saved to /var/cache/conftool/dbconfig/20210506-125226-root.json	[production]
12:39	<elukey>	restart Yarn RMs to apply the dominant resource calculator setting - T281792	[analytics]
12:15	<hnowlan>	changed eventlogging CNAME to point to eventlog1003	[analytics]
11:44	<hnowlan@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts eventlog1002.eqiad.wmnet	[production]
11:35	<mlitn@deploy1002>	Synchronized wmf-config: Config: [[gerrit:685752|Enable Extension:MediaSearch on betacommons (T265939)]] (duration: 01m 06s)	[production]
11:34	<mlitn@deploy1002>	sync-file aborted: Config: [[gerrit:685752|Enable Extension:MediaSearch on betacommons (T265939)]] (duration: 00m 56s)	[production]
11:34	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: REIMAGE	[production]
11:31	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: REIMAGE	[production]
11:30	<hnowlan@cumin1001>	START - Cookbook sre.hosts.decommission for hosts eventlog1002.eqiad.wmnet	[production]
11:28	<hnowlan@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts eventlog1002.eqiad.wmnet	[production]
11:27	<hnowlan@cumin1001>	START - Cookbook sre.hosts.decommission for hosts eventlog1002.eqiad.wmnet	[production]
11:23	<wmde-fisch@deploy1002>	Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:685554|Enable ReferencePreviews as full default on pilot wikis (T271206)]] (duration: 01m 06s)	[production]
11:22	<wmde-fisch@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:685554|Enable ReferencePreviews as full default on pilot wikis (T271206)]] (duration: 01m 06s)	[production]