production SAL

2021-02-10
09:10	<elukey@cumin1001>	START - Cookbook sre.hadoop.change-distro-from-cdh-clients for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
09:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1076 (re)pooling @ 10%: Slowly repooling db1076 after cloning db1162', diff saved to https://phabricator.wikimedia.org/P14288 and previous config saved to /var/cache/conftool/dbconfig/20210210-090057-root.json	[production]
09:00	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1157 (re)pooling @ 60%: Slowly repool db1127', diff saved to https://phabricator.wikimedia.org/P14287 and previous config saved to /var/cache/conftool/dbconfig/20210210-090004-root.json	[production]
08:45	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1157 (re)pooling @ 40%: Slowly repool db1127', diff saved to https://phabricator.wikimedia.org/P14286 and previous config saved to /var/cache/conftool/dbconfig/20210210-084500-root.json	[production]
08:41	<legoktm>	depooling mw1404.eqiad.wmnet for perf benchmarking (T274041)	[production]
08:41	<legoktm@cumin1001>	conftool action : set/pooled=no; selector: name=mw1404.eqiad.wmnet	[production]
08:29	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1157 (re)pooling @ 20%: Slowly repool db1127', diff saved to https://phabricator.wikimedia.org/P14285 and previous config saved to /var/cache/conftool/dbconfig/20210210-082957-root.json	[production]
08:19	<godog>	swift eqiad-prod: decrease weight for SSDs on ms-be[1019-1026] - T272836	[production]
08:14	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1157 (re)pooling @ 10%: Slowly repool db1127', diff saved to https://phabricator.wikimedia.org/P14284 and previous config saved to /var/cache/conftool/dbconfig/20210210-081453-root.json	[production]
08:05	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1127 T266483', diff saved to https://phabricator.wikimedia.org/P14283 and previous config saved to /var/cache/conftool/dbconfig/20210210-080512-marostegui.json	[production]
06:43	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully pool db1170:3312, db1170:3317 T258361', diff saved to https://phabricator.wikimedia.org/P14282 and previous config saved to /var/cache/conftool/dbconfig/20210210-064330-marostegui.json	[production]
06:35	<marostegui@cumin1001>	dbctl commit (dc=all): 'Give more weight to db1170:3312, db1170:3317 T258361', diff saved to https://phabricator.wikimedia.org/P14281 and previous config saved to /var/cache/conftool/dbconfig/20210210-063534-marostegui.json	[production]
06:22	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE	[production]
06:20	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: REIMAGE	[production]
06:19	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1170:3312, db1170:3317 with minimal weight for the first time T258361', diff saved to https://phabricator.wikimedia.org/P14279 and previous config saved to /var/cache/conftool/dbconfig/20210210-061924-marostegui.json	[production]
06:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1170:3312 and db1170:3317 to dbctl, depooled T258361', diff saved to https://phabricator.wikimedia.org/P14278 and previous config saved to /var/cache/conftool/dbconfig/20210210-061638-marostegui.json	[production]
06:11	<jiji@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1020.eqiad.wmnet	[production]
06:04	<jiji@cumin1001>	START - Cookbook sre.hosts.reboot-single for host mc1020.eqiad.wmnet	[production]
05:58	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1076 to clone db1162 T258361', diff saved to https://phabricator.wikimedia.org/P14277 and previous config saved to /var/cache/conftool/dbconfig/20210210-055846-marostegui.json	[production]
03:46	<ryankemper>	`ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph.service`	[production]
01:54	<krinkle@deploy1001>	Finished deploy [integration/docroot@0234db2]: Unbreak doc.wm.o (2) - Ib67da94fb1bdf0 (duration: 00m 06s)	[production]
01:54	<krinkle@deploy1001>	Started deploy [integration/docroot@0234db2]: Unbreak doc.wm.o (2) - Ib67da94fb1bdf0	[production]
01:43	<krinkle@deploy1001>	Finished deploy [integration/docroot@fddc7c9]: Unbreak doc.wm.o - Ibf28e02ec03 (duration: 00m 06s)	[production]
01:43	<krinkle@deploy1001>	Started deploy [integration/docroot@fddc7c9]: Unbreak doc.wm.o - Ibf28e02ec03	[production]
01:06	<milimetric@deploy1001>	Finished deploy [analytics/refinery@b539bf6] (thin): Job fixes after Hadoop upgrade (duration: 00m 06s)	[production]
01:06	<milimetric@deploy1001>	Started deploy [analytics/refinery@b539bf6] (thin): Job fixes after Hadoop upgrade	[production]
01:06	<milimetric@deploy1001>	Finished deploy [analytics/refinery@b539bf6]: Job fixes after Hadoop upgrade (duration: 10m 55s)	[production]
00:58	<mutante>	doc1001 - reloaded apache2	[production]
00:55	<milimetric@deploy1001>	Started deploy [analytics/refinery@b539bf6]: Job fixes after Hadoop upgrade	[production]
00:42	<Amir1>	changing frwiki to wmf.30 in mwdebug1002 to test T264391	[production]
00:33	<ladsgroup@deploy1001>	Synchronized php-1.36.0-wmf.30/extensions/FeaturedFeeds: [[gerrit:662965|Fix issues with recent caching update]] (T264391) (duration: 01m 10s)	[production]
00:22	<twentyafterfour@deploy1001>	Finished scap: testwikis wikis to 1.36.0-wmf.30 (duration: 24m 10s)	[production]
00:01	<twentyafterfour>	train status: wmf.28 and wmf.29 are undeployed. wmf.27 is everywhere with the exception of testwikis which is at wmf.30 refs T271344	[production]
2021-02-09
23:58	<twentyafterfour@deploy1001>	Started scap: testwikis wikis to 1.36.0-wmf.30	[production]
23:56	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet	[production]
23:55	<ryankemper>	Depooled `wdqs1005` - it's catching up on hours of lag	[production]
23:55	<twentyafterfour@deploy1001>	Finished scap: (no justification provided) (duration: 08m 43s)	[production]
23:53	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw2250.codfw.wmnet	[production]
23:50	<mutante>	mw1383,mw1385 - scap pull, php	[production]
23:48	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1296.eqiad.wmnet	[production]
23:47	<twentyafterfour>	running scap sync-world	[production]
23:47	<twentyafterfour@deploy1001>	Started scap: (no justification provided)	[production]
23:46	<twentyafterfour@deploy1001>	rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.27	[production]
23:40	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1296.eqiad.wmnet	[production]
23:33	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1380.eqiad.wmnet	[production]
23:32	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1380.eqiad.wmnet	[production]
23:28	<mutante>	mw1380 - powercycling after it did not come back from normal reboot during reimaging	[production]
23:23	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1372.eqiad.wmnet	[production]
23:18	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1372.eqiad.wmnet	[production]
23:05	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2250.codfw.wmnet with reason: REIMAGE	[production]