production SAL

2019-06-20
16:16	<Krinkle>	scb1001 is producing 120,000 errors per minute as of 16:09 UTC (under 500/min before that)	[production]
15:40	<Krinkle>	krinkle@deploy1001: pull down 98399b1032a0 to wmf.10 (test-only change)	[production]
15:05	<jijiki>	Rolling restart php-fpm on jobrunners to pick up new opcache settings - 518023	[production]
15:03	<jijiki>	Repool mw1311	[production]
15:01	<jeh>	T101631 updating replica views on labsdb1009	[production]
14:58	<akosiaris>	make sure all kubernetes hosts (except kubernetes2001 which is used to investigate some outgoing packet discards) are pooled and with the exact same weight	[production]
14:57	<jijiki>	enable puppet on jobrunners	[production]
14:57	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=kubernetes1005.*	[production]
14:57	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=kubernetes1006.*	[production]
14:56	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=kubernetes2006.*	[production]
14:56	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=kubernetes2005.*	[production]
14:54	<jeh>	T101631 updating replica views on labsdb1010	[production]
14:47	<jeh>	T101631 updating replica views on labsdb1011	[production]
14:41	<akosiaris@puppetmaster1001>	conftool action : set/pooled=no; selector: dc=codfw,cluster=kubernetes,service=eventgate-main,name=kubernetes2001.codfw.wmnet	[production]
14:36	<jeh>	T101631 updating replica views on labsdb1012	[production]
14:28	<Amir1>	end of ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=testwikidatawiki --batch-size=100 --sleep=3 (T225052)	[production]
14:22	<ladsgroup@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:517669|Set EntityUsageTable addUsage batch size to 150 (T225500)]] (duration: 00m 56s)	[production]
14:18	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
14:18	<Amir1>	start of ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyTerms.php --wiki=testwikidatawiki --batch-size=100 --sleep=3 (T225052)	[production]
14:16	<akosiaris@puppetmaster1001>	conftool action : set/pooled=no; selector: dc=codfw,cluster=kubernetes,service=eventgate-analytics,name=kubernetes2001.codfw.wmnet	[production]
14:16	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=kubernetes2001.*	[production]
14:14	<ladsgroup@deploy1001>	Synchronized wmf-config/Wikibase.php: SWAT: [[gerrit:518028|Switch property terms migration to WRITE_BOTH on test wikidata (T225051)]] (duration: 00m 56s)	[production]
14:14	<akosiaris@puppetmaster1001>	conftool action : set/pooled=no; selector: dc=codfw,cluster=kubernetes,service=eventgate-main,name=kubernetes2001.codfw.wmnet	[production]
14:14	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
14:13	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes,service=eventgate-analytics,name=kubernetes2001.codfw.wmnet	[production]
14:12	<akosiaris@puppetmaster1001>	conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes,service=eventgate-main,name=kubernetes2001.codfw.wmnet	[production]
14:11	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
14:10	<akosiaris@puppetmaster1001>	conftool action : set/pooled=no; selector: name=kubernetes2001.*	[production]
14:09	<ladsgroup@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:518028|Switch property terms migration to WRITE_BOTH on test wikidata (T225051)]] (duration: 00m 56s)	[production]
14:06	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
14:04	<ema@cumin1001>	END (FAIL) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=99)	[production]
13:58	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
13:56	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
13:50	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
13:38	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
13:35	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
13:31	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
13:28	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
13:23	<marostegui>	Stop replication on labsdb1011 to defragment tables T222978	[production]
13:22	<ema@cumin1001>	END (FAIL) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=99)	[production]
13:21	<jijiki>	depool mw1311	[production]
13:20	<marostegui>	Reload haproxy on dbproxy1010 and dbproxy1011 to depool labsdb1011 - T222978	[production]
13:16	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
13:11	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
13:04	<ema@cumin1001>	START - Cookbook sre.hosts.upgrade-and-reboot	[production]
12:59	<jijiki>	Disable puppet on jobrunners to merge 518023 and 518018	[production]
12:56	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]
12:50	<ema>	powercycle cp2017, stuck rebooting	[production]
12:44	<hashar>	Upgrading packages on contint1001	[production]
12:44	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]