production SAL

2018-10-24
19:38	<twentyafterfour>	The train is now blocked by database lock contention of unknown origin	[production]
19:31	<twentyafterfour>	the errors were all coming from wmf.26 but the error rate skyrocketed after deploying 1.33.0-wmf.1 to group1 so there is some query in the new branch which is holding a lock. T207881	[production]
19:19	<twentyafterfour@deploy1001>	rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.1 refs T206655	[production]
18:16	<XioNoX>	enable BGP sessions to transit/peering on cr2-eqord - T204170	[production]
17:20	<gehel>	repooling all elasticsearch servers in eqiad	[production]
17:12	<cmjohnson1>	rebooting cloudvirt1019	[production]
17:04	<jforrester@deploy1001>	Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s)	[production]
17:03	<jforrester@deploy1001>	scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)	[production]
16:36	<jforrester@deploy1001>	Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s)	[production]
16:31	<addshore@deploy1001>	Synchronized wmf-config/Wikibase.php: [[gerrit:469444]] Wikibase.php, don't load wikidata repo settings on other repos (take 2) (duration: 00m 54s)	[production]
16:04	<XioNoX>	power-off cr1-eqord - T204170	[production]
16:00	<twentyafterfour>	15:59:06 Synchronized php-1.33.0-wmf.1/extensions/EventBus/: revert "Set event datetime with microsecond resolution." on 1.33.0-wmf.1 refs T207817 (duration: 00m 56s)	[production]
15:59	<XioNoX>	disable BGP sessions to transit/peering on cr1-eqord - T204170	[production]
15:54	<twentyafterfour>	deploying https://gerrit.wikimedia.org/r/469451	[production]
14:23	<herron>	scheduled icinga downtime and disabling puppet on logstash hosts. deploying role::kafka::logging to logstash elasticsearch data hosts	[production]
13:35	<XioNoX>	pre-configure switch ports for labvirt1007/8/9/12:eth1 in cloud-virt-instance-trunk range on asw2-b-eqiad	[production]
13:17	<ema>	begin cache hosts rolling reboots for kernel/microcode updates T203011	[production]
12:24	<ema>	cp-ats: upgrade trafficserver to 8.0.0-1wm1 T204232	[production]
12:12	<ema>	cp1072: upgrade trafficserver to 8.0.0-1wm1 T204232	[production]
11:22	<ema>	cp1071: upgrade trafficserver to 8.0.0-1wm1 T204232	[production]
10:56	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Restore db1092 and db1104 original weight (duration: 00m 52s)	[production]
10:35	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 and starting to restore db1104 original weight (duration: 00m 54s)	[production]
10:28	<marostegui>	Compare revision table on dewiki cebwiki shwiki srwiki mgwiktionary enwikivoyage on db1100 and db2075 - T184805	[production]
09:54	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s)	[production]
09:39	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s)	[production]
09:20	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 and db1087 (duration: 01m 05s)	[production]
08:55	<marostegui>	Stop MySQL for upgrade and reboot on db1087	[production]
08:47	<marostegui>	Update MySQL on db1092 for upgrade and reboot	[production]
08:03	<godog>	fix aggregation to 'sum' for MediaWiki.RevisionSlider - T205416	[production]
07:33	<gehel>	powercycling wdqs1010 - T207817	[production]
07:19	<_joe_>	powercycling wdqs1009	[production]
07:04	<elukey>	powercycle wdqs1008	[production]
06:59	<elukey>	powercycle wdqs1007	[production]
06:55	<elukey>	powercycle wdqs1006 (depool first)	[production]
06:46	<elukey>	powercycle wdqs1005	[production]
06:42	<SMalyshev>	repooled wdqs1003	[production]
06:35	<_joe_>	powercycling wdqs[2001-2002,2004-2006].codfw.wmnet, one at a time	[production]
06:33	<elukey>	powercycle wdqs1004	[production]
05:24	<kartik@deploy1001>	Finished deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445) (duration: 04m 06s)	[production]
05:20	<kartik@deploy1001>	Started deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445)	[production]
02:34	<mutante>	powercycled wdqs1009 - by request	[production]
02:24	<onimisionipe@deploy1001>	Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 03s)	[production]
02:24	<onimisionipe@deploy1001>	Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue	[production]
02:12	<onimisionipe@deploy1001>	Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 23s)	[production]
02:12	<onimisionipe@deploy1001>	Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue	[production]
01:56	<tstarling@deploy1001>	Synchronized php-1.33.0-wmf.1/includes/page/WikiPage.php: T207530 (duration: 00m 53s)	[production]
01:46	<tstarling@deploy1001>	Synchronized php-1.32.0-wmf.26/includes/page/WikiPage.php: fix deletion performance regression T207530 (duration: 00m 55s)	[production]
01:37	<bawolff>	deployed T207750	[production]
01:24	<mutante>	wdqs2005 - powercycled, wasn't reachable via SSH and also couldn't login on mgmt, mgmt full of Java exceptions from wdqs-updater	[production]
00:28	<twentyafterfour@deploy1001>	rebuilt and synchronized wikiversions files: group0 wikis to 1.33.0-wmf.1 refs T206655	[production]