production SAL

2018-07-25
06:53	<gehel>	resetting postgres data on maps1004 after failing replication - T200228	[production]
06:38	<marostegui>	Stop replication in sync on db1091 and db1097:3314	[production]
06:38	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 db1091 (duration: 00m 48s)	[production]
06:33	<jynus>	finished es1014 -> es1017 switch T197073	[production]
06:27	<jynus>	enabling semi-sync master on es1017, disabling it as client	[production]
06:21	<jynus>	deploy es3-master dns change	[production]
06:02	<jynus@deploy1001>	Synchronized wmf-config/db-eqiad.php: Switchover es3 master eqiad from es1014 to es1017 (duration: 00m 24s)	[production]
06:01	<jynus>	switchover es3 eqiad master from es1014 to es1017	[production]
04:35	<tstarling@deploy1001>	Synchronized php-1.32.0-wmf.13/includes/api/ApiMain.php: record all API requests in statsd (duration: 00m 49s)	[production]
02:43	<l10nupdate@deploy1001>	ResourceLoader cache refresh completed at Wed Jul 25 02:43:05 UTC 2018 (duration 10m 19s)	[production]
02:32	<l10nupdate@deploy1001>	scap sync-l10n completed (1.32.0-wmf.13) (duration: 13m 28s)	[production]
2018-07-24
21:59	<XioNoX>	re-pooling eqsin	[production]
21:15	<XioNoX>	re1 is master routing engine on cr1-eqsin, triggering a re switch	[production]
21:10	<XioNoX>	starting to see recoveries from cr1-eqsin upgrade	[production]
21:06	<XioNoX>	Install done, cr1-eqsin re-rebooting	[production]
21:00	<XioNoX>	restarting cr1-eqsin for software upgrade	[production]
20:32	<XioNoX>	depooling eqsin for cr1-eqsin software upgrade	[production]
19:09	<gehel>	resetting postgres data on maps1003 after failing replication - T200228	[production]
18:34	<mobrovac@deploy1001>	Finished deploy [eventstreams/deploy@690fdad]: Wait for the client to consume the message being sent before consuming the next one - T199813 (duration: 02m 18s)	[production]
18:32	<mobrovac@deploy1001>	Started deploy [eventstreams/deploy@690fdad]: Wait for the client to consume the message being sent before consuming the next one - T199813	[production]
17:40	<ema>	re-enable puppet on all cache nodes with alternate domains disabled T164609	[production]
17:33	<zfilipin@deploy1001>	rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.13	[production]
17:18	<thcipriani>	train window running long, services deploy delayed	[production]
17:18	<ema>	restart varnish-fe on cp1068 to clear "child restarted" alert T164609	[production]
17:17	<elukey>	restart eventstreams on scb2* nodes (hopefully last time before deploying the fix) to avoid mem leaks issues during the EU night	[production]
17:06	<ladsgroup@deploy1001>	Synchronized php-1.32.0-wmf.14/includes/page/PageArchive.php: [[gerrit:447636|PageArchive: Pass correct overrides to newRevisionFromArchiveRow() (T200072)]] (duration: 01m 01s)	[production]
16:58	<jynus>	finishing test on es3 hosts T199224	[production]
16:42	<ladsgroup@deploy1001>	Synchronized php-1.32.0-wmf.13/includes/page/PageArchive.php: [[gerrit:447636|PageArchive: Pass correct overrides to newRevisionFromArchiveRow() (T200072)]] (duration: 01m 03s)	[production]
16:07	<jynus>	test switchover from es2018 to es2017	[production]
16:02	<dcausse>	T156137: unbanning elastic1031	[production]
15:59	<dcausse>	T156137: restarting elasticsearch on elastic1031 to disable G1GC	[production]
15:55	<jynus>	test switchover from es2017 to es2018	[production]
15:46	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 (duration: 01m 02s)	[production]
15:38	<jynus>	stopping puppet on es2017, es2018; changing mysql configuration for production testing	[production]
15:29	<gehel>	restart postgres on maps1001 - T200228	[production]
14:53	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool db1084 (duration: 01m 02s)	[production]
14:47	<dcausse>	T156137: banning elastic1031 due to high load (same "getEntryAfterMiss" symptoms)	[production]
14:09	<marostegui>	Deploy schema change on db1103:3314 T144010 T51190 T199368	[production]
13:45	<ema>	apply alternate domains patch to text-eqiad T164609	[production]
13:43	<marostegui>	Deploy schema change on db1084 T144010 T51190 T199368	[production]
13:42	<marostegui>	Stop replication in sync db1084 and db1103:3314	[production]
13:40	<marostegui>	Deploy schema change on db1081 T144010 T51190 T199368	[production]
13:38	<marostegui>	Stop replication in sync db1081 and db1103:3314	[production]
13:38	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 01m 59s)	[production]
13:07	<ema>	repool cp1067 with alternate domains support T164609	[production]
12:59	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 (duration: 00m 55s)	[production]
12:11	<gehel>	vacuum full of postgres on maps1001 to try to reclaim space - T200228	[production]
12:07	<ema>	depool cp1067 to test alternate domains patch T164609	[production]
11:58	<zfilipin@deploy1001>	scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_4179557944" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 02m 50s)	[production]
11:55	<zfilipin@deploy1001>	Started scap: testwiki to php-1.32.0-wmf.14 and rebuild l10n cache	[production]