production SAL

2015-09-24
23:02	<krenair@tin>	Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/240880/ (duration: 00m 17s)	[production]
19:57	<ori@tin>	Synchronized php-1.26wmf24/extensions/ContentTranslation: d079d5dd71: Updated mediawiki/core Project: mediawiki/extensions/ContentTranslation 8559ee614975f25b71a732ca0fb1bb6d489c9d33 (duration: 00m 18s)	[production]
19:35	<bblack>	depooled cp1046 from confd, committed pybal depool for LVS as well	[production]
19:34	<chasemp>	changing labs route on cr1 and cr2 from 10.68.16.0/22 to 10.68.16.0/21 which matches references, fw setting and manifests/network.pp	[production]
18:54	<catrope@tin>	Synchronized php-1.26wmf24/extensions/Flow/: Debugging for FlowFixLinks.php (duration: 00m 20s)	[production]
18:21	<twentyafterfour@tin>	rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf24	[production]
18:20	<legoktm>	moved oauthadmin group from User:Yuvipanda@metawiki to User:YuviPanda@metawiki	[production]
18:19	<godog>	restart restbase on restbase1005	[production]
18:18	<godog>	restart restbase on restbase1004	[production]
18:16	<godog>	restart restbase on restbase1003	[production]
18:06	<paravoid>	depooling cp1046, stability issues	[production]
18:00	<demon@tin>	Synchronized multiversion/MWRealm.php: (no message) (duration: 00m 17s)	[production]
17:59	<ori>	Merged Apache config change Ia095457fb. It will refresh the Apache service as it rolls out, causing elevated 503s for the next 20 minutes.	[production]
17:53	<godog>	rolling restart restbase in eqiad	[production]
17:35	<chasemp>	powercycling cp1046 at mgmt as I can't ssh in and it seems like it should be up	[production]
17:26	<godog>	bounce restbase on restbase1002, apply new datacenter config	[production]
17:10	<_joe_>	cleaning up /tmp on mw1152	[production]
17:09	<cmjohnson1>	powering down for the last time es1001 - es1010	[production]
16:17	<thcipriani@tin>	Synchronized php-1.26wmf23/extensions/Wikidata: SWAT: Do not filter affected pages by namespace [[gerrit:240727]] (duration: 00m 26s)	[production]
16:01	<robh>	nothing on puppet swat window, easiest swat ever.	[production]
15:46	<thcipriani@tin>	Synchronized php-1.26wmf24/extensions/Wikidata: SWAT: Do not filter affected pages by namespace [[gerrit:240711]] (duration: 00m 26s)	[production]
15:23	<thcipriani@tin>	Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable suggestions in ca, en, es, fr, it, ja, tr, ru, zh [[gerrit:240638]] (duration: 00m 17s)	[production]
14:37	<paravoid>	repooling codfw	[production]
12:54	<bblack>	restarting varnish daemons on second half of maps, parsoid, misc clusters (package upgrade, shm_reclen change)	[production]
12:50	<bblack>	restarting varnishd instances on text, mobile, upload clusters for package upgrade (slow salt, no parallelism, ~5m spacing - FE cache loss, BE cache stays, should take ~9h)	[production]
12:05	<moritzm>	installed rpcbind security updates on eeden, baham, radon, maerlant, rhenium	[production]
11:56	<bblack>	restarting varnish daemons on half of maps, parsoid, misc clusters (package upgrade, shm_reclen change)	[production]
11:36	<bblack>	reinstall lvs300[12] to jessie - T96375	[production]
11:21	<akosiaris>	killed tail -f varnishncsa.log on cp1065 and ran apt-get clean to reclaim some disk space	[production]
11:14	<bblack>	stopping pybal on lvs300[12]; lvs300[34] taking over	[production]
11:07	<bblack>	upgrading varnishes to 3.0.6plus-wm8 (non-restarting, just pkg update on-disk)	[production]
09:40	<jynus>	performing latest (software) steps to decom es1001-es1010 (puppet disabling, etc.)	[production]
08:39	<jynus>	restarted HHVM @ mw1056, 1104, 1122	[production]
05:33	<yuvipanda>	deleted logstash indexes for 08/27 and 28 too	[production]
05:31	<yuvipanda>	deleted indexes for 08/14, 15, 25, 26 on logstash	[production]
03:59	<yuvipanda>	restarting elasticsearch in logstash1001-3	[production]
03:53	<yuvipanda>	restarted es on logstash1004-6	[production]
03:02	<yuvipanda>	jstack dumped logstash output onto /home/yuvipanda/stack on logstash1001 since strace seems useless	[production]
02:51	<yuvipanda>	restarted logstash on logstash1002	[production]
02:41	<yuvipanda>	gmond at 100% again, killing it and stopping puppet again	[production]
02:40	<yuvipanda>	re-enabling and running puppet on hafnium to see what it's bringing up	[production]
02:38	<l10nupdate@tin>	Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 30s)	[production]
02:23	<yuvipanda>	kill gmond on hafnium and disable puppet to prevent it from taking it back up. Was taking 100% CPU	[production]
02:16	<Krinkle>	Kibana/Logstash outage. Zero events received after 2015-09-23T23:59:59.999Z.	[production]
02:14	<Krinkle>	Partial EventLogging outage (client-side events via hafnium abruptly stopped 2015-09-23 11:36 UTC - 15 hours ago)	[production]
01:53	<mutante>	started logstash on logstash1002 again	[production]
01:35	<mutante>	bast1001: unmounting /srv/home_pmtpa (backup on bacula)	[production]
01:34	<mutante>	removing subversion packages from bast1001	[production]
01:15	<ori@tin>	Synchronized php-1.26wmf24/includes: Ifa0d4cfe8e3: Backport I1ff61153d and I8e4c3d5a5 (duration: 00m 23s)	[production]
00:19	<jynus>	restarted replication on db1051	[production]