2014-07-16
08:52 <godog> repool ms-fe1003 and depool ms-fe1004 [production]
08:46 <godog> repool ms-fe1002 and depool ms-fe1003 [production]
08:39 <godog> depool ms-fe1002 for swift upgrade [production]
05:54 <springle> resuming page content model schema changes, osc_host.sh processes on terbium ok to kill in emergency [production]
04:23 <springle> restarted gitblit on antimony [production]
03:04 <LocalisationUpdate> ResourceLoader cache refresh completed at Wed Jul 16 03:03:41 UTC 2014 (duration 3m 40s) [production]
02:27 <LocalisationUpdate> completed (1.24wmf13) at 2014-07-16 02:26:12+00:00 [production]
02:15 <LocalisationUpdate> completed (1.24wmf12) at 2014-07-16 02:14:32+00:00 [production]
01:34 <manybubbles> moving shards off of elastic101[789] [production]
2014-07-15
23:20 <maxsem> Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/146615/ (duration: 00m 04s) [production]
23:16 <maxsem> Synchronized php-1.24wmf12/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,146471,n,z (duration: 00m 05s) [production]
23:14 <maxsem> Synchronized php-1.24wmf13/includes/specials/SpecialVersion.php: (no message) (duration: 00m 04s) [production]
23:13 <maxsem> Synchronized php-1.24wmf13/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,146471,n,z (duration: 00m 04s) [production]
22:35 <K4-713> synchronized payments to afa12be34769000bf8 [production]
21:34 <_joe_> disabling puppet on mw1001, tests [production]
21:26 <aude> Synchronized php-1.24wmf13/extensions/Wikidata: Update submodule to fix entity search issue on Wikidata (duration: 00m 21s) [production]
21:15 <ori> to test r146607, locally modified upstart conf for jobrunner on mw1001 to log to /var/log/mediawiki, and restarted service [production]
20:24 <ori> restarted jobrunner on all jobrunners [production]
20:23 <AaronSchulz> Deployed /srv/jobrunner to 31e54c564d369e89613db48977eec0a5891b6498 [production]
20:21 <reedy> Synchronized docroot and w: (no message) (duration: 00m 21s) [production]
20:18 <reedy> rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf13 [production]
20:13 <Krinkle> Reloading Zuul to deploy If2312bcf18bdbe8dee [production]
20:12 <bd808> log volume up after logstash restart [production]
20:10 <bd808> restarted logstash on logstash1001; log volume looked to be down from "normal" [production]
19:55 <Reedy> Applied extensions/UploadWizard/UploadWizard.sql to rowiki (re bug 59242) [production]
18:53 <manybubbles> bouncing elastic1018 to pick up new merge policy. hopefully that'll help with io thrashing [production]
17:58 <ori> _joe_ deployed jobrunner to all job runners [production]
17:40 <manybubbles> my last attempt to lower the concurrent traffic for recovery was a failure - tried again and succeeded. that seems to have fixed the echo service disruption caused by taking elastic1017 out of service [production]
17:37 <ori> updated jobrunner to bef32b9120 [production]
17:29 <manybubbles> elastic1017 went nuts again. just shutting elasticsearch off on it for now [production]
16:25 <_joe_> all mw servers updated [production]
16:10 <_joe_> mw1100 and onwards updated [production]
16:00 <_joe_> mw1060-mw1099 updated [production]
15:57 <manybubbles> restarting Elasticsearch on elastic1017 - it's thrashing the disk again. I'm still not 100% sure why [production]
15:57 <_joe_> mw1020-mw1059 updated [production]
15:53 <_joe_> mw101[0-9] updated [production]
15:47 <_joe_> starting rolling update of all appservers to apache2 2.2.22-1ubuntu1.6, half of them are on 2.2.22-1ubuntu1.5 now [production]
15:42 <manybubbles> setting the filter cache on one node in the cluster set it on all. yay, I guess. Anyway, I'm going to let it soak for a while. [production]
15:32 <manybubbles> setting filter cache size to 20% on elastic1001 to see if it takes/helps us [production]
15:19 <anomie> Synchronized wmf-config/: SWAT: Remove dead ULS variable [[gerrit:145861]] (duration: 00m 10s) [production]
15:18 <anomie> anomie actually committed a live hack someone left on tin (removing db1035) [production]
15:16 <anomie> updated /a/common to {{Gerrit|I7ca6a16d5}}: Switch jawiki back to lsearchd [production]
13:42 <manybubbles> Synchronized wmf-config/InitialiseSettings.php: jawiki back to lsearchd (duration: 00m 05s) [production]
13:38 <manybubbles> elastic1017 had a load average of 60 - was thrashing in io. bounced Elasticsearch. let's see if it recovers on its own [production]
09:09 <_joe_> restarting mailman on sodium, again, for testing [production]
08:50 <godog> restart mailman on sodium after inodes freed [production]
07:27 <_joe_> restarted mailman on sodium [production]
07:22 <_joe_> stopping mailman on sodium for repairing [production]
06:54 <_joe_> killed jenkins stale process on gallium, stuck in a futex while shutting down [production]
04:48 <springle> db1035 crash cycle. down for memtest and stuff [production]