2015-08-31
20:44 <chasemp> ferm for elastic100[4-7] and adjust ferm to include wikitech source [production]
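For context, a ferm change like this amounts to adding the new hosts and the wikitech source to the allowed senders for the Elasticsearch port. A minimal sketch of such a fragment, with placeholder addresses and file path (the real rules are generated by puppet):

    # hypothetical ferm fragment; the real rules are managed by puppet
    cat > /etc/ferm/conf.d/10_elasticsearch <<'EOF'
    # addresses below are placeholders, not the real elastic/wikitech hosts
    @def $ELASTIC_HOSTS = (10.64.0.108/32 10.64.0.109/32);
    @def $WIKITECH      = (10.64.0.10/32);
    chain INPUT {
        # Elasticsearch HTTP API only from cluster peers and wikitech
        proto tcp dport 9200 saddr ($ELASTIC_HOSTS $WIKITECH) ACCEPT;
    }
    EOF
    ferm --noexec /etc/ferm/ferm.conf   # parse-check before applying
    service ferm reload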
20:29 <valhallasw`cloud> the sorted (|sort) list is not so spread out in terms of affected hosts, because a lot of jobs were started on lighttpd-1409 and -1410 around the same time. [tools]
20:25 <valhallasw`cloud> ca 500 jobs @ 5s/job = approx 40 minutes [tools]
20:23 <valhallasw`cloud> doh. accidentally used the wrong file, causing restarts for another few uwsgi hosts. Three more jobs dead *sigh* [tools]
20:21 <valhallasw`cloud> now doing more rescheduling, with 5 sec intervals, on a sorted list to spread load between queues [tools]
20:21 <subbu> deployed parsoid version c3e4df5e [production]
19:36 <valhallasw`cloud> last restarted job is 1423661, rest of them are still in /home/valhallaw/webgrid_jobs [tools]
19:35 <valhallasw`cloud> one per second still seems to make SGE unhappy; there's a whole set of jobs dying, mostly uwsgi? [tools]
19:31 <valhallasw`cloud> https://phabricator.wikimedia.org/T110861 : rescheduling 521 webgrid jobs, at a rate of one per second, while watching the accounting log for issues [tools]
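The rescheduling described above (and refined to 5-second intervals at 20:21) can be driven by a simple throttled loop over the collected job IDs; a sketch, assuming gridengine's qmod and an example job-list file name and accounting path:

    # throttled reschedule of the collected webgrid job IDs (file name is an example)
    while read -r jobid; do
        qmod -rj "$jobid"    # ask gridengine to reschedule the job elsewhere
        sleep 5              # pause between jobs to spread load across queues
    done < webgrid_jobs.txt
    # meanwhile, watch the accounting log for jobs that die instead of rescheduling
    tail -f /var/lib/gridengine/default/common/accounting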
16:22 <godog> depool mw1125 + mw1142 from api, nutcracker client connections exceeded [production]
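As background, nutcracker (twemproxy) exposes its counters as JSON on a stats port (22222 by default); a quick hedged check of the client connection counters, assuming the stats port is reachable from the querying host:

    # dump nutcracker stats and pick out the client connection counters
    nc mw1125.eqiad.wmnet 22222 | python -m json.tool | grep -E 'client_connections|client_err'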
16:06 <thcipriani@tin> Finished scap: SWAT: Ask the user to log in if the session is lost [[gerrit:234228]] (duration: 27m 07s) [production]
15:59 <jynus> restarting hhvm on mw2187 [production]
15:39 <thcipriani@tin> Started scap: SWAT: Ask the user to log in if the session is lost [[gerrit:234228]] [production]
15:33 <mutante> terbium - Could not find dependent Service[nscd] for File[/etc/ldap/ldap.conf] [production]
15:28 <thcipriani@tin> Synchronized closed-labs.dblist: SWAT: Creating closed-labs.dblist and closing es.wikipedia.beta.wmflabs.org [[gerrit:234594]] (duration: 00m 13s) [production]
15:25 <thcipriani@tin> Synchronized wmf-config/CirrusSearch-common.php: SWAT: Remove files from Commons from search results on wikimediafoundation.org [[gerrit:234040]] (duration: 00m 11s) [production]
15:25 <ottomata> starting varnishkafka instances on frontend caches to produce eventlogging client side events to kafka [production]
15:21 <thcipriani@tin> Synchronized php-1.26wmf20/extensions/Wikidata: SWAT: Update Wikidata - Fix formatting of client edit summaries [[gerrit:234991]] (duration: 00m 21s) [production]
15:16 <thcipriani@tin> Synchronized php-1.26wmf20/extensions/UploadWizard/resources/controller/uw.controller.Step.js: SWAT: Keep the uploads sorted in the order they were created in initially [[gerrit:234553]] (duration: 00m 12s) [production]
15:13 <jzerebecki> did https://phabricator.wikimedia.org/T109007#1537572 [releng]
14:43 <ebernhardson> elasticsearch cluster.routing.allocation.disk.watermark.high set to 75% to force elastic1022 to reduce its disk usage [production]
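The watermark change is a transient cluster setting; a sketch of the API call, assuming the cluster is reachable on localhost:9200:

    # transient setting: relocate shards off any node above 75% disk usage
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
    { "transient": { "cluster.routing.allocation.disk.watermark.high": "75%" } }'
    # confirm the setting and watch relocations drain elastic1022
    curl -s 'http://localhost:9200/_cluster/settings?pretty'
    curl -s 'http://localhost:9200/_cat/shards?v' | grep -c RELOCATING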
14:41 <urandom> bouncing Cassandra on restbase1001 to apply temporary GC setting [production]
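"Apply a temporary GC setting and bounce" typically means appending a JVM flag to cassandra-env.sh and restarting the node; the flag below is only an illustrative example, not the setting actually applied:

    # example flag only; the actual temporary GC setting is not in the log entry
    echo 'JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log -XX:+PrintGCDetails"' \
        >> /etc/cassandra/cassandra-env.sh
    service cassandra restart          # "bounce" the node
    nodetool status                    # wait for restbase1001 to return to UN (Up/Normal)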
14:06 <akosiaris> rebooted krypton. was reporting 100% cpu steal time [production]
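Steal time is visible per CPU in the %steal column of mpstat (or %st in top); a quick check looks like:

    # %steal near 100 means the hypervisor is starving this guest of CPU time
    mpstat -P ALL 1 5                  # from the sysstat package; watch %steal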
13:40 <paravoid> running puppet on newly-installed mc2001 [production]
13:40 <paravoid> restarting hhvm on mw1065 [production]
11:10 <moritzm> restart salt-master on palladium [production]
10:45 <paravoid> reenabling asw2-a5-eqiad:xe-0/0/36 (T107635) [production]
10:36 <godog> repool ms-fe1004 [production]
10:32 <godog> repool ms-fe1003 and depool ms-fe1004 for firewall changes [production]
10:19 <godog> update graphite retention policy on files with previous retention and older than 30d T96662 [production]
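Applying a new retention policy to already-written Graphite data means resizing each Whisper file; a hedged sketch using whisper-resize.py, with the path and retention values as placeholders (see T96662 for the actual policy):

    # resize whisper files older than 30 days to the new retention scheme
    # path and retentions below are placeholders, not the real values from T96662
    find /var/lib/carbon/whisper -name '*.wsp' -mtime +30 \
        -exec whisper-resize.py {} 1m:7d 5m:30d 15m:1y --nobackup \;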
10:18 <godog> repool ms-fe1002 and depool ms-fe1003 for firewall changes [production]
10:05 <godog> depool ms-fe1002 to apply firewall changes [production]
09:55 <jynus> cloning es1007 mysql data into es1013 (ETA: 5h30m) [production]
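With es1007 already depooled (see 09:25 below), a clone like this is commonly a cold copy of the data directory streamed over the network; purely as an illustrative sketch, with paths, port, and tooling as assumptions rather than what was actually run:

    # on es1013 (destination): receive and unpack the datadir
    nc -l -p 4444 | tar -C /srv/sqldata -xpf -
    # on es1007 (source): stream the datadir once mysqld is stopped or quiesced
    tar -C /srv/sqldata -cpf - . | nc es1013.eqiad.wmnet 4444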
09:51 <godog> repool ms-fe1001 [production]
09:35 <godog> depool ms-fe1001 in preparation for ferm changes [production]
09:27 <godog> update graphite retention policy on files with previous retention and older than 60d T96662 [production]
09:25 <jynus@tin> Synchronized wmf-config/db-eqiad.php: Depool es1007 for maintenance (duration: 00m 13s) [production]
08:33 <jynus@tin> Synchronized wmf-config/db-eqiad.php: Depool db1028, return ES servers back from maintenance (duration: 00m 12s) [production]
07:31 <valhallasw`cloud> removed paniclog on tools-submit; probably related to the NFS outage yesterday (although I'm not sure why that would give OOMs) [tools]
04:34 <l10nupdate@tin> ResourceLoader cache refresh completed at Mon Aug 31 04:34:14 UTC 2015 (duration 34m 13s) [production]
04:05 <bblack> disabled ipv6 autoconf on neon, flushed old dynamic addr [production]
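Disabling autoconf and dropping the stale SLAAC address can be done roughly as follows (interface name is an assumption; the persistent setting belongs in sysctl.d/puppet):

    # stop SLAAC/autoconf on the primary interface
    sysctl -w net.ipv6.conf.eth0.autoconf=0
    sysctl -w net.ipv6.conf.eth0.accept_ra=0
    # drop the stale autoconfigured address and confirm only static addresses remain
    ip -6 addr flush dev eth0 dynamic scope global
    ip -6 addr show dev eth0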
02:32 <l10nupdate@tin> LocalisationUpdate completed (1.26wmf20) at 2015-08-31 02:32:25+00:00 [production]
02:29 <l10nupdate@tin> Synchronized php-1.26wmf20/cache/l10n: l10nupdate for 1.26wmf20 (duration: 06m 42s) [production]
2015-08-30
20:53 <hashar> beta-scap-eqiad failing due to the mwdeploy user not being able to ssh to other hosts. Attempted to add the ssh key again following https://phabricator.wikimedia.org/T109007#1537572, which fixed it [releng]
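A hedged way to confirm the key works again after re-adding it per T109007 (target hostname is only an example):

    # from the deployment host, verify mwdeploy can ssh non-interactively again
    sudo -u mwdeploy ssh -o BatchMode=yes deployment-mediawiki01 true && echo "mwdeploy ssh OK"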
14:38 <multichill> Made local change to unused_images.py to get it to work, see https://phabricator.wikimedia.org/T110829 [tools.heritage]
13:24 <valhallasw`cloud> force-restarting grrrit-wm [tools.lolrrit-wm]
13:23 <valhallasw`cloud> killed wikibugs-backup and grrrit-wm on tools-webproxy-01 [tools]
13:20 <valhallasw`cloud> disabling 503 error page [tools]
13:01 <YuviPanda> rebooted tools-bastion-01 to see if that remounts NFS [tools]
12:58 <godog> lvchange -ay labstore/others on labstore1002 [production]
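For reference, activating and verifying the logical volume named in that entry can look like this:

    lvchange -ay labstore/others    # activate the 'others' LV in the labstore VG
    lvs labstore                    # Attr column should show 'a' (active)
    ls -l /dev/labstore/others      # device node appears once the LV is active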