2016-01-27
ยง
|
21:19 <cscott> updated OCG to version 64050af0456a43344b32e3e93561a79207565eaf (should be no-op after yesterday's deploy) [releng]
20:24 <chasemp> master stop, truncate accounting log to accounting.01272016, master start [tools]
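(The rotation above amounts to roughly the following; the gridengine init script name and default cell path are assumptions, not taken from the log:

    service gridengine-master stop
    cd /var/lib/gridengine/default/common   # assumed default cell path
    cp accounting accounting.01272016       # keep the old records aside
    truncate -s 0 accounting                # empty the live log in place
    service gridengine-master start
)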
19:48 <YuviPanda> started nfs-exports daemon on labstore1001, had been dead for a few days [production]
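(A minimal sketch of that recovery, assuming the daemon is managed as an init service named after the entry — the unit name is a guess:

    sudo service nfs-exports start
    sudo service nfs-exports status   # confirm it stays up this time
)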
19:34 <chasemp> started the grid master [tools]
19:31 <mutante> stat1002 - redis.exceptions.ConnectionError: Error connecting to mira.codfw.wmnet:6379. timed out. [production]
19:31 <mutante> stat1002 - running puppet; its last run was reported about 4 hours ago, but the agent had not been disabled [production]
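(The manual run described above is the standard agent invocation; nothing host-specific assumed:

    sudo puppet agent --test     # one-shot run with verbose output
    # only needed if the agent had been disabled (here it had not):
    # sudo puppet agent --enable
)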
19:23 <chasemp> stopped the grid master [tools]
19:14 <dduvall@mira> rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.11 [production]
19:11 <YuviPanda> depooled tools-webgrid-1405 to prep for restart; lots of stuck processes [tools]
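(One way to depool a gridengine exec node so no new jobs land on it; the FQDN pattern is an assumption:

    qmod -d '*@tools-webgrid-1405.tools.eqiad.wmflabs'   # disable all queue instances on the host
    qstat -f | grep tools-webgrid-1405                   # disabled queues show state 'd'
)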
18:49 <jynus@mira> Synchronized wmf-config/db-eqiad.php: Repool pc1006 after cloning (duration: 02m 25s) [production]
18:48 <bd808> HHVM on mw1019 still dying on a regular basis with "Lost parent, LightProcess exiting" [production]
18:29 <valhallasw`cloud> job 2551539 is ifttt, which is also running as 2700629; killing 2551539 [tools]
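(Killing the stale duplicate is standard gridengine; the job id comes from the entry:

    qstat -j 2551539   # double-check which registration this is
    qdel -f 2551539    # force-delete the stale copy; 2700629 keeps running
)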
18:26 <valhallasw`cloud> the messages file repeatedly reports "01/27/2016 18:26:17|worker|tools-grid-master|E|execd@tools-webgrid-generic-1405.tools.eqiad.wmflabs reports running job (2551539.1/master) in queue "webgrid-generic@tools-webgrid-generic-1405.tools.eqiad.wmflabs" that was not supposed to be there - killing". SSH'ing there to investigate [tools]
18:24 <valhallasw`cloud> 'sleep' test job also seems to work without issues [tools]
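(A throwaway scheduler smoke test like that is typically just the following; the flags are standard qsub, the job name is hypothetical:

    qsub -b y -N sleeptest /bin/sleep 60   # -b y: submit the command directly, no wrapper script
    qstat                                  # job should go qw -> r, then drain away
)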
18:23 <valhallasw`cloud> no errors in log file, qstat works [tools]
18:23 <chasemp> SGE master restarted after the jobs db dump and reload [tools]
18:22 <valhallasw`cloud> messages file reports 'Wed Jan 27 18:21:39 UTC 2016 db_load_sge_maint_pre_jobs_dump_01272016' [tools]
18:20 <chasemp> on master: db_load -f /root/sge_maint_pre_jobs_dump_01272016 sge_job [tools]
18:19 <valhallasw`cloud> dumped jobs database to /root/sge_maint_pre_jobs_dump_01272016, 4.6M [tools]
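(Read bottom-up, the two entries above are the classic BerkeleyDB dump-and-reload cycle for the SGE job spool; the spooldb path is an assumption:

    cd /var/lib/gridengine/default/spooldb                        # assumed spool location
    db_dump -f /root/sge_maint_pre_jobs_dump_01272016 sge_job     # 18:19: dump the jobs db (4.6M)
    db_load -f /root/sge_maint_pre_jobs_dump_01272016 sge_job     # 18:20: reload it cleanly
)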
18:17 <valhallasw`cloud> SGE Configuration successfully saved to /root/sge_maint_01272016 directory. [tools]
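(That success message matches the output of gridengine's bundled config backup helper; the install path below is an assumption:

    /usr/share/gridengine/util/upgrade_modules/save_sge_config.sh /root/sge_maint_01272016
)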
18:14 <chasemp> grid master stopped [tools]
18:00 <csteipp> deploy patch for T103239 [production]
17:50 <csteipp> deploy patch for T97157 [production]
17:46 <jynus> migrating ruthenium parsoid-test database to m5-master [production]
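(For a small database, a migration like this is usually a straight dump-and-pipe; the exact flags and the pre-creation step are assumptions:

    mysql -h m5-master.eqiad.wmnet -e 'CREATE DATABASE `parsoid-test`'
    mysqldump --single-transaction parsoid-test \
      | mysql -h m5-master.eqiad.wmnet parsoid-test
)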
17:27 <elukey> rebooting analytics105* hosts to upgrade their kernel [production]
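(The recurring "rebooting ... for kernel upgrade" entries in this log follow the same rolling pattern; a hedged sketch, with the host range and wait loop as assumptions:

    for h in analytics10{50..57}.eqiad.wmnet; do   # host range is a guess at "analytics105*"
      ssh "$h" sudo reboot || true
      sleep 30                                     # give it time to actually go down
      until ssh -o ConnectTimeout=5 "$h" true 2>/dev/null; do
        sleep 15                                   # poll until it is back before the next host
      done
    done
)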
17:16 <elukey> rebooting analytics1035.eqiad.wmnet for kernel upgrade [production]
16:22 <thcipriani@mira> Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/CentralAuthUtils.php: SWAT: Preserve certain keys when updating central session [[gerrit:266672]] (duration: 02m 28s) [production]
16:11 <thcipriani@mira> Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: SWAT: Avoid forceHTTPS cookie flapping if core and CA are setting the same cookie [[gerrit:266671]] (duration: 02m 26s) [production]
16:03 <elukey> rebooting analytics1043 -> analytics1050 for kernel upgrade [production]
15:47 <elukey> rebooting analytics1026, analytics1040 -> analytics1042 for kernel upgrade [production]
14:58 <jynus> cloning parsercache contents from pc1003 to pc1006 [production]
14:45 <elukey> rebooting analytics1036 to analytics1039 for kernel upgrade [production]
14:35 <elukey> analytics1035 hasn't been rebooted because it is a Hadoop Journal Node (it will be rebooted last) [production]
14:04 <elukey> rebooting analytics1032 to analytics1035 for kernel upgrades [production]
14:03 <jynus@mira> Synchronized wmf-config/db-eqiad.php: Depool pc1003 for cloning to pc1006 (duration: 02m 30s) [production]
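(This depool, the 14:58 clone, and the 18:49 repool make up one cycle. The sync step matches the "Synchronized wmf-config/db-eqiad.php" scap output seen in this log; the edit itself is paraphrased:

    # 1. comment pc1003 out of the parsercache section of wmf-config/db-eqiad.php
    # 2. push the change to the cluster:
    sync-file wmf-config/db-eqiad.php 'Depool pc1003 for cloning to pc1006'
)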
13:59 <jynus> about to move the parsercache service to new hardware/OS, mariadb-only [production]
13:32 <elukey> rebooting analytics1030/1031 for kernel upgrade [production]
13:15 <akosiaris> rebooting fermium for kernel upgrades [production]
13:10 <elukey> rebooting analytics1029 for kernel upgrade [production]
12:29 <moritzm> rebooting analytics1028 for kernel update [production]
10:29 <hashar> triggered a bunch of browser tests; deployment-redis01 was dead/faulty [releng]
10:25 <ema> restarting apache2 and hhvm on mw1119 [production]
10:08 <hashar> mass restarting redis-server process on deployment-redis01 (for https://phabricator.wikimedia.org/T124677 ) [releng]
10:07 <hashar> mass restarting redis-server process on deployment-redis01 [releng]
09:00 <hashar> beta: commenting out the "latency-monitor-threshold 100" parameter from every /etc/redis/redis.conf we have ( https://phabricator.wikimedia.org/T124677 ). Puppet will not reapply it unless the distribution is Jessie [releng]
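(Mechanically, that beta-cluster change is just commenting one directive out and bouncing redis; iterating over the hosts is elided:

    sudo sed -i 's/^latency-monitor-threshold/# &/' /etc/redis/redis.conf
    sudo service redis-server restart   # pick up the edited config
)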
03:19 <ebernhardson@mira> Synchronized wmf-config/CirrusSearch-production.php: Correct invalid cirrus shard configuration (duration: 02m 59s) [production]
02:55 <l10nupdate@tin> ResourceLoader cache refresh completed at Wed Jan 27 02:55:21 UTC 2016 (duration 7m 13s) [production]
02:48 <mwdeploy@tin> sync-l10n completed (1.27.0-wmf.11) (duration: 10m 25s) [production]
02:23 <mwdeploy@tin> sync-l10n completed (1.27.0-wmf.10) (duration: 09m 51s) [production]
01:59 <ori@mira> Synchronized docroot and w: Icc4f6134b0: Add a speed experiment which inlines the top stylesheet (duration: 02m 28s) [production]