production SAL

2015-12-22
21:46	<gwicke>	restbase1004: tune2fs -m 0 /dev/mapper/restbase1004--vg-srv	[production]
21:45	<gwicke>	restbase1004: restarted bootstrap	[production]
21:22	<gwicke>	restbase1003: restarting cassandra to clear up disk space from old stream	[production]
21:11	<gwicke>	restbase1008: restarting cassandra to clear up disk space from old stream	[production]
18:36	<robh>	silver returned to normal service, wikitech.w.o certificate renewed.	[production]
18:26	<robh>	silver puppet staying stalled during toollabs issue (we don't want to rehup silver web service)	[production]
18:17	<robh>	puppet disabled on silver, going to update wikitech.wikimedia.org certificate	[production]
18:10	<jynus>	disabling event scheduling on db1046	[production]
18:03	<jynus>	rolling schema change (ALTER TABLE ENGINE=TokuDB) on m4-master (db1046) log (eventlogging)	[production]
16:44	<godog>	bounce cassandra on restbase1004, restart bootstrap	[production]
16:42	<mutante>	powercycling crashed mw1144	[production]
16:41	<jynus>	converting dbstore2001 (delayed slave) into an actual delayed slave, adding redundancy to dbstore1002	[production]
16:40	<godog>	bounce cassandra on restbase1003	[production]
16:15	<akosiaris>	upgrade cassandra on maps-test2001	[production]
16:15	<akosiaris>	upgrade cassandra on maps-test2002	[production]
15:53	<mutante>	kafka1001,1002 - crit - eventlogging not running (?)	[production]
15:52	<mutante>	restbase1003 - disk space, restbase1008 - disk space, restbase1004 - cassandra cql refused	[production]
15:23	<akosiaris>	upgrade cassandra on maps-test2003	[production]
15:06	<jynus>	restarting and reconfiguring mysql at dbstore2001	[production]
15:06	<mutante>	labtestcontrol2001 - puppet had not been running for a while, a bunch of changes have been applied incl. keys and passwords	[production]
15:04	<mutante>	enabling puppet on labtestcontrol2001	[production]
15:04	<akosiaris>	upgraded cassandra on maps-test2004	[production]
11:54	<apergos>	salt packages with wmf packages precise running on ms-{bf}e* in esams; trusty running on analytics103* in eqiad; jessie running on restbase2* in codfw	[production]
11:43	<godog>	restart cassandra bootstrap on restbase1004	[production]
10:09	<jynus>	online resizing /srv/postgres on labsdb1006 +100GB	[production]
10:06	<hashar>	Restarting Jenkins	[production]
09:54	<apergos>	precise and trusty salt packages with wmf patches deployed manually on dataset1001 and analytics1001, seem to work fine	[production]
08:42	<jynus>	restarting and reconfiguring mysql at db2036	[production]
02:30	<l10nupdate@tin>	ResourceLoader cache refresh completed at Tue Dec 22 02:30:28 UTC 2015 (duration 6m 54s)	[production]
02:23	<mwdeploy@tin>	sync-l10n completed (1.27.0-wmf.9) (duration: 09m 47s)	[production]
00:29	<krenair@tin>	Synchronized php-1.27.0-wmf.9/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/260492/ (duration: 00m 32s)	[production]
00:22	<krenair@tin>	Synchronized php-1.27.0-wmf.9/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightDialogTool.js: https://gerrit.wikimedia.org/r/#/c/260429/ (duration: 00m 30s)	[production]
2015-12-21
20:49	<godog>	restbase1004 bootstrap failed, restbase1007-a is down java.lang.RuntimeException: A node required to move the data consistently is down (/10.64.0.230).	[production]
19:27	<legoktm>	running checkLocalUser.php --delete=1 for real this time on terbium	[production]
19:22	<godog>	reimage restbase1004	[production]
19:14	<paravoid>	powercycling mw1011	[production]
19:11	<paravoid>	rolling restart of hhvm on the eqiad jobrunners	[production]
18:47	<jynus>	common-sync: Copying to mw1016.eqiad.wmnet from tin.eqiad.wmnet	[production]
18:35	<ori>	correction: previous log message was for mw1015, not mw1017	[production]
18:27	<ori>	mw1017: enabled jemalloc profiling, restarted hhvm, now running hhvm-collect-heaps	[production]
17:48	<akosiaris>	restarted hhvm on mw1012.eqiad.wmnet	[production]
16:57	<thcipriani>	timeout on sync-file to mw1016.eqiad.wmnet	[production]
16:56	<thcipriani@tin>	Synchronized php-1.27.0-wmf.9/extensions/Popups/Popups.hooks.php: SWAT: Use ExtensionRegistry to determine whether TextExtracts is installed [[gerrit:260346]] (duration: 02m 48s)	[production]
16:34	<jynus>	sync-common to mw1085	[production]
16:26	<jynus>	powercycling mw1085.eqiad.wmnet	[production]
16:22	<thcipriani>	mw1085.eqiad.wmnet times out on SSH connection	[production]
16:19	<godog>	reboot restbase1007, load through the roof	[production]
16:18	<thcipriani@tin>	Synchronized php-1.27.0-wmf.9/extensions/CentralNotice/resources/subscribing/ext.centralNotice.geoIP.js: SWAT: Update CentralNotice [[gerrit:260316]] (duration: 03m 03s)	[production]
16:08	<godog>	depool restbase1007	[production]
16:01	<apergos>	jessie packages for salt with local patches deployed on restbase1001, looks fine but just in case.	[production]