production SAL

2015-12-22
15:53	<mutante>	kafka1001,1002 - crit - eventlogging not running (?)	[production]
15:52	<mutante>	restbase1003 - disk space, restbase1008 - disk space, restbase1004 - cassandra cql refused	[production]
15:23	<akosiaris>	upgrade cassandra on maps-test2003	[production]
15:06	<jynus>	restarting and reconfiguring mysql at dbstore2001	[production]
15:06	<mutante>	labtestcontrol2001 - puppet had not been running for a while, a bunch of changes have been applied incl. keys and passwords	[production]
15:04	<mutante>	enabling puppet on labtestcontrol2001	[production]
15:04	<akosiaris>	upgraded cassandra on maps-test2004	[production]
11:54	<apergos>	salt packages with wmf packages precise running on ms-{bf}e* in esams; trusty running on analytics103* in eqiad; jessie running on restbase2* in codfw	[production]
11:43	<godog>	restart cassandra bootstrap on restbase1004	[production]
10:09	<jynus>	online resizing /srv/postgres on labsdb1006 +100GB	[production]
10:06	<hashar>	Restarting Jenkins	[production]
09:54	<apergos>	precise and trusty salt packages with wmf patches deployed manually on dataset1001 and analytics1001, seem to work fine	[production]
08:42	<jynus>	restarting and reconfiguring mysql at db2036	[production]
02:30	<l10nupdate@tin>	ResourceLoader cache refresh completed at Tue Dec 22 02:30:28 UTC 2015 (duration 6m 54s)	[production]
02:23	<mwdeploy@tin>	sync-l10n completed (1.27.0-wmf.9) (duration: 09m 47s)	[production]
00:29	<krenair@tin>	Synchronized php-1.27.0-wmf.9/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/260492/ (duration: 00m 32s)	[production]
00:22	<krenair@tin>	Synchronized php-1.27.0-wmf.9/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightDialogTool.js: https://gerrit.wikimedia.org/r/#/c/260429/ (duration: 00m 30s)	[production]
2015-12-21
20:49	<godog>	restbase1004 bootstrap failed, restbase1007-a is down java.lang.RuntimeException: A node required to move the data consistently is down (/10.64.0.230).	[production]
19:27	<legoktm>	running checkLocalUser.php --delete=1 for real this time on terbium	[production]
19:22	<godog>	reimage restbase1004	[production]
19:14	<paravoid>	powercycling mw1011	[production]
19:11	<paravoid>	rolling restart of hhvm on the eqiad jobrunners	[production]
18:47	<jynus>	common-sync: Copying to mw1016.eqiad.wmnet from tin.eqiad.wmnet	[production]
18:35	<ori>	correction: previous log message was for mw1015, not mw1017	[production]
18:27	<ori>	mw1017: enabled jemalloc profiling, restarted hhvm, now running hhvm-collect-heaps	[production]
17:48	<akosiaris>	restarted hhvm on mw1012.eqiad.wmnet	[production]
16:57	<thcipriani>	timeout on sync-file to mw1016.eqiad.wmnet	[production]
16:56	<thcipriani@tin>	Synchronized php-1.27.0-wmf.9/extensions/Popups/Popups.hooks.php: SWAT: Use ExtensionRegistry to determine whether TextExtracts is installed [[gerrit:260346]] (duration: 02m 48s)	[production]
16:34	<jynus>	sync-common to mw1085	[production]
16:26	<jynus>	powercycling mw1085.eqiad.wmnet	[production]
16:22	<thcipriani>	mw1085.eqiad.wmnet times out on SSH connection	[production]
16:19	<godog>	reboot restbase1007, load through the roof	[production]
16:18	<thcipriani@tin>	Synchronized php-1.27.0-wmf.9/extensions/CentralNotice/resources/subscribing/ext.centralNotice.geoIP.js: SWAT: Update CentralNotice [[gerrit:260316]] (duration: 03m 03s)	[production]
16:08	<godog>	depool restbase1007	[production]
16:01	<apergos>	jessie packages for salt with local patches deployed on restbase1001, looks fine but just in case.	[production]
15:44	<godog>	adding new 1TB disk to restbase1007	[production]
14:22	<andrewbogott>	disabling puppet on labnet1002 for dnsmasq tests	[production]
14:07	<MaxSem>	me and yurik are nuking old maps data and reimporting planet	[production]
13:46	<jynus>	extending online s2-master data disk by +100GB	[production]
13:15	<akosiaris>	disabled puppet on maps-test2001 and commented out osmupdater crontab entry until we fix the sync process	[production]
11:02	<jynus>	emergency restart of db1047's mysql	[production]
09:54	<jynus>	reenabling semisync replication on s3	[production]
09:07	<godog>	stop cassandra on restbase1004, decommissioned	[production]
02:29	<l10nupdate@tin>	ResourceLoader cache refresh completed at Mon Dec 21 02:29:51 UTC 2015 (duration 6m 47s)	[production]
02:23	<mwdeploy@tin>	sync-l10n completed (1.27.0-wmf.9) (duration: 09m 45s)	[production]
02:20	<andrewbogott>	disabling puppet on labnet1002 to mess with dnsmasq	[production]
01:44	<andrewbogott>	disabled puppet on holmium and labservices1001 to control roll-out of https://gerrit.wikimedia.org/r/#/c/260037/	[production]
2015-12-20
23:24	<Reedy>	Katie and Jeff paged about bellatrix	[production]
18:46	<andrewbogott>	graceful restart of zuul as per https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Restart	[production]
18:31	<andrewbogott>	restarting stuck Jenkins	[production]