production SAL

2015-12-04
15:33	<godog>	ms-be2019 rebooted by itself, ilo event log shows "Uncorrectable Machine Check Exception (Board 0, Processor 2, APIC ID 0x00000038, Bank 0x00000003, Status 0xFE000040'00020135, Address 0x00000000'FEB82F63, Misc 0x00000000'00002285)"	[production]
08:52	<godog>	reimage restbase1009	[production]
05:59	<gwicke>	ran systemctl mask cassandra on restbase1009; it is important that this node does not start up.	[production]
05:53	<gwicke>	moved /var/lib/cassandra out of the way in an attempt to stop puppet restarting cassandra on decommissioned restbase1009	[production]
05:49	<l10nupdate@tin>	ResourceLoader cache refresh completed at Fri Dec 4 05:49:46 UTC 2015 (duration 3h 21m 36s)	[production]
02:28	<mwdeploy@tin>	sync-l10n completed (1.27.0-wmf.7) (duration: 10m 19s)	[production]
02:15	<ori>	CirrusSearch-common.php sync was for I826d000ca: Turn off backoff throttling of CirrusSearch jobs	[production]
02:15	<ori@tin>	Synchronized wmf-config/CirrusSearch-common.php: (no message) (duration: 00m 29s)	[production]
01:33	<bd808>	Updated scholarships.wikimedia.org to af73bf6	[production]
00:35	<catrope@tin>	Synchronized php-1.27.0-wmf.7/extensions/CentralNotice: SWAT (duration: 00m 32s)	[production]
2015-12-03
23:09	<bblack>	restarting pybal (w/ BGP enabled) on lvs100[123] (newly-installed w/ jessie)	[production]
22:59	<ori@tin>	Synchronized php-1.27.0-wmf.7/includes/jobqueue/JobRunner.php: temporarily disable job throttling (duration: 00m 29s)	[production]
22:08	<bd808>	Removed zirconium.wikimedia.org from Trebuchet minions list for scholarships/scholarships	[production]
22:04	<bd808>	Updated scholarships.wikimedia.org to cb94319 plus local i18n filtering	[production]
21:48	<Reedy>	finished removing bogus msg_resource rows	[production]
21:28	<oblivian@tin>	Synchronized wmf-config/CommonSettings.php: re-sync (re-merged the change) (duration: 00m 29s)	[production]
21:27	<bd808>	Applied database migrations and purged last year's data from Wikimania Scholarships db	[production]
21:21	<ottomata>	restarted eventlogging with 4 mysql consumer processes running in parallel	[production]
21:21	<bblack>	rebooting lvs100[123] for reinstall to jessie	[production]
21:18	<Reedy>	Cleaning up msg_resource rows with bogus language codes	[production]
21:15	<gwicke>	stopped cassandra on 1009 as it's decommissioned & will be reimaged	[production]
21:13	<oblivian@tin>	Synchronized wmf-config/CommonSettings.php: Re-fix the jobqueue on wikitech after redis cleanup (duration: 00m 26s)	[production]
20:55	<oblivian@tin>	Synchronized wmf-config/CommonSettings.php: Fix the jobqueue on wikitech (duration: 00m 47s)	[production]
20:45	<_joe_>	opening connection from mw1001 to silver, mysql	[production]
20:29	<ori>	on palladium: salt -G 'cluster:jobrunner' cmd.run 'service jobrunner status | grep running && service jobrunner restart' ; salt -G 'cluster:jobrunner' cmd.run 'service jobchron status | grep running && service jobchron restart'	[production]
20:28	<ori>	ran srem jobqueue:aggregator:s-wikis:v2 labswiki on rdb1001 aggr	[production]
19:41	<bblack>	disabling pybal on lvs100[123] over the next few minutes (for reinstall to jessie later after confirmation everything is still ok on [456])	[production]
19:10	<jynus>	restarting eventlogging_sync on db1047 and dbstore1002	[production]
19:04	<jynus>	starting m4 slave again on dbstore2002	[production]
18:45	<andrewbogott>	disabling puppet on labcontrol1002 to test openldap with pdns	[production]
18:33	<mutante>	neon - remove icinga user from "dialout" group	[production]
18:27	<jynus>	disabling eventlogging_sync process on dbstore1002 and db1047 and replication on the other m4 slaves	[production]
18:18	<jynus>	disabling event scheduler on db1046 (m4-master)	[production]
17:03	<kartik@tin>	Finished scap: Update ContentTranslation (duration: 05m 52s)	[production]
16:57	<kartik@tin>	Started scap: Update ContentTranslation	[production]
16:50	<oblivian@tin>	Synchronized wmf-config/CommonSettings.php: Fix the jobqueue on wikitech (duration: 00m 28s)	[production]
15:23	<andrewbogott>	stopping pdns on labcontrol2001	[production]
15:11	<moritzm>	restarting cassandra on restbase100[56] (subsequently) to effect openjdk security update	[production]
14:57	<mobrovac>	restbase end of deployment of 262da91a	[production]
14:48	<mobrovac>	restbase start deployment of 262da91a	[production]
14:06	<moritzm>	installed dpkg updates across the cluster	[production]
11:35	<moritzm>	restarting cassandra on aqs cluster (subsequently) to effect openjdk security update	[production]
10:51	<jynus>	restarting, upgrading and general maintenance for es1013 (depooled)	[production]
10:36	<_joe_>	imported dh-python into precise/universe from the ubuntu cloud archive	[production]
10:26	<jynus@tin>	Synchronized wmf-config/db-eqiad.php: Depool es1013 for maintenance (duration: 00m 30s)	[production]
05:50	<l10nupdate@tin>	ResourceLoader cache refresh completed at Thu Dec 3 05:50:08 UTC 2015 (duration 50m 7s)	[production]
02:25	<mwdeploy@tin>	sync-l10n completed (1.27.0-wmf.7) (duration: 09m 54s)	[production]
2015-12-02
22:09	<jynus>	unscheduled restart of dbstore1002 (analytics-slave)	[production]
21:44	<jynus>	disabling all alert notifications for dbstore1002	[production]
21:30	<bblack>	rebooting lvs1007 for interface config test (not active, no BGP)	[production]