production SAL

2018-01-10
09:12	<moritzm>	rebooting radium (tor relay) for kernel security update	[production]
08:42	<marostegui>	Stop replication in sync on db1089 and db1067 - T162807	[production]
08:41	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 05s)	[production]
08:38	<marostegui>	Deploy schema change on s5 dbstore1001 - T174569	[production]
08:33	<moritzm>	rebooting mw1299-mw1306 (job runners) for kernel security update (along with update to HHVM 3.18.6)	[production]
08:28	<hashar>	contint1001: upgraded Zuul 2.5.0-8-gcbc7f62-wmf4jessie1 .. 2.5.0-8-gcbc7f62-wmf6 | T158243	[production]
08:13	<marostegui>	Deploy schema change on s5 dbstore1002 - T174569	[production]
07:44	<moritzm>	rebooting mw1262-mw1275 for kernel security update (along with update to HHVM 3.18.6)	[production]
07:37	<marostegui>	Drop external_user from wikidatawiki - T184247	[production]
06:17	<marostegui>	Deploy schema change on s5 codfw master (db2052) with replication (this will generate lag on codfw) - T174569	[production]
02:24	<l10nupdate@tin>	scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 02s)	[production]
01:39	<mutante>	mw1226 - high load - hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1739PST.log ; restart-hhvm	[production]
00:43	<mutante>	rebooting gerrit server for kernel upgrade	[production]
00:18	<mutante>	rebooting phabricator server for kernel upgrade	[production]
2018-01-09
22:52	<godog>	ms-be1033 truncate unrotated and big server.log	[production]
22:22	<aaron@tin>	Synchronized php-1.31.0-wmf.16/includes/Setup.php: 68b4bbfbc12c626 (duration: 01m 15s)	[production]
22:20	<mutante>	netmon2001 - arming keyholder for rancid	[production]
21:10	<mepps>	updated SmashPig from 45aa62650c to 778e8f87b4	[production]
20:57	<twentyafterfour@tin>	Finished scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2) (duration: 36m 34s)	[production]
20:21	<twentyafterfour@tin>	Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2)	[production]
20:14	<twentyafterfour@tin>	scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="test2wiki" --outdir="/tmp/scap_l10n_3984299293" --threads=10 --lang en --quiet' returned non-zero exit status 1 (duration: 02m 44s)	[production]
20:13	<mutante>	netmon2001 - rebooting	[production]
20:12	<twentyafterfour@tin>	Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749	[production]
20:04	<mutante>	gerrit2001 - rebooting	[production]
20:00	<mutante>	phab2001 - reboot for upgrade	[production]
19:20	<mepps>	rolledback SmashPig from 0c45b1a684 to 45aa62650c	[production]
19:07	<mepps>	updated SmashPig from 45aa62650c to 0c45b1a684	[production]
18:42	<mutante>	ms-fe3002,ms-fe3001 - powering down, removing from puppet and icinga, ms-be* removing from puppet/icinga (T169518)	[production]
18:38	<mutante>	ms-fe3001 - shutting down for decom, removed from puppet	[production]
18:38	<mutante>	mw1227 still not showing recovery, using restart-hhvm	[production]
18:29	<mutante>	mw1227 killed it one more time and also restarted apache.. now load going down	[production]
18:26	<mutante>	mw1227 hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1024PST.log ; then killed hhvm and restarted it with systemctl	[production]
17:56	<twentyafterfour>	MediaWiki Train: Branching 1.31.0-wmf.16	[production]
17:41	<moritzm>	rebooting image scalers in codfw for kernel security update (along with HHVM update)	[production]
17:30	<volans>	re-enabled Icinga event handlers on RAID checks for lvs3001	[production]
17:17	<ema>	failover traffic back to lvs3001, raid rebuilt	[production]
17:15	<godog>	depool restbase cassandra 2 nodes - T184100	[production]
16:35	<cmjohnson1>	disabling puppet for decom on mw1180-1200	[production]
16:28	<volans>	disabled Icinga event handlers on RAID checks for lvs3001, WIP on the host	[production]
16:18	<gehel>	starting cluster reboot for elasticsearch / cirrus codfw	[production]
16:09	<bd808>	data-services: added s8.{analytics,web}.db.svc.eqiad.wmflabs and aliases (T181643, T184179)	[production]
16:09	<elukey>	re-started mysql on dbstore1002 (and slave replication) after hw maintenance	[production]
15:44	<godog>	roll-restart swift frontends in codfw and eqiad	[production]
15:40	<akosiaris@tin>	Finished deploy [servermon/servermon@10e165e]: Testing scap check (duration: 00m 02s)	[production]
15:40	<akosiaris@tin>	Started deploy [servermon/servermon@10e165e]: Testing scap check	[production]
15:31	<gehel>	reboot maps-test* for kernel upgrade	[production]
15:30	<elukey>	stop mysql on dbstore1002 as prep step for shutdown (stop all slaves, mysql stop)	[production]
15:23	<herron>	puppet master reboots complete. re-enabling puppet agents	[production]
15:18	<ema>	lvs3001 disk swap: failover traffic to lvs3003 T166965	[production]
15:10	<elukey>	reboot analytics1028 (hadoop worker and hdfs journal node) for kernel updates	[production]