production SAL


2017-06-26
12:58	<marostegui>	Deploy alter table on db2062 and db2055 - T168661	[production]
12:55	<elukey>	reboot mw129[5,6,7,8] for kernel update (mw imagescalers, two at a time)	[production]
12:02	<marostegui>	Deploy alter table on s2 codfw master (db2017) and let it replicate - T168661	[production]
11:05	<godog>	roll-restart pybal in codfw to pick up thumbor.svc.codfw.wmnet	[production]
10:28	<elukey>	reboot mw1288->90 for kernel updates (last batch of api-appservers)	[production]
10:18	<elukey>	reboot mw128[4,5,6,7] for kernel updates (api-appservers)	[production]
10:03	<godog>	roll-restart nginx on thumbor to disable te: chunked	[production]
09:34	<elukey>	reboot mw128[0,1,2,3] for kernel updates (api-appservers)	[production]
09:04	<elukey>	reboot mw127[6,7,8,9] for kernel updates (api-appservers)	[production]
08:58	<elukey>	reboot mw127[3,4,5] for kernel updates (appservers)	[production]
08:50	<gehel>	starting restart of elasticsearch codfw for kernel upgrade	[production]
08:48	<elukey>	reboot mw1269 -> mw1272 for kernel updates (appservers)	[production]
08:37	<godog>	roll-restart swift-proxy to use thumbor for commons	[production]
08:28	<elukey>	reboot mw1258, 126[6,7,8] for kernel updates (appservers)	[production]
08:11	<elukey>	reboot mw125[4,5,6,7] for kernel updates (appservers)	[production]
07:55	<marostegui>	Stop replication on db1069:3313 (s3) and db1044 in the same position - T166546	[production]
07:15	<elukey>	restart pdfrender on scb1002 for the xpra issue	[production]
07:08	<elukey>	powercycle elastic1017 (stuck in console, no ssh access)	[production]
06:57	<marostegui>	Drop table wikilove_image_log from silver - T127219	[production]
06:56	<elukey>	truncated neutron-server.log files in /var/log on labtestnet2001 to free some space in root	[production]
06:55	<marostegui>	Drop table wikilove_image_log from s1 - T127219	[production]
06:51	<marostegui>	Drop table wikilove_image_log from s3 - T127219	[production]
06:50	<elukey>	execute sudo -u _graphite find /var/lib/carbon/whisper/eventstreams/rdkafka -type f -mtime +15 -delete on graphite1001 to free some space for /var/lib/carbon	[production]
06:49	<marostegui>	Drop table wikilove_image_log from s7 - T127219	[production]
06:47	<marostegui>	Drop table wikilove_image_log from s2 - T127219	[production]
06:45	<marostegui>	Drop table wikilove_image_log from s4 - T127219	[production]
06:44	<marostegui>	Drop table wikilove_image_log from s6 - T127219	[production]
06:36	<marostegui>	Deploy alter table s7 - db1086 - T166208	[production]
06:35	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1086 - T166208 (duration: 00m 46s)	[production]
06:26	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Remove comments from db1041 long running alter status - T166208 (duration: 00m 47s)	[production]
03:01	<l10nupdate@tin>	ResourceLoader cache refresh completed at Mon Jun 26 03:01:35 UTC 2017 (duration 6m 52s)	[production]
02:54	<l10nupdate@tin>	scap sync-l10n completed (1.30.0-wmf.6) (duration: 08m 04s)	[production]
02:27	<l10nupdate@tin>	scap sync-l10n completed (1.30.0-wmf.5) (duration: 08m 03s)	[production]
2017-06-25
09:00	<elukey>	Executing 'sudo -u _graphite find /var/lib/carbon/whisper/eventstreams/rdkafka -type f -mtime +15 -delete' on graphite1001 to free some space (/var/lib/carbon filling up) - T1075	[production]
2017-06-23
23:42	<akosiaris>	bounce celery-ores-worker on scb1004	[production]
19:38	<ppchelko@tin>	Finished deploy [changeprop/deploy@ffabd13]: Re-enable ORES rules back (duration: 01m 07s)	[production]
19:37	<ppchelko@tin>	Started deploy [changeprop/deploy@ffabd13]: Re-enable ORES rules back	[production]
19:34	<akosiaris>	restart celery-ores-workers on scb1001, scb1002, scb1003, leave scb1004 alone	[production]
18:39	<godog>	roll restart celery-ores-worker in codfw	[production]
17:01	<mobrovac@tin>	Finished deploy [changeprop/deploy@1f45fae]: Temporary disable ORES (ongoing outage) (duration: 01m 19s)	[production]
16:59	<mobrovac@tin>	Started deploy [changeprop/deploy@1f45fae]: Temporary disable ORES (ongoing outage)	[production]
16:44	<mobrovac>	scb1001 disabling puppet	[production]
16:34	<akosiaris>	restart celery ores worker on scb1003	[production]
15:54	<hashar_>	Restarted Jenkins	[production]
15:45	<godog>	bounce celery-ores-worker on scb1001 with logging level INFO	[production]
13:51	<akosiaris>	issue flushdb on oresrdb1001:6379	[production]
13:21	<akosiaris>	issue flushdb on oresrdb1001:6379	[production]
13:13	<akosiaris>	bump uwsgi-ores and celery-ores-worker on scb100*	[production]
12:38	<akosiaris>	disable changeprop due to ORES issues	[production]
12:26	<Amir1>	restarting celery and uwsgi on all scb nodes in eqiad	[production]