production SAL

2014-06-09 §
20:42	<manybubbles>	upgraded elastic1007-elastic1010 without issue - starting elastic1010	[production]
20:08	<subbu>	deployed Parsoid 9b673587 (deploy sha 7d0097a1)	[production]
19:23	<ottomata>	disabling puppet on analytics1012	[production]
19:00	<ottomata>	decommissioning analytics1012 in hadoop cluster, this will become a Kafka broker	[production]
17:59	<manybubbles>	elastic1004-1006 upgraded without trouble - cluster is working on filling elastic1006 before moving on to 1007, and the rest	[production]
17:04	<andrewbogott>	switching labs to puppet3	[production]
17:03	<awight>	update crm from b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7 to b6815d29de97b80a0ab65db576213a604f0c7cb9	[production]
16:30	<manybubbles>	upgrading elastic1003 - upgrade is going well so far so I'm going to stop watching it as closely and let it be more automated	[production]
15:28	<manybubbles>	elastic1001 went well, doing 1002 by hand again	[production]
15:17	<anomie>	Synchronized php-1.24wmf8/extensions/Wikidata: SWAT: Wikidata entity suggester bug fixes [[gerrit:138339]] (duration: 00m 16s)	[production]
15:12	<greg-g>	mw1151 still "permission denied" during deploys	[production]
15:12	<anomie>	Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable TemplateData GUI on Portuguese Wikipedia [[gerrit:137986]] (duration: 00m 14s)	[production]
15:09	<anomie>	Synchronized php-1.24wmf7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: SWAT: VE fix for focus regression [[gerrit:137978]] (duration: 00m 15s)	[production]
15:06	<andrewbogott>	beta updating all instances to puppet 3 via a cherry-pick of https://gerrit.wikimedia.org/r/#/c/137898/ on deployment-salt	[production]
15:05	<anomie>	Synchronized php-1.24wmf8/extensions/VisualEditor/modules/ve-mw/: SWAT: VE fix for focus regression and alignment issues [[gerrit:137971]] [[gerrit:138122]] (duration: 00m 14s)	[production]
15:01	<manybubbles>	successfully synced plugins, upgrading elastic1001 to make sure everything is working ok with it - then we'll run through the others more quickly	[production]
14:57	<manybubbles>	syncing elasticsearch plugins for 1.2.1 - any elasticsearch restart from here on out needs to come with 1.2.1 or the node will break.	[production]
14:54	<manybubbles>	starting Elasticsearch upgrade with elastic1001	[production]
07:14	<springle>	disabled puppet on analytics1021 to avoid kafka broker restarting with missing mount	[production]
05:15	<springle>	xtrabackup clone db1046 to db1020	[production]
04:44	<springle>	umount /dev/sdf on analytics1021, fs in r/o mode, kafka broker not running. no checks yet	[production]
03:24	<LocalisationUpdate>	ResourceLoader cache refresh completed at Mon Jun 9 03:23:05 UTC 2014 (duration 23m 4s)	[production]
02:29	<LocalisationUpdate>	completed (1.24wmf8) at 2014-06-09 02:28:08+00:00	[production]
02:15	<LocalisationUpdate>	completed (1.24wmf7) at 2014-06-09 02:14:46+00:00	[production]
2014-06-08 §
23:27	<p858snake|l>	icinga has been shitting in the channel for 9+ hours (before I went to bed) about Varnishkafka, nothing noted in SAL. Here be a note about it.	[production]
03:22	<LocalisationUpdate>	ResourceLoader cache refresh completed at Sun Jun 8 03:21:28 UTC 2014 (duration 21m 27s)	[production]
02:28	<LocalisationUpdate>	completed (1.24wmf8) at 2014-06-08 02:27:21+00:00	[production]
02:15	<LocalisationUpdate>	completed (1.24wmf7) at 2014-06-08 02:14:10+00:00	[production]
2014-06-07 §
23:48	<hoo>	Fixed four CentralAuth log entries on meta which were logged for WikiSets/0	[production]
21:36	<manybubbles>	that means I turned off puppet and shut down Elasticsearch on elastic1017 - you can expect the cluster to go yellow for half an hour or so while the other nodes rebuild the redundancy that elastic1017 had	[production]
21:35	<manybubbles>	after consulting logs - elastic1017 has had high io wait since it was deployed - I'm taking it out of rotation	[production]
21:31	<manybubbles>	elastic1017 is sick - thrashing to death on io - restarting Elasticsearch to see if it recovers unthrashed	[production]
17:56	<godog>	restarted ES on elastic1017.eqiad.wmnet (at 17:22 UTC)	[production]
03:24	<LocalisationUpdate>	ResourceLoader cache refresh completed at Sat Jun 7 03:23:32 UTC 2014 (duration 23m 31s)	[production]
02:31	<LocalisationUpdate>	completed (1.24wmf8) at 2014-06-07 02:29:57+00:00	[production]
02:17	<LocalisationUpdate>	completed (1.24wmf7) at 2014-06-07 02:16:30+00:00	[production]
2014-06-06 §
23:51	<Krinkle>	Restarted Jenkins, force stopped Zuul, started Zuul, configured Jenkins via web interface (disable Gearman, save, enable Gearman); seems to be back up now, finally.	[production]
22:52	<mutante>	same for rhenium, titanium, bast1001, calcium, carbon, ytterbium, stat1003	[production]
22:43	<RoanKattouw>	Restarting Jenkins didn't help, jobs still aren't making it across from Zuul into Jenkins	[production]
22:36	<RoanKattouw>	Restarting stuck Jenkins	[production]
22:35	<mutante>	same for holmium, hafnium, silver, netmon1001, magnesium, neon, antimony	[production]
22:17	<mutante>	upgraded ssl packages on zirconium	[production]
21:57	<Krinkle>	Took Jenkins slave on gallium temporarily offline and back online to resolve possible stagnation	[production]
20:56	<awight_>	updated crm from ded541894a70922e098fb3ea48306c8ec0f0f6aa to b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7	[production]
18:25	<mwalker>	updating payments from e823354822c7a35e6c2069d3e72180a45dbc89dc to b4c5cf1bceb70d65eae28cdd0873036dc33c8992 for globalcollect oid hack	[production]
14:04	<hashar>	Gerrit back. chase rebooted it :)	[production]
13:55	<hashar>	Gerrit having some troubles: error: RPC failed; result=22, HTTP code = 503 (while cloning CirrusSearch )	[production]
12:58	<cmjohnson1>	replacing raid controller db1020	[production]
06:12	<Tim>	on osmium installed nodejs for testing	[production]
04:24	<LocalisationUpdate>	ResourceLoader cache refresh completed at Fri Jun 6 04:23:08 UTC 2014 (duration 23m 7s)	[production]