2014-06-09 §
17:59 <manybubbles> elastic1004-1006 upgraded without trouble - cluster is working on filling elastic1006 before moving on to 1007, and the rest [production]
17:04 <andrewbogott> switching labs to puppet3 [production]
17:03 <awight> update crm from b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7 to b6815d29de97b80a0ab65db576213a604f0c7cb9 [production]
16:30 <manybubbles> upgrading elastic1003 - upgrade is going well so far so I'm going to stop watching it as closely and let it be more automated [production]
15:28 <manybubbles> elastic1001 went well, doing 1002 by hand again [production]
15:17 <anomie> Synchronized php-1.24wmf8/extensions/Wikidata: SWAT: Wikidata entity suggester bug fixes [[gerrit:138339]] (duration: 00m 16s) [production]
15:12 <greg-g> mw1151 still "permission denied" during deploys [production]
15:12 <anomie> Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable TemplateData GUI on Portuguese Wikipedia [[gerrit:137986]] (duration: 00m 14s) [production]
15:09 <anomie> Synchronized php-1.24wmf7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: SWAT: VE fix for focus regression [[gerrit:137978]] (duration: 00m 15s) [production]
15:06 <andrewbogott> beta updating all instances to puppet 3 via a cherry-pick of https://gerrit.wikimedia.org/r/#/c/137898/ on deployment-salt [production]
15:05 <anomie> Synchronized php-1.24wmf8/extensions/VisualEditor/modules/ve-mw/: SWAT: VE fix for focus regression and alignment issues [[gerrit:137971]] [[gerrit:138122]] (duration: 00m 14s) [production]
15:01 <manybubbles> successfully synced plugins, upgrading elastic1001 to make sure everything is working ok with it - then we'll run through the others more quickly [production]
14:57 <manybubbles> syncing elasticsearch plugins for 1.2.1 - any elasticsearch restart from here on out needs to come with 1.2.1 or the node will break. [production]
14:54 <manybubbles> starting Elasticsearch upgrade with elastic1001 [production]
07:14 <springle> disabled puppet on analytics1021 to avoid kafka broker restarting with missing mount [production]
05:15 <springle> xtrabackup clone db1046 to db1020 [production]
04:44 <springle> umount /dev/sdf on analytics1021, fs in r/o mode, kafka broker not running. no checks yet [production]
03:24 <LocalisationUpdate> ResourceLoader cache refresh completed at Mon Jun 9 03:23:05 UTC 2014 (duration 23m 4s) [production]
02:29 <LocalisationUpdate> completed (1.24wmf8) at 2014-06-09 02:28:08+00:00 [production]
02:15 <LocalisationUpdate> completed (1.24wmf7) at 2014-06-09 02:14:46+00:00 [production]
2014-06-08 §
23:27 <p858snake|l> icinga has been shitting in the channel for 9+ hours (before I went to bed) about Varnishkafka, nothing noted in SAL. Here be a note about it. [production]
03:22 <LocalisationUpdate> ResourceLoader cache refresh completed at Sun Jun 8 03:21:28 UTC 2014 (duration 21m 27s) [production]
02:28 <LocalisationUpdate> completed (1.24wmf8) at 2014-06-08 02:27:21+00:00 [production]
02:15 <LocalisationUpdate> completed (1.24wmf7) at 2014-06-08 02:14:10+00:00 [production]
2014-06-07 §
23:48 <hoo> Fixed four CentralAuth log entries on meta which were logged for WikiSets/0 [production]
21:36 <manybubbles> that means I turned off puppet and shut down Elasticsearch on elastic1017 - you can expect the cluster to go yellow for half an hour or so while the other nodes rebuild the redundancy that elastic1017 had [production]
21:35 <manybubbles> after consulting logs - elastic1017 has had high io wait since it was deployed - I'm taking it out of rotation [production]
21:31 <manybubbles> elastic1017 is sick - thrashing to death on io - restarting Elasticsearch to see if it recovers unthrashed [production]
17:56 <godog> restarted ES on elastic1017.eqiad.wmnet (at 17:22 UTC) [production]
03:24 <LocalisationUpdate> ResourceLoader cache refresh completed at Sat Jun 7 03:23:32 UTC 2014 (duration 23m 31s) [production]
02:31 <LocalisationUpdate> completed (1.24wmf8) at 2014-06-07 02:29:57+00:00 [production]
02:17 <LocalisationUpdate> completed (1.24wmf7) at 2014-06-07 02:16:30+00:00 [production]
2014-06-06 §
23:51 <Krinkle> Restarted Jenkins, force stopped Zuul, started Zuul, configured Jenkins via web interface (disable Gearman, save, enable Gearman); Seems to be back up now, finally. [production]
22:52 <mutante> same for rhenium, titanium, bast1001, calcium, carbon, ytterbium, stat1003 [production]
22:43 <RoanKattouw> Restarting Jenkins didn't help, jobs still aren't making it across from Zuul into Jenkins [production]
22:36 <RoanKattouw> Restarting stuck Jenkins [production]
22:35 <mutante> same for holmium, hafnium, silver, netmon1001, magnesium, neon, antimony [production]
22:17 <mutante> upgraded ssl packages on zirconium [production]
21:57 <Krinkle> Took Jenkins slave on gallium temporarily offline and back online to resolve possible stagnation [production]
20:56 <awight_> updated crm from ded541894a70922e098fb3ea48306c8ec0f0f6aa to b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7 [production]
18:25 <mwalker> updating payments from e823354822c7a35e6c2069d3e72180a45dbc89dc to b4c5cf1bceb70d65eae28cdd0873036dc33c8992 for globalcollect oid hack [production]
14:04 <hashar> Gerrit back. chase rebooted it :) [production]
13:55 <hashar> Gerrit having some troubles: error: RPC failed; result=22, HTTP code = 503 (while cloning CirrusSearch) [production]
12:58 <cmjohnson1> replacing raid controller db1020 [production]
06:12 <Tim> on osmium installed nodejs for testing [production]
04:24 <LocalisationUpdate> ResourceLoader cache refresh completed at Fri Jun 6 04:23:08 UTC 2014 (duration 23m 7s) [production]
03:13 <LocalisationUpdate> completed (1.24wmf8) at 2014-06-06 03:12:19+00:00 [production]
02:43 <LocalisationUpdate> completed (1.24wmf7) at 2014-06-06 02:42:28+00:00 [production]
00:38 <bblack> nginx restarted on ssl* [production]
00:16 <mutante> fixed permissions on bugzilla's index.cgi, sry [production]