2014-10-27
19:39 <manybubbles> after restarting elasticsearch we expected to get memory errors again. no such luck so far.... [production]
18:57 <manybubbles> completed restarting elasticsearch cluster. now it'll make a useful file on out of memory errors. raised the recovery throttling so it'll recover fast enough to cause oom errors [production]
18:48 <maxsem> Synchronized php-1.25wmf4/extensions/GeoData: live hack to disable geosearch (duration: 00m 04s) [production]
18:37 <manybubbles> note that this is a restart without waiting for the cluster to go green after each restart. I expect lots of whining from icinga. This will cause us to lose some updates but should otherwise be safe. [production]
18:34 <manybubbles> restarting elasticsearch servers to pick up new gc logging and to reset them into a "working" state so they can have their gc problem again and we can log it properly this time. [production]
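For context, GC logging on a HotSpot JVM of that era was typically enabled with flags along these lines. The exact flags deployed to the cluster are not recorded in this log, so the paths and options below are assumptions, not the production configuration:

```shell
# Hypothetical sketch: GC-logging and heap-dump flags for Elasticsearch 1.x
# on a Java 7 HotSpot JVM, e.g. appended to ES_JAVA_OPTS in
# /etc/default/elasticsearch. Paths are assumptions.
ES_JAVA_OPTS="$ES_JAVA_OPTS \
  -Xloggc:/var/log/elasticsearch/gc.log \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/var/log/elasticsearch/"
```

With `-XX:+HeapDumpOnOutOfMemoryError` set, the JVM writes a heap dump at the moment of the OOM, avoiding the problem noted below where nodes froze while a dump was being taken manually.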
18:15 <aaron> Synchronized wmf-config/CommonSettings.php: Remove obsolete flags (all of them) from $wgAntiLockFlags (duration: 00m 07s) [production]
17:53 <cmjohnson> replacing disk /dev/sdl slot 11 ms-be1013 [production]
17:37 <_joe_> uploaded a version of jemalloc for trusty with --enable-prof [production]
16:31 <^d> elasticsearch: temporarily raised node_concurrent_recoveries from 3 to 5. [production]
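A change like the one above would normally go through the Elasticsearch 1.x cluster settings API; this is a sketch of what that call might have looked like, with host and port as assumptions:

```shell
# Hypothetical sketch: raise per-node concurrent recoveries from the default
# via a transient cluster setting (reverts on full cluster restart).
# localhost:9200 is an assumption, not the production endpoint.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 5
  }
}'
```

A transient (rather than persistent) setting fits a temporary bump like this one, since it does not survive a full cluster restart.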
15:32 <demon> Synchronized wmf-config/InitialiseSettings.php: Enable Cirrus as secondary everywhere, brings back GeoData (duration: 00m 04s) [production]
15:08 <manybubbles> It's unclear how much of the master going haywire is something that'll be fixed in elasticsearch 1.4. They've done a lot of work there on the cluster state communication. [production]
15:03 <manybubbles> restarting gmond on all elasticsearch systems because stats aren't updating properly in ganglia and usually that helps [production]
15:02 <manybubbles> restarted a bunch of the elasticsearch nodes that had their heap full. wasn't able to get a heap dump on any of them because they all froze while trying to get the heap dump. [production]
14:32 <^d> elasticsearch: disabling replica allocation, fewer things moving about if we restart cluster [production]
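Disabling replica allocation ahead of a rolling restart keeps the cluster from rebalancing shards every time a node drops out. The log does not record the exact setting used; in Elasticsearch 1.x this was commonly done like so (endpoint is an assumption):

```shell
# Hypothetical sketch: allow allocation of primaries only while nodes are
# being restarted (Elasticsearch 1.x cluster settings API).
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "primaries" }
}'

# ...and restore normal allocation once the cluster is back:
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'
```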
13:47 <manybubbles> Synchronized wmf-config/InitialiseSettings.php: fall back to lsearchd for a bit (duration: 00m 05s) [production]
13:41 <manybubbles> Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 05s) [production]
13:29 <manybubbles> restarted elasticsearch on elastic1017 - memory was totally full there [production]
13:21 <manybubbles> elastic1008 is logging gc issues. restarting it because that might help it [production]
05:04 <springle> forced logrotate ocg1001 [production]
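Forcing a rotation bypasses logrotate's size and age thresholds; which config was rotated on ocg1001 is not recorded, so the path below is an assumption:

```shell
# Hypothetical sketch: force immediate rotation regardless of the
# size/age conditions in the config. The config path is an assumption.
logrotate --force --verbose /etc/logrotate.conf
```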
03:36 <LocalisationUpdate> ResourceLoader cache refresh completed at Mon Oct 27 03:36:39 UTC 2014 (duration 36m 38s) [production]
02:27 <LocalisationUpdate> completed (1.25wmf5) at 2014-10-27 02:27:45+00:00 [production]
02:17 <LocalisationUpdate> completed (1.25wmf4) at 2014-10-27 02:17:08+00:00 [production]
2014-10-26
23:46 <Krinkle> Force restarted Zuul [production]
15:14 <Guest19240> Jenkins/Zuul is stuck as of 20 hours ago [production]
15:06 <_joe_> restarted hhvm on mw1114, memory nearly exhausted [production]
03:36 <LocalisationUpdate> ResourceLoader cache refresh completed at Sun Oct 26 03:36:20 UTC 2014 (duration 36m 19s) [production]
02:25 <LocalisationUpdate> completed (1.25wmf5) at 2014-10-26 02:25:47+00:00 [production]
02:15 <LocalisationUpdate> completed (1.25wmf4) at 2014-10-26 02:15:12+00:00 [production]
2014-10-25
22:49 <paravoid> upgrading JunOS on cr1-ulsfo [production]
22:32 <paravoid> scheduling downtime for all ulsfo -lb- & cr1/2-ulsfo [production]
21:30 <ori> Synchronized php-1.25wmf5/extensions/CentralNotice/CentralNotice.hooks.php: Iee2072ac7: Make sure we declare globals before using them (duration: 00m 06s) [production]
21:30 <ori> Synchronized php-1.25wmf4/extensions/CentralNotice/CentralNotice.hooks.php: Iee2072ac7: Make sure we declare globals before using them (duration: 00m 06s) [production]
20:41 <bd808> updated logstash-* labs instances to salt minion 2014.1.11 (thanks for the ping apergos) [production]
03:46 <LocalisationUpdate> ResourceLoader cache refresh completed at Sat Oct 25 03:46:48 UTC 2014 (duration 46m 47s) [production]
02:29 <LocalisationUpdate> completed (1.25wmf5) at 2014-10-25 02:29:29+00:00 [production]
02:18 <LocalisationUpdate> completed (1.25wmf4) at 2014-10-25 02:18:14+00:00 [production]
00:27 <awight> updated DjangoBannerStats from cf5a875d49f4c4cf229d7f864a73d4c2f588ebf9 to a3038f133d64c737d3987bd1c37a987fd3003dd6 [production]
2014-10-24
22:40 <akosiaris> puppet disabled on uranium, do not enable [production]
20:52 <andrewbogott> revived virt1006 on a probationary basis. It's running compute but is disabled so new instances won't be scheduled there. I've moved a few test instances there to see how it behaves. [production]
20:36 <andrew> Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 04s) [production]
20:29 <Reedy> sync-common on mw1088 [production]
20:23 <mutante> mw1088 - gzipping core dump files, disabled core dumps, restarted apache [production]
20:15 <mutante> mw1088 - gzip other_vhosts_access.log.1 - Avail. 38G [production]
20:15 <Reedy> / full on mw1088 due to apache core dumps [production]
20:09 <Reedy> running sync-common on mw1041 [production]
20:04 <mutante> powercycled mw1041 [production]
20:03 <reedy> Synchronized php-1.25wmf5/extensions/SemanticForms/: noop for prod (duration: 00m 17s) [production]
20:01 <Reedy> mw1041 is down [production]
20:01 <Reedy> mw1088 has a full / [production]
20:00 <reedy> Synchronized php-1.25wmf4/extensions/SemanticForms/: noop for prod (duration: 00m 16s) [production]