2014-10-27
|
17:37 |
<_joe_> |
uploaded a version of jemalloc for trusty with --enable-prof |
[production] |
16:31 |
<^d> |
elasticsearch: temporarily raised node_concurrent_recoveries from 3 to 5. |
[production] |
15:32 |
<demon> |
Synchronized wmf-config/InitialiseSettings.php: Enable Cirrus as secondary everywhere, brings back GeoData (duration: 00m 04s) |
[production] |
15:08 |
<manybubbles> |
It's unclear how much of the master going haywire is something that'll be fixed in elasticsearch 1.4. They've done a lot of work there on the cluster state communication. |
[production] |
15:03 |
<manybubbles> |
restarting gmond on all elasticsearch systems because stats aren't updating properly in ganglia and usually that helps |
[production] |
15:02 |
<manybubbles> |
restarted a bunch of the elasticsearch nodes that had their heap full. wasn't able to get a heap dump on any of them because they all froze while trying to get the heap dump. |
[production] |
14:32 |
<^d> |
elasticsearch: disabling replica allocation, fewer things moving about if we restart the cluster |
[production] |
13:47 |
<manybubbles> |
Synchronized wmf-config/InitialiseSettings.php: fall back to lsearchd for a bit (duration: 00m 05s) |
[production] |
13:41 |
<manybubbles> |
Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 05s) |
[production] |
13:29 |
<manybubbles> |
restarted elasticsearch on elastic1017 - memory was totally full there |
[production] |
13:21 |
<manybubbles> |
elastic1008 is logging gc issues. restarting it because that might help it |
[production] |
05:04 |
<springle> |
forced logrotate ocg1001 |
[production] |
03:36 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Mon Oct 27 03:36:39 UTC 2014 (duration 36m 38s) |
[production] |
02:27 |
<LocalisationUpdate> |
completed (1.25wmf5) at 2014-10-27 02:27:45+00:00 |
[production] |
02:17 |
<LocalisationUpdate> |
completed (1.25wmf4) at 2014-10-27 02:17:08+00:00 |
[production] |
2014-10-25
|
22:49 |
<paravoid> |
upgrading JunOS on cr1-ulsfo |
[production] |
22:32 |
<paravoid> |
scheduling downtime for all ulsfo -lb- & cr1/2-ulsfo |
[production] |
21:30 |
<ori> |
Synchronized php-1.25wmf5/extensions/CentralNotice/CentralNotice.hooks.php: Iee2072ac7: Make sure we declare globals before using them (duration: 00m 06s) |
[production] |
21:30 |
<ori> |
Synchronized php-1.25wmf4/extensions/CentralNotice/CentralNotice.hooks.php: Iee2072ac7: Make sure we declare globals before using them (duration: 00m 06s) |
[production] |
20:41 |
<bd808> |
updated logstash-* labs instances to salt minion 2014.1.11 (thanks for the ping apergos) |
[production] |
03:46 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sat Oct 25 03:46:48 UTC 2014 (duration 46m 47s) |
[production] |
02:29 |
<LocalisationUpdate> |
completed (1.25wmf5) at 2014-10-25 02:29:29+00:00 |
[production] |
02:18 |
<LocalisationUpdate> |
completed (1.25wmf4) at 2014-10-25 02:18:14+00:00 |
[production] |
00:27 |
<awight> |
updated DjangoBannerStats from cf5a875d49f4c4cf229d7f864a73d4c2f588ebf9 to a3038f133d64c737d3987bd1c37a987fd3003dd6 |
[production] |
2014-10-24
|
22:40 |
<akosiaris> |
puppet disabled on uranium, do not enable |
[production] |
20:52 |
<andrewbogott> |
revived virt1006 on a probationary basis. It's running compute but is disabled so new instances won't be scheduled there. I've moved a few test instances there to see how it behaves. |
[production] |
20:36 |
<andrew> |
Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 04s) |
[production] |
20:29 |
<Reedy> |
sync-common on mw1088 |
[production] |
20:23 |
<mutante> |
mw1088 - gzipping core dump files, disabled core dumps, restarted apache |
[production] |
20:15 |
<mutante> |
mw1088 - gzip other_vhosts_access.log.1 - Avail. 38G |
[production] |
20:15 |
<Reedy> |
/ full on mw1088 due to apache core dumps |
[production] |
20:09 |
<Reedy> |
running sync-common on mw1041 |
[production] |
20:04 |
<mutante> |
powercycled mw1041 |
[production] |
20:03 |
<reedy> |
Synchronized php-1.25wmf5/extensions/SemanticForms/: noop for prod (duration: 00m 17s) |
[production] |
20:01 |
<Reedy> |
mw1041 is down |
[production] |
20:01 |
<Reedy> |
mw1088 has a full / |
[production] |
20:00 |
<reedy> |
Synchronized php-1.25wmf4/extensions/SemanticForms/: noop for prod (duration: 00m 16s) |
[production] |
19:53 |
<bblack> |
nickel's basically dead, uranium has been promoted to prod ganglia a little early for now |
[production] |
19:22 |
<awight> |
updated payments from 6fa864d4aaa22b9f271de4bc662be68bb0b40b56 to 525988487d6bbd08ddad50badd88e34e34104292 |
[production] |
18:55 |
<ori> |
repooled mw1189 to do heap profiling on production api workload. |
[production] |
17:58 |
<mutante> |
stat1001 - Duplicate declaration: Package[nodejs] |
[production] |
17:08 |
<cmjohnson> |
getting ready to replace a failed disk on ganglia (server:nickel)...it will be offline for a few minutes |
[production] |
17:05 |
<ejegg> |
updated dash from 58fda9403dd33e4d47238f119b6bb2b2905856b1 to 69c9330d6983873ffa9bb87fcd783be03382bdfc |
[production] |
15:50 |
<awight> |
campaigns reenabled |
[production] |