2015-07-11
|
19:48 |
<jynus> |
stopping labsdb1002 after table corruption has been detected |
[production] |
19:37 |
<urandom> |
from restbase1002, starting revision culling process (node thin_out_key_rev_value_data.js `hostname -i` local_group_wikimedia_T_parsoid_html 2>&1 | tee >(gzip -c > local_group_wikimedia_T_parsoid_html.log.`date +%s`.gz)) |
[production] |
19:33 |
<urandom> |
restbase: setting gc_grace_seconds to 604800 (1 week) on local_group_wikipedia_T_parsoid_html.data |
[production] |
04:56 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sat Jul 11 04:55:56 UTC 2015 (duration 55m 55s) |
[production] |
04:21 |
<bd808> |
Logstash cluster upgrade complete! Kibana working again |
[production] |
04:21 |
<bd808> |
Upgraded Elasticsearch to 1.6.0 on logstash1006 |
[production] |
04:12 |
<bd808> |
rebooting logstash1006 |
[production] |
04:07 |
<bd808> |
logstash1005 fully recovered all shards |
[production] |
03:21 |
<mattflaschen> |
Synchronized php-1.26wmf13/extensions/Flow/includes/Parsoid/Utils.php: Bump Flow to encode page name when sending to Parsoid (duration: 00m 13s) |
[production] |
02:28 |
<LocalisationUpdate> |
completed (1.26wmf13) at 2015-07-11 02:28:18+00:00 |
[production] |
02:25 |
<l10nupdate> |
Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 07s) |
[production] |
02:25 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sat Jul 11 02:25:19 UTC 2015 (duration 25m 18s) |
[production] |
02:09 |
<LocalisationUpdate> |
completed (1.26wmf13) at 2015-07-11 02:09:45+00:00 |
[production] |
02:09 |
<l10nupdate> |
Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 35s) |
[production] |
00:46 |
<bd808> |
Upgraded Elasticsearch to 1.6.0 on logstash1005; replicas recovering now |
[production] |
00:35 |
<bd808> |
rebooting logstash1005 |
[production] |
00:30 |
<bd808> |
logstash1004 fully recovered all shards |
[production] |
2015-07-10
|
22:51 |
<mutante> |
tendril: very short maintenance downtime |
[production] |
20:10 |
<bd808> |
`service elasticsearch start` not starting on logstash1004; investigating |
[production] |
20:07 |
<bd808> |
ran apt-get upgrade on logstash1004 |
[production] |
19:52 |
<mutante> |
adminbot - built and imported 1.7.10 into APT repo |
[production] |
19:43 |
<bd808> |
rebooting logstash1004 |
[production] |
19:40 |
<bd808> |
Kibana seems to be broken by mixed 1.6.0/1.3.9 cluster |
[production] |
19:32 |
<bd808> |
kibana not seeing indices after upgrading elasticsearch to 1.6.0; investigating |
[production] |
19:26 |
<bd808> |
Upgraded logstash1003 to elasticsearch 1.6.0 |
[production] |
19:22 |
<bd808> |
Upgraded logstash1002 to elasticsearch 1.6.0 |
[production] |
19:19 |
<bd808> |
Upgraded logstash1001 to elasticsearch 1.6.0 |
[production] |
19:10 |
<krenair> |
Synchronized php-1.26wmf13/extensions/VisualEditor/lib/ve/src/ce/nodes/ve.ce.TableNode.js: https://gerrit.wikimedia.org/r/#/c/224122/ (duration: 00m 12s) |
[production] |
18:11 |
<gwicke> |
ansible -i production restbase -a 'nodetool setcompactionthroughput 120' |
[production] |
18:00 |
<gwicke> |
ansible -i production restbase -a 'nodetool setcompactionthroughput 90' |
[production] |
17:49 |
<gwicke> |
rolling restart of the cassandra cluster to apply https://gerrit.wikimedia.org/r/#/c/224114/ |
[production] |
17:32 |
<demon> |
Synchronized wmf-config/CommonSettings.php: prevent race condition on writing settings (duration: 00m 13s) |
[production] |
17:26 |
<moritzm> |
installed python security updates on mc* |
[production] |
17:25 |
<Coren> |
rebooting labstore2001 (experiments with the new raid setup caused the mapper table to fill) |
[production] |
16:35 |
<mobrovac> |
restbase deploying hotfix for T105509 |
[production] |
15:29 |
<mobrovac> |
restbase: restarted restbase on restbase1004 |
[production] |
15:25 |
<godog> |
bounce cassandra on restbase1004 |
[production] |
13:43 |
<godog> |
bounce cassandra on restbase1004 |
[production] |
13:37 |
<_joe_> |
temporarily repooled mw1031 |
[production] |
12:40 |
<godog> |
bounce cassandra on restbase1004 |
[production] |
07:43 |
<godog> |
reimage ms-be2013 T105213 |
[production] |
04:36 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Fri Jul 10 04:36:49 UTC 2015 (duration 36m 48s) |
[production] |
04:33 |
<springle> |
Synchronized wmf-config/db-eqiad.php: depool db1037; repool db1030 (revert below) (duration: 00m 12s) |
[production] |
04:28 |
<springle> |
Synchronized wmf-config/db-eqiad.php: repool db1037; depool db1030 (duration: 00m 13s) |
[production] |
03:14 |
<mutante> |
re-enabling puppet on tools-exec-1213, working around adminbot package install fail |
[production] |
02:59 |
<elee> |
please log this with the year |
[production] |
02:53 |
<andrewbogott> |
testing the log by logging a test |
[production] |
01:50 |
<gwicke> |
bounced cassandra on restbase1004 |
[production] |
01:38 |
<jgage> |
cassandra restarted on restbase1004 |
[production] |
00:39 |
<urandom> |
starting restbase1004 |
[production] |