2016-11-05
|
21:39 |
<bd808> |
Deleted huge logstash1002:/var/log/logstash/logstash.log.1 log file; disk full |
[production] |
21:36 |
<bd808@tin> |
Synchronized wmf-config/InitialiseSettings.php: logstash: Temporarily disable EventBus channel (T150106) (duration: 00m 50s) |
[production] |
19:54 |
<bd808> |
ELK stack problems are related to Elasticsearch index mapping. Some events are being rejected for not matching the expected mappings, and that is filling up the disk on the logstash ingestion hosts |
[production] |
19:45 |
<bd808> |
Forced several puppet runs on logstash1001 until things stopped changing; out of disk seemed to have messed up apt upgrades |
[production] |
19:38 |
<bd808> |
Elasticsearch on logstash1001 won't restart due to missing /etc/elasticsearch/scripts directory |
[production] |
19:23 |
<bd808> |
Restarted logstash on logstash1001 |
[production] |
19:14 |
<bd808> |
Deleted huge logstash1001:/var/log/logstash/logstash.log.1 log file; disk full and difficult to debug with no free space on / |
[production] |
02:21 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Sat Nov 5 02:21:37 UTC 2016 (duration 4m 36s) |
[production] |
02:17 |
<l10nupdate@tin> |
scap sync-l10n completed (1.29.0-wmf.1) (duration: 05m 25s) |
[production] |
2016-11-04
|
22:43 |
<godog> |
stop puppet on einsteinium and tegment to avoid log spam - T150061 |
[production] |
21:14 |
<urandom> |
T133395: Starting user-defined compaction of local_group_wikipedia_T_parsoid_html.data, files la-169018-big-Data.db and la-171488-big-Data.db |
[production] |
21:06 |
<godog> |
compress huge daemon.log on einsteinium into /srv/ |
[production] |
18:11 |
<moritzm> |
uploaded new jessie linux package based on 4.4.30 to carbon |
[production] |
18:01 |
<paravoid> |
moving mc1033-mc1036 from asw-d-eqiad to asw2-d-eqiad |
[production] |
17:54 |
<paravoid> |
reactivating cr1-eqiad:ae4 and its subinterfaces (VRRP bug seems to have been worked around) |
[production] |
17:44 |
<paravoid> |
moved cr1-eqiad:ae4 links from asw-d-eqiad:ae1 to asw2-d-eqiad:ae1 |
[production] |
16:38 |
<ema> |
upgrading cp4018 (text-ulsfo) to varnish 4 -- T131503 |
[production] |
16:22 |
<ema> |
upgrading cp4017 (text-ulsfo) to varnish 4 -- T131503 |
[production] |
16:00 |
<ema> |
upgrading cp4016 (text-ulsfo) to varnish 4 -- T131503 |
[production] |
15:37 |
<ema> |
upgrading cp4010 (text-ulsfo) to varnish 4 -- T131503 |
[production] |
15:36 |
<paravoid> |
set up 4x10G (ae0) links between asw-d-eqiad<->asw2-d-eqiad |
[production] |
15:35 |
<marostegui> |
reimage dbstore2002 - T150017 |
[production] |
15:20 |
<reedy@tin> |
Synchronized wmf-config/InitialiseSettings.php: Remove wikitech bot group (duration: 00m 47s) |
[production] |
15:17 |
<reedy@tin> |
Synchronized wmf-config/CommonSettings.php: Simplify some wikitech config (duration: 00m 47s) |
[production] |
15:16 |
<reedy@tin> |
Synchronized wmf-config/wikitech.php: Stop double loading OATHAuth now, remove commented config (duration: 00m 47s) |
[production] |
15:15 |
<reedy@tin> |
Synchronized wmf-config/InitialiseSettings.php: Normalise wikitech OATHAuth loading config (duration: 00m 48s) |
[production] |
15:06 |
<reedy@tin> |
Synchronized wmf-config/InitialiseSettings.php: Enable OATHAuth on all private wikis (duration: 00m 49s) |
[production] |
15:04 |
<reedy@tin> |
Synchronized wmf-config/CommonSettings.php: Raise password requirements for private wikis, Abuse filter editors on enwiki, and set minimum bot password length to 8 (duration: 00m 47s) |
[production] |
15:02 |
<reedy@tin> |
Synchronized wmf-config/InitialiseSettings.php: stage wmgElevateDefaultPasswordPolicy (duration: 00m 48s) |
[production] |
14:49 |
<ema> |
upgrading cp4009 (text-ulsfo) to varnish 4 -- T131503 |
[production] |
14:14 |
<ema> |
upgrading cp4008 (text-ulsfo) to varnish 4 -- T131503 |
[production] |
11:33 |
<mobrovac> |
restarting zotero |
[production] |
10:58 |
<moritzm> |
installing tar security updates |
[production] |
10:21 |
<moritzm> |
upgrading memcached on swift frontend servers in esams |
[production] |
10:00 |
<jynus> |
stopping db2011 for backup and reimage |
[production] |
09:59 |
<moritzm> |
upgrading memcached on swift frontend servers in codfw |
[production] |
09:54 |
<moritzm> |
upgrading memcached on jessie graphite systems |
[production] |
09:26 |
<_joe_> |
rebooting copper to allow enabling the memory cgroup |
[production] |
09:10 |
<marostegui> |
Reimage db2034 - T149553 |
[production] |
07:20 |
<jynus> |
disabling alerting for slave lag fleet-wide for 1 hour to deploy new alerting script |
[production] |
06:52 |
<_joe_> |
manually restarted varnish text-backend on cp3041 - automatic restarts were failing with "no space left on device" |
[production] |
02:31 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Fri Nov 4 02:31:26 UTC 2016 (duration 4m 39s) |
[production] |
02:26 |
<l10nupdate@tin> |
scap sync-l10n completed (1.29.0-wmf.1) (duration: 09m 08s) |
[production] |
01:54 |
<madhuvishy> |
Manually reimaging labstore2003 (T149870) |
[production] |
01:35 |
<catrope@tin> |
Synchronized php-1.29.0-wmf.1/extensions/Thanks: Avoid breakage after Flow uninstallation (duration: 00m 47s) |
[production] |
01:18 |
<catrope@terbium> |
scap failed: IOError [Errno 13] Permission denied: u'/srv/mediawiki-staging/wmf-config/ExtensionMessages-1.29.0-wmf.1.php' (duration: 00m 20s) |
[production] |
01:17 |
<catrope@terbium> |
Started scap: (no message) |
[production] |
00:48 |
<catrope@tin> |
Synchronized dblists/: Disable Flow on enwiki (T148611) (duration: 01m 04s) |
[production] |