2020-10-13 §
21:16 <mutante> icinga had gerrit health alert but did not notice an issue myself and was gone next check [production]
21:12 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
21:12 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
21:09 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
21:07 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:44 <mutante> bast1002 - apt-get autoremove - cleans up golang and ruby packages [production]
20:44 <mutante> bast1002 - apt-get remove nmap (it can be used on netmon hosts and was not consistent with other bast hosts) [production]
20:15 <ebernhardson> unban elastic2029 from production-search-psi-codfw [production]
20:14 <ebernhardson> restart production-search-psi-codfw on elastic2029 to reset any wonkiness from gc hell [production]
20:06 <marxarelli> 1.36.0-wmf.13 promoted to group0. no new or concerning errors or changes in error rates (T263179) [production]
20:03 <ebernhardson> add elastic2029-production-search-psi-codfw to cluster.routing.allocation.exclude._name to drain active shards, instance currently in gc hell [production]
19:54 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.13 [production]
19:52 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
19:49 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:40 <dduvall@deploy1001> Finished scap: testwikis wikis to 1.36.0-wmf.13 (duration: 40m 51s) [production]
19:00 <dduvall@deploy1001> Started scap: testwikis wikis to 1.36.0-wmf.13 [production]
18:58 <dduvall@deploy1001> Pruned MediaWiki: 1.36.0-wmf.9 (duration: 01m 56s) [production]
18:56 <dduvall@deploy1001> Pruned MediaWiki: 1.36.0-wmf.8 (duration: 02m 10s) [production]
18:53 <dduvall@deploy1001> Pruned MediaWiki: 1.36.0-wmf.6 (duration: 13m 00s) [production]
18:23 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.11 [production]
18:21 <marxarelli> 1.36.0-wmf.11 promoted to group1. no new errors (T263177). promoting to all wikis [production]
18:10 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
18:09 <robh> scs-c1-codfw mgmt firmware updated, updating scs-a1-codfw T238036 [production]
18:08 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:01 <robh> scs-c1-codfw firmware update via T238036 [production]
17:47 <marxarelli> 1.36.0-wmf.13 branched at a6be801fc6331a6a6b96f02f368750200d50ab09 for T263179 [production]
17:35 <dduvall@deploy1001> Synchronized php: group1 wikis to 1.36.0-wmf.11 (duration: 01m 07s) [production]
17:34 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.11 [production]
17:32 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
17:32 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
17:30 <marxarelli> 1.36.0-wmf.11 promoted to group0. no new errors (T263177). preparing to promote to group1 [production]
17:18 <ppchelko@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'production' . [production]
17:18 <ppchelko@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'canary' . [production]
17:17 <ppchelko@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'production' . [production]
17:16 <ppchelko@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'canary' . [production]
17:15 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'canary' . [production]
17:15 <ppchelko@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' . [production]
16:39 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.11 [production]
16:31 <ebernhardson@deploy1001> Finished deploy [wikimedia/discovery/analytics@77febb6]: airflow: parameterize active mediawiki dc (duration: 05m 29s) [production]
16:26 <ebernhardson@deploy1001> Started deploy [wikimedia/discovery/analytics@77febb6]: airflow: parameterize active mediawiki dc [production]
15:56 <papaul> power down ms-be2036 for maintenance [production]
15:02 <godog> bounce logstash on logstash1007, GC death [production]
14:41 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:39 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:18 <urbanecm@deploy1001> Synchronized wmf-config/CommonSettings.php: 5b28fd685b9cb8d8e93650b5d02bc41b81d0883c: Add setmentor to wgAvailableRights (duration: 00m 59s) [production]
13:42 <jayme@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
13:40 <jayme@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' . [production]
13:15 <Urbanecm> [urbanecm@mwmaint2001 ~]$ mwscript namespaceDupes.php --wiki=trwiki --add-prefix=BROKEN --fix # T265336 [production]
13:08 <moritzm> imported php-mailparse, php-mongodb, php-msgpack to component/icu63 T264991 [production]
12:50 <Urbanecm> urbanecm@mwmaint2001:~$ mwscript namespaceDupes.php --wiki=trwiki --add-prefix=FIXME --fix # T265336 [production]