2020-10-14
|
15:58 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.reboot-workers |
[production] |
15:55 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) |
[production] |
15:29 |
<elukey> |
drain + reboot an-worker110[1,2] to pick up GPU settings - T255138 |
[production] |
15:28 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.reboot-workers |
[production] |
15:26 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) |
[production] |
15:24 |
<jayme> |
enabled and ran puppet on deploy1001 - T260917 |
[production] |
14:56 |
<elukey> |
drain + reboot an-worker109[8,9] to pick up GPU settings - T255138 |
[production] |
14:55 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.reboot-workers |
[production] |
14:12 |
<jayme> |
disable-puppet on deploy1001 to test a change in helmfile puppet on deploy2001 only - T260917 |
[production] |
14:01 |
<akosiaris> |
push a 6GB image, named docker-registry.discovery.wmnet/mwcachedir:0.0.1, containing the cache/ dir of a mediawiki installation to the registry. T264209 |
[production] |
14:01 |
<akosiaris> |
push a 6GB image, named docker-registry.discovery.wmnet/mwcachedir:0.0.1, containing the cache/ dir of a mediawiki installation to the registry. T265183 |
[production] |
13:53 |
<jbond42> |
enable puppet fleet wide after converting puppetdb stockpile queue to tmpfs |
[production] |
13:48 |
<jbond42> |
disable puppet fleet wide to convert puppetdb stockpile queue to tmpfs |
[production] |
12:46 |
<vgutierrez> |
Bump ECDHE-ECDSA-AES128-SHA pageview replacement to 10% - T258405 |
[production] |
11:50 |
<hnowlan@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' . |
[production] |
11:50 |
<hnowlan@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
11:48 |
<hnowlan@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
11:48 |
<hnowlan@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' . |
[production] |
11:43 |
<moritzm> |
imported php-memcached, php-redis to component/icu63 T264991 |
[production] |
11:25 |
<Urbanecm> |
EU B&C window completed |
[production] |
11:22 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: c63632de6a20b2f00da91187e5cf416fd39d8c5b: Enable DiscussionTools as a beta feature on 30 more wikis (T264693) (duration: 01m 15s) |
[production] |
11:16 |
<moritzm> |
imported php-igbinary, php-apcu-bc to component/icu63 T264991 |
[production] |
09:59 |
<moritzm> |
imported php-wmerrors, tideways, tideways-xhprof, wikidiff2, xdebug to component/icu63 T264991 |
[production] |
08:34 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
08:28 |
<elukey@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
08:09 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
08:09 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
07:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 100%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12988 and previous config saved to /var/cache/conftool/dbconfig/20201014-071440-root.json |
[production] |
06:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 75%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12987 and previous config saved to /var/cache/conftool/dbconfig/20201014-065936-root.json |
[production] |
06:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 50%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12986 and previous config saved to /var/cache/conftool/dbconfig/20201014-064433-root.json |
[production] |
06:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 40%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12985 and previous config saved to /var/cache/conftool/dbconfig/20201014-062930-root.json |
[production] |
06:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 20%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12984 and previous config saved to /var/cache/conftool/dbconfig/20201014-061426-root.json |
[production] |
06:12 |
<marostegui> |
Change UNIQUE into KEY on enwikivoyage.imagelinks T265445 |
[production] |
05:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 30%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12983 and previous config saved to /var/cache/conftool/dbconfig/20201014-055923-root.json |
[production] |
05:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2125 (re)pooling @ 10%: Slowly repool db2125 after on-site maintenance T260670 ', diff saved to https://phabricator.wikimedia.org/P12982 and previous config saved to /var/cache/conftool/dbconfig/20201014-054420-root.json |
[production] |
2020-10-13
|
23:22 |
<catrope@deploy1001> |
Synchronized php-1.36.0-wmf.13/extensions/GrowthExperiments/: Revert removal of variant A (T265372) (duration: 01m 04s) |
[production] |
23:18 |
<catrope@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Rename GrowthExperiments help desk on ptwiki (T265214) (duration: 01m 04s) |
[production] |
23:14 |
<catrope@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Disable event logging in MediaViewer (T260582) (duration: 01m 04s) |
[production] |
23:07 |
<catrope@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Enable watchlist expiry on frwiki, fawiki, dewiki, cswiki (T264780) (duration: 01m 04s) |
[production] |
21:16 |
<mutante> |
icinga raised a gerrit health alert, but I did not notice an issue myself and the alert was gone at the next check |
[production] |
21:12 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
21:12 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
21:09 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
21:07 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:44 |
<mutante> |
bast1002 - apt-get autoremove - cleans up golang and ruby packages |
[production] |
20:44 |
<mutante> |
bast1002 - apt-get remove nmap (it can be used on netmon hosts and was not consistent with other bast hosts) |
[production] |
20:15 |
<ebernhardson> |
unban elastic2029 from production-search-psi-codfw |
[production] |
20:14 |
<ebernhardson> |
restart production-search-psi-codfw on elastic2029 to reset any wonkiness from gc hell |
[production] |
20:06 |
<marxarelli> |
1.36.0-wmf.13 promoted to group0. no new or concerning errors or changes in error rates (T263179) |
[production] |
20:03 |
<ebernhardson> |
add elastic2029-production-search-psi-codfw to cluster.routing.allocation.exclude._name to drain active shards, instance currently in gc hell |
[production] |