2020-11-19
23:50 <robh@cumin1001> START - Cookbook sre.hosts.downtime [production]
23:23 <robh@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
23:21 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
23:19 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
23:18 <robh@cumin1001> START - Cookbook sre.hosts.downtime [production]
23:18 <robh@cumin1001> START - Cookbook sre.hosts.downtime [production]
23:17 <robh@cumin1001> START - Cookbook sre.hosts.downtime [production]
23:06 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
22:54 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
22:52 <robh@cumin1001> START - Cookbook sre.hosts.downtime [production]
22:23 <razzi@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
22:07 <razzi@cumin1001> START - Cookbook sre.ganeti.makevm [production]
22:06 <krinkle@deploy1001> Synchronized php-1.36.0-wmf.16/includes/filerepo/: T267668 - I1115135ee, and Ic239bb9807 (duration: 01m 07s) [production]
20:19 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
20:17 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
20:12 <herron> upgraded logstash-next to kibana 7.10 [production]
19:23 <otto@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
19:23 <otto@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
19:20 <otto@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
19:20 <otto@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
19:14 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
18:48 <mutante> gerrit1001 - re-enabling puppet after merging gerrit:642086 for T268260 (upstream bug 13701) [production]
18:41 <mutante> gerrit1001 - added RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME} in apache config, reloaded apache to fix redirect issue [production]
18:37 <mutante> gerrit1001 - disabled puppet [production]
18:19 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) [production]
18:07 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) [production]
18:03 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) [production]
17:59 <clarakosi@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' . [production]
17:47 <clarakosi@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' . [production]
17:33 <hashar@deploy1001> Finished deploy [gerrit/gerrit@9d27055]: Upgrade gerrit1001 (primary) to Gerrit 3.2.5 (duration: 00m 09s) [production]
17:33 <hashar@deploy1001> Started deploy [gerrit/gerrit@9d27055]: Upgrade gerrit1001 (primary) to Gerrit 3.2.5 [production]
17:32 <hashar> Upgrading Gerrit to 3.2.5 and restarting it [production]
17:05 <dancy@deploy1001> Synchronized php: group1 wikis to 1.36.0-wmf.16 (duration: 01m 06s) [production]
17:04 <dancy@deploy1001> rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.16 [production]
16:59 <ryankemper> T246345 [wdqs] Data-transfer of new wdqs node `wdqs1012` is complete, beginning transfer of `wdqs1004`->`wdqs1013` (public) and `wdqs1003`->`wdqs1011` (internal). Once these transfers are done `wdqs1012` and `wdqs1013` will need to be pooled and have their weights set to 10 after verifying they're healthy [production]
16:58 <kormat> started mariadb on pc2010, now with more 🤞 [production]
16:58 <ryankemper@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
16:54 <kormat> stopping mariadb on pc2010 [production]
16:54 <ryankemper@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
16:43 <hashar> Restarting Gerrit replica instance on gerrit2001 [production]
16:42 <hashar@deploy1001> Finished deploy [gerrit/gerrit@9d27055]: Upgrade gerrit2001 to Gerrit 3.2.5 (take 2 after rebasing deploy server) (duration: 00m 10s) [production]
16:42 <hashar@deploy1001> Started deploy [gerrit/gerrit@9d27055]: Upgrade gerrit2001 to Gerrit 3.2.5 (take 2 after rebasing deploy server) [production]
16:41 <kormat> stopped and started replication on pc2010 to see if that would help it recover [production]
16:40 <hashar@deploy1001> Finished deploy [gerrit/gerrit@5a41181]: Upgrade gerrit2001 to Gerrit 3.2.5 (duration: 00m 05s) [production]
16:40 <hashar@deploy1001> Started deploy [gerrit/gerrit@5a41181]: Upgrade gerrit2001 to Gerrit 3.2.5 [production]
16:35 <elukey> roll restart hadoop workers for openjdk upgrades [production]
16:35 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
16:06 <elukey@cumin1001> END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) [production]
15:58 <moritzm> installing jupyter-notebook security updates on an-coord* [production]
15:56 <elukey@cumin1001> START - Cookbook sre.presto.roll-restart-workers [production]