2020-08-19
|
18:13 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
18:13 |
<ppchelko@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
17:38 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
17:38 |
<mutante> |
decom'ing releases2001.codfw.wmnet ( |
[production] |
17:37 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
16:39 |
<ppchelko@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
16:37 |
<ppchelko@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
16:32 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
16:30 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
15:41 |
<rzl> |
finished exercising the switchdc cookbooks with --live-test for now, all changes reverted including re-enabling puppet on cumin1001 |
[production] |
15:38 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) |
[production] |
15:37 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-start-maintenance |
[production] |
15:34 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) |
[production] |
15:34 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restore-ttl |
[production] |
15:33 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99) |
[production] |
15:33 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restore-ttl |
[production] |
15:31 |
<jbond42> |
update java.security https://gerrit.wikimedia.org/r/c/operations/puppet/+/593467 |
[production] |
15:30 |
<oblivian@cumin1001> |
conftool action : set/ttl=300; selector: dnsdisc=api-rw |
[production] |
15:26 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99) |
[production] |
15:26 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl |
[production] |
15:22 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99) |
[production] |
15:22 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restore-ttl |
[production] |
15:18 |
<godog> |
prometheus codfw lvextend --resizefs --size +80G /dev/mapper/vg--ssd-prometheus--ops |
[production] |
15:17 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) |
[production] |
15:17 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl |
[production] |
15:16 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99) |
[production] |
15:16 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl |
[production] |
15:14 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) |
[production] |
15:14 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restore-ttl |
[production] |
15:08 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=99) |
[production] |
15:08 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restore-ttl |
[production] |
15:06 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
15:04 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:50 |
<rzl@cumin1001> |
END (FAIL) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=99) |
[production] |
14:50 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl |
[production] |
14:50 |
<rzl> |
running the switchdc cookbooks with --live-test, simulating a switch to eqiad where we're already running, no production impact is expected |
[production] |
14:47 |
<rzl@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) |
[production] |
14:47 |
<rzl@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.00-disable-puppet |
[production] |
14:41 |
<rzl> |
disable puppet on cumin1001 for switchdc testing |
[production] |
14:35 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
14:33 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:27 |
<ppchelko@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
13:38 |
<ppchelko@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
13:34 |
<gehel> |
depooling wdqs1007 and restarting blazegraph |
[production] |
13:29 |
<_joe_> |
depooling and disabling puppet on restbase1024 for further investigation |
[production] |
13:27 |
<ppchelko@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' . |
[production] |
13:26 |
<ppchelko@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' . |
[production] |
13:25 |
<ppchelko@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . |
[production] |
13:10 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
13:03 |
<_joe_> |
building and uploading fluent-bit, ratelimit images |
[production] |