2021-04-08
§
|
18:23 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[2410-2411].codfw.wmnet with reason: new_install |
[production] |
18:23 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw[2410-2411].codfw.wmnet with reason: new_install |
[production] |
18:22 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: new_install |
[production] |
18:22 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: new_install |
[production] |
18:03 |
<mutante> |
mw2403 through mw2411 - new hardware moving into production, not pooled yet, initial puppet run, being added to icinga etc, creating mcrouter certs for them (T279599) |
[production] |
18:02 |
<mutante> |
mw2403 through mw2401 - new hardware moving into production, not pooled yet, initial puppet run, being added to icinga etc, creating mcrouter certs for them (T279599) |
[production] |
17:59 |
<tgr@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' . |
[production] |
17:52 |
<ryankemper@cumin2001> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) |
[production] |
17:29 |
<jgiannelos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:23 |
<jgiannelos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:18 |
<jgiannelos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
17:16 |
<dancy> |
Scap 3.17.0 deployed to beta cluster |
[production] |
16:51 |
<dancy> |
testing Scap 3.17.0 release on deployment-deploy01 |
[production] |
16:33 |
<elukey> |
reboot an-worker1100 again to check if all the disks come up correctly |
[production] |
16:16 |
<cmjohnson1> |
update bios cp1087, already depooled for h/w issues T278729 |
[production] |
16:15 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1025.eqiad.wmnet with reason: REIMAGE |
[production] |
16:13 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1025.eqiad.wmnet with reason: REIMAGE |
[production] |
16:10 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:05 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
15:51 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:44 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
15:36 |
<elukey> |
reboot an-worker1100 to see if it helps with the strange BBU behavior |
[production] |
13:55 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon2001-dev.codfw.wmnet |
[production] |
13:44 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts cloudcephmon2001-dev.codfw.wmnet |
[production] |
13:41 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2001.codfw.wmnet with reason: REIMAGE |
[production] |
13:39 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on parse2001.codfw.wmnet with reason: REIMAGE |
[production] |
13:24 |
<moritzm> |
installing groff bugfix updates from Buster point release |
[production] |
12:49 |
<ema> |
cp5001: varnish-frontend-restart to test exp policy settings starting from an empty cache T275809 |
[production] |
12:44 |
<moritzm> |
installing libbsd security updates for Buster |
[production] |
12:39 |
<moritzm> |
installing xcftools security updates |
[production] |
12:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15264 and previous config saved to /var/cache/conftool/dbconfig/20210408-123137-root.json |
[production] |
12:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15263 and previous config saved to /var/cache/conftool/dbconfig/20210408-121633-root.json |
[production] |
12:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15262 and previous config saved to /var/cache/conftool/dbconfig/20210408-120128-root.json |
[production] |
11:58 |
<XioNoX> |
tighten all routers loopback firewall filter - T207799 |
[production] |
11:57 |
<zpapierski@deploy1002> |
Finished deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix (duration: 00m 09s) |
[production] |
11:57 |
<zpapierski@deploy1002> |
Started deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix |
[production] |
11:50 |
<XioNoX> |
tighten cr3-ulsfo loopback firewall filter - T207799 |
[production] |
11:49 |
<zpapierski@deploy1002> |
Finished deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix (duration: 01m 39s) |
[production] |
11:47 |
<zpapierski@deploy1002> |
Started deploy [wikimedia/discovery/analytics@25dad72]: T273847 export queries to relforge dag deployment - elastic index name fix |
[production] |
11:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: Repool after schema change', diff saved to https://phabricator.wikimedia.org/P15261 and previous config saved to /var/cache/conftool/dbconfig/20210408-114625-root.json |
[production] |
11:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15259 and previous config saved to /var/cache/conftool/dbconfig/20210408-112332-root.json |
[production] |
11:09 |
<filippo@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2028.codfw.wmnet |
[production] |
11:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15258 and previous config saved to /var/cache/conftool/dbconfig/20210408-110828-root.json |
[production] |
11:07 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: de1670cbd2c59a24f1e29a6d3731e3ac7f39d336: Enable Growth for newcomers on simplewiki, mswiki, tawiki (T278369; T277562; T277550) (duration: 01m 07s) |
[production] |
10:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15257 and previous config saved to /var/cache/conftool/dbconfig/20210408-105324-root.json |
[production] |
10:47 |
<effie> |
disable puppet on parsoid* servers |
[production] |
10:41 |
<XioNoX> |
enable sampling on all routers FPCs |
[production] |
10:40 |
<marostegui> |
Upgrade db2085's kernel |
[production] |
10:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Repool db1118 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15256 and previous config saved to /var/cache/conftool/dbconfig/20210408-103821-root.json |
[production] |
10:37 |
<mvolz@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' . |
[production] |