2021-08-04
§
|
16:23 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2355.codfw.wmnet with reason: REIMAGE |
[production] |
16:23 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2357.codfw.wmnet with reason: reimage |
[production] |
16:22 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw2357.codfw.wmnet with reason: reimage |
[production] |
16:22 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2353.codfw.wmnet with reason: reimage |
[production] |
16:22 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw2353.codfw.wmnet with reason: reimage |
[production] |
16:21 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on mw2353.codfw.wmnet with reason: reimage |
[production] |
16:21 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw2353.codfw.wmnet with reason: reimage |
[production] |
16:21 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2353.codfw.wmnet with reason: REIMAGE |
[production] |
16:21 |
<joe> |
find . -type f -delete on /var/cache/nginx-docker-registry on registry2*, the disk is too small for unbound cache *and* accepting large uploads |
[production] |
16:20 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2355.codfw.wmnet with reason: REIMAGE |
[production] |
16:19 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2351.codfw.wmnet with reason: REIMAGE |
[production] |
16:18 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2353.codfw.wmnet with reason: REIMAGE |
[production] |
16:16 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2351.codfw.wmnet with reason: REIMAGE |
[production] |
16:15 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1008.eqiad.wmnet with reason: Rebuilding as buster replica of maps1009 |
[production] |
16:15 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps1008.eqiad.wmnet with reason: Rebuilding as buster replica of maps1009 |
[production] |
16:14 |
<hnowlan> |
draining maps1008 from cassandra cluster |
[production] |
16:13 |
<hnowlan@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=maps1008.eqiad.wmnet |
[production] |
16:02 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2357.codfw.wmnet with reason: reimage |
[production] |
16:02 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw2357.codfw.wmnet with reason: reimage |
[production] |
16:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2380.codfw.wmnet with reason: reimage |
[production] |
16:01 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw2380.codfw.wmnet with reason: reimage |
[production] |
16:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw[2377-2379].codfw.wmnet with reason: reimage |
[production] |
16:01 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw[2377-2379].codfw.wmnet with reason: reimage |
[production] |
15:58 |
<mutante> |
mw2351, mw2353, mw2355, mw2357 - converting from appserver to jobrunner, mw2377, mw2378, mw2379, mw2380 - converting from jobrunner to appserver - for balancing of server types over rows |
[production] |
15:51 |
<dzahn@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw2380.codfw.wmnet |
[production] |
15:50 |
<dzahn@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw237[789].codfw.wmnet |
[production] |
15:48 |
<dzahn@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw235[1357].codfw.wmnet |
[production] |
15:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw235[1357].wmnet |
[production] |
14:30 |
<godog> |
upgrade prometheus on cloudmetrics hosts - T222113 |
[production] |
14:28 |
<godog> |
upgrade prometheus on prometheus4001 - T222113 |
[production] |
14:19 |
<moritzm> |
imported gitlab-ce 13.12.9 to thirdparty/gitlab T287671 |
[production] |
14:18 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
14:17 |
<godog> |
depool prometheus2004 and pool prometheus2003 - T222113 |
[production] |
14:13 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 18 hosts with reason: Firmware upgrade on db1104 (s8 primary) T286226 |
[production] |
14:13 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 18 hosts with reason: Firmware upgrade on db1104 (s8 primary) T286226 |
[production] |
14:12 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
14:02 |
<dcausse@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . |
[production] |
13:55 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
13:50 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 5d7255c1127f951da59b9b48749fe9cf59e11930: jvwikisource: Add author namespace (T286241) (duration: 01m 06s) |
[production] |
13:49 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
13:32 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
13:21 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
13:19 |
<urbanecm> |
jvwikisource was created (T286241) |
[production] |
13:19 |
<urbanecm@deploy1002> |
Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 03m 11s) |
[production] |
13:18 |
<volans> |
upgraded python3-wmflib to v0.0.9 fleet wide |
[production] |
13:15 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Creating jvwikisource (T286241) (duration: 01m 06s) |
[production] |
13:14 |
<urbanecm@deploy1002> |
Synchronized wmf-config/logos.php: Creating jvwikisource (T286241) (duration: 01m 06s) |
[production] |
13:10 |
<urbanecm@deploy1002> |
Synchronized static/images/project-logos/: Creating jvwikisource (T286241) (duration: 01m 07s) |
[production] |
13:09 |
<urbanecm@deploy1002> |
rebuilt and synchronized wikiversions files: Creating jvwikisource (T286241) |
[production] |
13:08 |
<urbanecm@deploy1002> |
Synchronized dblists: Creating jvwikisource (T286241) (duration: 01m 07s) |
[production] |