2022-08-05
|
09:03 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance |
[production] |
00:53 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2001.wikimedia.org with reason: decom, replaced by gerrit2002 |
[production] |
00:53 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on gerrit2001.wikimedia.org with reason: decom, replaced by gerrit2002 |
[production] |
00:53 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit2002.wikimedia.org |
[production] |
00:53 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.remove-downtime for gerrit2002.wikimedia.org |
[production] |
00:52 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002 |
[production] |
00:52 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002 |
[production] |
00:18 |
<mutante> |
restarting gerrit for config change - removing old replica T313250 |
[production] |
2022-08-04
|
23:06 |
<mutante> |
switching gerrit-replica.wikimedia.org to new machine gerrit2002, dropping gerrit-replica-new.wikimedia.org T313250 |
[production] |
21:07 |
<ryankemper@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:59 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
20:57 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
20:57 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
20:56 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
20:56 |
<thcipriani@deploy1002> |
Finished scap: Backport for [[gerrit:819774]] tkwiki: Update wordmark (duration: 06m 12s) |
[production] |
20:51 |
<ryankemper@deploy1002> |
helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:51 |
<ryankemper@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:51 |
<ryankemper@deploy1002> |
helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:50 |
<thcipriani@deploy1002> |
Started scap: Backport for [[gerrit:819774]] tkwiki: Update wordmark |
[production] |
20:48 |
<thcipriani@deploy1002> |
Finished scap: Backport for [[gerrit:812391]] [config]: Add click event logging for mobile and desktop (duration: 39m 16s) |
[production] |
20:45 |
<ryankemper@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:24 |
<ryankemper@deploy1002> |
helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:23 |
<ryankemper@deploy1002> |
helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:22 |
<ryankemper@deploy1002> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:16 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
20:15 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
20:15 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
20:14 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
20:13 |
<ryankemper@deploy1002> |
helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:13 |
<ryankemper@deploy1002> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:10 |
<ryankemper@deploy1002> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:09 |
<ryankemper@deploy1002> |
helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
20:08 |
<thcipriani@deploy1002> |
Started scap: Backport for [[gerrit:812391]] [config]: Add click event logging for mobile and desktop |
[production] |
19:59 |
<ryankemper@deploy1002> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
19:55 |
<dancy@deploy1002> |
rebuilt and synchronized wikiversions files: resync |
[production] |
19:49 |
<mvernon@cumin1001> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for thanos-be2001.codfw.wmnet |
[production] |
19:49 |
<mvernon@cumin1001> |
START - Cookbook sre.hosts.remove-downtime for thanos-be2001.codfw.wmnet |
[production] |
19:44 |
<mvernon@cumin1001> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts |
[production] |
19:44 |
<mvernon@cumin1001> |
START - Cookbook sre.hosts.remove-downtime for 8 hosts |
[production] |
19:42 |
<Emperor> |
rebooting thanos-be2001 to fix drive ordering |
[production] |
19:37 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for elastic2071.codfw.wmnet |
[production] |
19:37 |
<bking@cumin1001> |
START - Cookbook sre.hosts.remove-downtime for elastic2071.codfw.wmnet |
[production] |
19:31 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2071.codfw.wmnet with reason: T310146 |
[production] |
19:31 |
<bking@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2071.codfw.wmnet with reason: T310146 |
[production] |
19:13 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
19:12 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
19:12 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
19:12 |
<ryankemper@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/changeprop: apply |
[production] |
19:11 |
<ryankemper@deploy1002> |
helmfile [eqiad] START helmfile.d/services/changeprop: apply |
[production] |
19:11 |
<dancy> |
There were many errors during php-fpm restart due to failure to contact http://lvs2009:9090/pools/appservers-https_443/mw2361.codfw.wmnet and the like. |
[production] |