2021-07-19
§
|
18:27 |
<ryankemper> |
T264053 Deploying fix for timer issue on cloudelastic: `ryankemper@cumin1001:~$ sudo cumin -b 6 'P{cloudelastic*}' 'sudo systemctl stop elasticsearch-disable-readahead.timer && sudo systemctl disable elasticsearch-disable-readahead.timer && rm -fv /etc/systemd/system/elasticsearch-disable-readahead.timer && rm -fv /usr/lib/systemd/system/elasticsearch-disable-readahead.timer && sudo run-puppet-agent'` |
[production] |
18:22 |
<vgutierrez> |
disable puppet & stop pybal on lvs2010 - T286921 |
[production] |
18:20 |
<vgutierrez> |
enabling pybal on lvs2007 - T286921 |
[production] |
18:19 |
<ryankemper> |
T264053 Deploying fix for timer issue: `ryankemper@cumin1001:~$ sudo cumin -b 36 'P{elastic*}' 'sudo systemctl stop elasticsearch-disable-readahead.timer && sudo systemctl disable elasticsearch-disable-readahead.timer && rm -fv /etc/systemd/system/elasticsearch-disable-readahead.timer && rm -fv /usr/lib/systemd/system/elasticsearch-disable-readahead.timer && sudo run-puppet-agent'` |
[production] |
18:14 |
<topranks> |
Running homer to re-enable asw-a2-codfw xe-2/0/45 port [lvs2007] |
[production] |
18:06 |
<dancy@deploy1002> |
Synchronized .pipeline: Config: [[gerrit:705437|pipeline: Perform mergeMessageFileList and rebuildLocalisationCache separately]] (duration: 00m 56s) |
[production] |
17:54 |
<mbsantos@deploy1002> |
Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 22s) |
[production] |
17:54 |
<mbsantos@deploy1002> |
Started deploy [tilerator/deploy@82e5f94]: (no justification provided) |
[production] |
17:53 |
<mbsantos@deploy1002> |
Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 22s) |
[production] |
17:53 |
<mbsantos@deploy1002> |
Started deploy [tilerator/deploy@82e5f94]: (no justification provided) |
[production] |
17:53 |
<mbsantos@deploy1002> |
Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 21s) |
[production] |
17:53 |
<mbsantos@deploy1002> |
Started deploy [tilerator/deploy@82e5f94]: (no justification provided) |
[production] |
17:52 |
<mbsantos@deploy1002> |
Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 15s) |
[production] |
17:52 |
<mbsantos@deploy1002> |
Started deploy [tilerator/deploy@82e5f94]: (no justification provided) |
[production] |
17:52 |
<mbsantos@deploy1002> |
Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 16s) |
[production] |
17:51 |
<mbsantos@deploy1002> |
Started deploy [tilerator/deploy@82e5f94]: (no justification provided) |
[production] |
17:42 |
<ryankemper> |
[Elastic] Noted `Jul 16 18:31:20 elastic2038 elasticsearch[957]: 2021-07-16 18:31:20,657 main ERROR Unknown GELF server hostname:udp:logstash.svc.eqiad.wmnet` in elasticsearch service logs (unit had been running for 2 days) thus the restart of the elasticsearch service |
[production] |
17:41 |
<ryankemper> |
[Elastic] Restarted elasticsearch services on `elastic2038`; afterwards restarted prometheus exporters; no units failed any longer |
[production] |
17:30 |
<volans> |
running puppet on elastic2038 after network was restored |
[production] |
17:26 |
<mbsantos@deploy1002> |
Finished deploy [kartotherian/deploy@978b674]: (no justification provided) (duration: 00m 14s) |
[production] |
17:26 |
<mbsantos@deploy1002> |
Started deploy [kartotherian/deploy@978b674]: (no justification provided) |
[production] |
17:26 |
<mbsantos@deploy1002> |
Finished deploy [kartotherian/deploy@978b674]: (no justification provided) (duration: 00m 16s) |
[production] |
17:25 |
<mbsantos@deploy1002> |
Started deploy [kartotherian/deploy@978b674]: (no justification provided) |
[production] |
17:25 |
<mbsantos@deploy1002> |
Finished deploy [kartotherian/deploy@978b674]: (no justification provided) (duration: 00m 21s) |
[production] |
17:25 |
<mbsantos@deploy1002> |
Started deploy [kartotherian/deploy@978b674]: (no justification provided) |
[production] |
17:24 |
<mbsantos@deploy1002> |
Finished deploy [kartotherian/deploy@978b674]: (no justification provided) (duration: 00m 21s) |
[production] |
17:24 |
<mbsantos@deploy1002> |
Started deploy [kartotherian/deploy@978b674]: (no justification provided) |
[production] |
17:23 |
<mbsantos@deploy1002> |
Finished deploy [kartotherian/deploy@978b674]: (no justification provided) (duration: 00m 21s) |
[production] |
17:23 |
<volans> |
running authdns-update to force-update authdns2001 |
[production] |
17:23 |
<mbsantos@deploy1002> |
Started deploy [kartotherian/deploy@978b674]: (no justification provided) |
[production] |
17:23 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:21 |
<XioNoX> |
remove ns1 redirect - T286787 |
[production] |
17:19 |
<volans@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
17:17 |
<volans@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:14 |
<volans@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
17:13 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[1286-1287].eqiad.wmnet |
[production] |
17:10 |
<XioNoX> |
enable asw-a2-codfw access ports - T286787 |
[production] |
17:04 |
<XioNoX> |
enable cr1-codfw / et-0/0/0 - T286787 |
[production] |
16:54 |
<brennen> |
gerrit up and running with manual configuration edit to use ipv4 address |
[production] |
16:51 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=logstash2021.codfw.wmnet |
[production] |
16:51 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts mw[1286-1287].eqiad.wmnet |
[production] |
16:46 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw1284.eqiad.wmnet |
[production] |
16:40 |
<dancy@deploy1002> |
Finished deploy [gerrit/gerrit@4f29981]: Gerrit to 3.2.11 on gerrit1001 (duration: 00m 08s) |
[production] |
16:40 |
<hashar> |
Upgrading gerrit1001 with dancy & brennen |
[production] |
16:40 |
<dancy@deploy1002> |
Started deploy [gerrit/gerrit@4f29981]: Gerrit to 3.2.11 on gerrit1001 |
[production] |
16:40 |
<XioNoX> |
update asw-a2-codfw serial number - T286787 |
[production] |
16:39 |
<dcausse@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . |
[production] |
16:33 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts mw1284.eqiad.wmnet |
[production] |
16:31 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1001.wikimedia.org with reason: maintenance |
[production] |
16:31 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1001.wikimedia.org with reason: maintenance |
[production] |