2022-06-07
18:46 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
18:40 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
18:40 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
18:33 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
18:08 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
18:07 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
18:07 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
18:06 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
18:06 |
<dduvall@deploy1002> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.15 refs T308068 |
[production] |
18:01 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
17:58 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
17:58 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
17:51 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
17:45 |
<dduvall@deploy1002> |
Pruned MediaWiki: 1.39.0-wmf.13 (duration: 01m 49s) |
[production] |
17:43 |
<dduvall@deploy1002> |
Finished scap: testwikis wikis to 1.39.0-wmf.15 refs T308068 (duration: 30m 22s) |
[production] |
17:37 |
<pt1979@cumin1001> |
START - Cookbook sre.hosts.dhcp for host clouddumps1001.wikimedia.org |
[production] |
17:36 |
<pt1979@cumin1001> |
END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host clouddumps1001.wikimedia.org |
[production] |
17:36 |
<pt1979@cumin1001> |
START - Cookbook sre.hosts.dhcp for host clouddumps1001.wikimedia.org |
[production] |
17:29 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM webperf2004.codfw.wmnet |
[production] |
17:21 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.reboot-vm for VM webperf2004.codfw.wmnet |
[production] |
17:13 |
<dduvall@deploy1002> |
Started scap: testwikis wikis to 1.39.0-wmf.15 refs T308068 |
[production] |
16:56 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
16:50 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
16:50 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
16:43 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
16:32 |
<dduvall> |
scap deploy-promote testwikis failed at invocation of logstash_checker.py ("logstash_checker.py: error: argument --delay: invalid int value: '40.406498670578'") T308068 |
[production] |
16:21 |
<dduvall@deploy1002> |
scap failed: RuntimeError scap failed: average error rate on 8/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details) (duration: 14m 27s) |
[production] |
16:21 |
<dduvall@deploy1002> |
scap failed: average error rate on 8/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details) |
[production] |
16:18 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
16:17 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
16:17 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
16:16 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
16:11 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
16:08 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
16:08 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
16:06 |
<dduvall@deploy1002> |
Started scap: testwikis wikis to 1.39.0-wmf.15 refs T308068 |
[production] |
16:06 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
15:08 |
<volans@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:57 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster |
[production] |
14:52 |
<volans> |
upgrading spicerack to v2.6.0 on cumin2002 |
[production] |
14:50 |
<volans> |
uploaded spicerack_2.6.0 to apt.wikimedia.org bullseye-wikimedia |
[production] |
14:48 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons. |
[production] |
14:45 |
<moritzm> |
adding additional disk for /srv to webperf2004 T305460 |
[production] |
14:43 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage |
[production] |
14:41 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage |
[production] |
14:30 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster |
[production] |
14:02 |
<ayounsi@cumin1001> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
14:02 |
<ayounsi@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
14:02 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox-exports.discovery.wmnet on all recursors |
[production] |
14:02 |
<jbond@cumin1001> |
START - Cookbook sre.dns.wipe-cache netbox-exports.discovery.wmnet on all recursors |
[production] |