2023-11-02
§
|
15:57 |
<otto@deploy2002> |
helmfile [staging] START helmfile.d/services/eventgate-analytics: apply |
[production] |
15:51 |
<ottomata> |
eventgate-analytics in eqiad: setting service-runner num_workers: 0 to run with one process and reduce # of threads used by container processes. Should reduce throttling and perhaps help with latency. If works, will make this the default in the chart. - T347477 |
[production] |
15:50 |
<otto@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply |
[production] |
15:50 |
<otto@deploy2002> |
helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply |
[production] |
15:48 |
<sukhe> |
sudo cumin 'O:prometheus' 'run-puppet-agent' |
[production] |
15:45 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough |
[production] |
15:40 |
<fabfur> |
cp4037 repooling with changes for dedicated healthcheck backend (haproxy): https://gerrit.wikimedia.org/r/c/operations/puppet/+/966221/ (T348851) |
[production] |
15:34 |
<otto@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply |
[production] |
15:34 |
<otto@deploy2002> |
helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply |
[production] |
15:27 |
<otto@deploy2002> |
helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply |
[production] |
15:26 |
<otto@deploy2002> |
helmfile [staging] START helmfile.d/services/eventgate-analytics: apply |
[production] |
15:17 |
<fabfur> |
cp4037 depooled to be used as canary for https://gerrit.wikimedia.org/r/c/operations/puppet/+/966221/ |
[production] |
15:02 |
<sukhe@cumin2002> |
START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough and A:wikidough |
[production] |
14:56 |
<herron> |
logstash1025 systemctl restart apache2.service T350402 |
[production] |
14:51 |
<sukhe> |
force agent run on A:wikidough |
[production] |
14:45 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: netbox::standalone |
[production] |
14:35 |
<jbond@cumin1001> |
START - Cookbook sre.puppet.migrate-role for role: netbox::standalone |
[production] |
14:32 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: installserver |
[production] |
14:32 |
<hashar> |
Restarting CI Jenkins again for plugins removal |
[production] |
14:15 |
<hashar> |
Restarting CI Jenkins for plugins adjustments |
[production] |
13:50 |
<jbond@cumin1001> |
START - Cookbook sre.puppet.migrate-role for role: installserver |
[production] |
13:43 |
<jayme@deploy2002> |
Finished scap: upgrading ICU67 (duration: 15m 10s) |
[production] |
13:42 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host install6002.wikimedia.org |
[production] |
13:34 |
<sukhe> |
restart pybal on lvs1020 |
[production] |
13:29 |
<jbond@cumin1001> |
START - Cookbook sre.puppet.migrate-host for host install6002.wikimedia.org |
[production] |
13:27 |
<jayme@deploy2002> |
Started scap: upgrading ICU67 |
[production] |
13:27 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: netinsights |
[production] |
13:14 |
<jbond@cumin1001> |
START - Cookbook sre.puppet.migrate-role for role: netinsights |
[production] |
12:59 |
<moritzm> |
upgrading deployment servers to ICU67 T345561 |
[production] |
12:46 |
<jayme> |
running fleet wide php upgrades - T345561 |
[production] |
12:46 |
<jmm@cumin2002> |
END (FAIL) - Cookbook sre.puppet.migrate-role (exit_code=99) for role: ganeti |
[production] |
12:43 |
<daniel@deploy2002> |
Finished scap: Backport for [[gerrit:970764|ParsoidHandler: emit relative URLs in redirects (T350219 T349001)]] (duration: 21m 37s) |
[production] |
12:38 |
<moritzm> |
upgrading snapshot* to ICU67 T345561 |
[production] |
12:37 |
<daniel@deploy2002> |
daniel: Continuing with sync |
[production] |
12:36 |
<daniel@deploy2002> |
daniel: Backport for [[gerrit:970764|ParsoidHandler: emit relative URLs in redirects (T350219 T349001)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
12:31 |
<moritzm> |
upgrading cloudweb to ICU67 T345561 |
[production] |
12:21 |
<daniel@deploy2002> |
Started scap: Backport for [[gerrit:970764|ParsoidHandler: emit relative URLs in redirects (T350219 T349001)]] |
[production] |
12:20 |
<fnegri@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1006.eqiad.wmnet with OS bookworm |
[production] |
12:04 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-role for role: ganeti |
[production] |
11:58 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host netflow6001.drmrs.wmnet |
[production] |
11:54 |
<hnowlan@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply |
[production] |
11:53 |
<hnowlan@deploy2002> |
helmfile [codfw] START helmfile.d/services/rest-gateway: apply |
[production] |
11:53 |
<hnowlan@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply |
[production] |
11:53 |
<hnowlan@deploy2002> |
helmfile [eqiad] START helmfile.d/services/rest-gateway: apply |
[production] |
11:51 |
<hnowlan@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply |
[production] |
11:51 |
<hnowlan@deploy2002> |
helmfile [eqiad] START helmfile.d/services/rest-gateway: apply |
[production] |
11:50 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-ulsfo and not P{cp4037.ulsfo.wmnet} and A:cp |
[production] |
11:49 |
<hnowlan@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply |
[production] |
11:49 |
<hnowlan@deploy2002> |
helmfile [codfw] START helmfile.d/services/rest-gateway: apply |
[production] |
11:49 |
<fnegri@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1006.eqiad.wmnet with reason: host reimage |
[production] |