2025-06-04
§
|
13:50 |
<cgoubert@deploy1003> |
helmfile [eqiad] START helmfile.d/services/mw-cron: apply |
[production] |
13:49 |
<sukhe> |
sudo cumin -b1 -s15 'A:cp' 'run-puppet-agent --enable "merging CR 1114074"': T288106 |
[production] |
13:48 |
<vgutierrez@cumin1002> |
END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs1013.eqiad.wmnet} and A:liberica |
[production] |
13:48 |
<vgutierrez@cumin1002> |
START - Cookbook sre.loadbalancer.admin config_reloading P{lvs1013.eqiad.wmnet} and A:liberica |
[production] |
13:47 |
<tappof@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet |
[production] |
13:46 |
<sukhe> |
forcing ats-backend-restart on cp1104 |
[production] |
13:43 |
<jclark@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1095.eqiad.wmnet with reason: host reimage |
[production] |
13:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2217 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P77061 and previous config saved to /var/cache/conftool/dbconfig/20250604-134336-root.json |
[production] |
13:43 |
<tappof@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet |
[production] |
13:41 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P77060 and previous config saved to /var/cache/conftool/dbconfig/20250604-134158-fceratto.json |
[production] |
13:41 |
<tappof@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host grafana2001.codfw.wmnet |
[production] |
13:41 |
<tappof@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet |
[production] |
13:40 |
<tappof@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host grafana2001.codfw.wmnet |
[production] |
13:40 |
<tappof@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet |
[production] |
13:40 |
<samtar@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1153623|IS: Undo turning on wgTemplateDataEnableCategoryBrowser for mw.org (T377975)]] (duration: 09m 57s) |
[production] |
13:39 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1095.eqiad.wmnet with reason: host reimage |
[production] |
13:38 |
<tappof@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host grafana2001.codfw.wmnet |
[production] |
13:38 |
<tappof@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet |
[production] |
13:37 |
<sukhe> |
forcing agent run on cp2037 (non-single BE node): CR 1114074 |
[production] |
13:37 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.reimage for host ms-be1094.eqiad.wmnet with OS bullseye |
[production] |
13:33 |
<samtar@deploy1003> |
samtar: Continuing with sync |
[production] |
13:32 |
<samtar@deploy1003> |
samtar: Backport for [[gerrit:1153623|IS: Undo turning on wgTemplateDataEnableCategoryBrowser for mw.org (T377975)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
13:30 |
<sukhe> |
forcing agent run on cp7001 (single BE node): CR 1114074 |
[production] |
13:30 |
<samtar@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1153623|IS: Undo turning on wgTemplateDataEnableCategoryBrowser for mw.org (T377975)]] |
[production] |
13:29 |
<sukhe> |
forcing agent run on cp6015: CR 1114074 |
[production] |
13:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2217 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P77059 and previous config saved to /var/cache/conftool/dbconfig/20250604-132829-root.json |
[production] |
13:27 |
<vgutierrez@cumin1002> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1013.eqiad.wmnet |
[production] |
13:27 |
<vgutierrez@cumin1002> |
START - Cookbook sre.hosts.remove-downtime for lvs1013.eqiad.wmnet |
[production] |
13:26 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1209 (T395241)', diff saved to https://phabricator.wikimedia.org/P77058 and previous config saved to /var/cache/conftool/dbconfig/20250604-132648-fceratto.json |
[production] |
13:23 |
<sukhe> |
starting removal of ats-be service from eqiad, eqsin, esams, magru, ulsfo: T288106 |
[production] |
13:21 |
<sukhe> |
sudo cumin 'A:cp' 'disable-puppet "merging CR 1114074"' |
[production] |
13:20 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.reimage for host ms-be1095.eqiad.wmnet with OS bullseye |
[production] |
13:18 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1209 (T395241)', diff saved to https://phabricator.wikimedia.org/P77057 and previous config saved to /var/cache/conftool/dbconfig/20250604-131852-fceratto.json |
[production] |
13:18 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1209.eqiad.wmnet with reason: Maintenance |
[production] |
13:18 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203 (T395241)', diff saved to https://phabricator.wikimedia.org/P77056 and previous config saved to /var/cache/conftool/dbconfig/20250604-131827-fceratto.json |
[production] |
13:14 |
<jforrester@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1146628|release CampaignEvents to cbk-zam wiki (T393604)]], [[gerrit:1153385|Bump portals to the 2025-06-02 09:23:11+00:00 build (T128546)]], [[gerrit:1151781|build: Rename the rarely-used 'typos' script to 'checkTypos']], [[gerrit:1151751|Drop Chart roll-out dblists, no longer needed (T383079)]] (duration: 10m 29s) |
[production] |
13:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2217 (re)pooling @ 60%: Repooling', diff saved to https://phabricator.wikimedia.org/P77055 and previous config saved to /var/cache/conftool/dbconfig/20250604-131323-root.json |
[production] |
13:11 |
<jiji@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply |
[production] |
13:11 |
<jiji@deploy1003> |
helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply |
[production] |
13:08 |
<jiji@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply |
[production] |
13:07 |
<jforrester@deploy1003> |
jforrester, mhorsey: Continuing with sync |
[production] |
13:06 |
<jforrester@deploy1003> |
jforrester, mhorsey: Backport for [[gerrit:1146628|release CampaignEvents to cbk-zam wiki (T393604)]], [[gerrit:1153385|Bump portals to the 2025-06-02 09:23:11+00:00 build (T128546)]], [[gerrit:1151781|build: Rename the rarely-used 'typos' script to 'checkTypos']], [[gerrit:1151751|Drop Chart roll-out dblists, no longer needed (T383079)]] synced to the testservers (see https://wikitech.wikimedia |
[production] |
13:04 |
<jmm@cumin1003> |
END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti7001.magru.wmnet to cluster magru03 and group B |
[production] |
13:04 |
<jforrester@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1146628|release CampaignEvents to cbk-zam wiki (T393604)]], [[gerrit:1153385|Bump portals to the 2025-06-02 09:23:11+00:00 build (T128546)]], [[gerrit:1151781|build: Rename the rarely-used 'typos' script to 'checkTypos']], [[gerrit:1151751|Drop Chart roll-out dblists, no longer needed (T383079)]] |
[production] |
13:03 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P77054 and previous config saved to /var/cache/conftool/dbconfig/20250604-130319-fceratto.json |
[production] |
13:03 |
<jmm@cumin1003> |
START - Cookbook sre.ganeti.addnode for new host ganeti7001.magru.wmnet to cluster magru03 and group B |
[production] |
13:02 |
<sbassett@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/miscweb: apply |
[production] |
13:02 |
<sbassett@deploy1003> |
helmfile [eqiad] START helmfile.d/services/miscweb: apply |
[production] |
13:02 |
<sbassett@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/miscweb: apply |
[production] |
13:01 |
<sbassett@deploy1003> |
helmfile [codfw] START helmfile.d/services/miscweb: apply |
[production] |