2024-06-06
§
|
20:19 |
<swfrench@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/data-gateway: apply |
[production] |
20:18 |
<swfrench@deploy1002> |
helmfile [codfw] START helmfile.d/services/data-gateway: apply |
[production] |
20:13 |
<swfrench@deploy1002> |
helmfile [staging] DONE helmfile.d/services/data-gateway: apply |
[production] |
20:13 |
<swfrench@deploy1002> |
helmfile [staging] START helmfile.d/services/data-gateway: apply |
[production] |
20:11 |
<urbanecm@deploy1002> |
wargo and urbanecm and jsn and kgraessle: Continuing with sync |
[production] |
20:08 |
<urbanecm@deploy1002> |
wargo and urbanecm and jsn and kgraessle: Backport for [[gerrit:1031174|Assign applychangetags right to group "all" on plwiktionary (T363638)]], [[gerrit:1038886|InitialiseSettings: Enable AutoModerator on trwiki (T362622)]], [[gerrit:1038388|InitaliseSettings-labs: Deploy Automoderator patroller workstream survey to cawiki (T362969)]] synced to the testservers (https://wikitech.wikimedia.org/wiki |
[production] |
20:06 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1031174|Assign applychangetags right to group "all" on plwiktionary (T363638)]], [[gerrit:1038886|InitialiseSettings: Enable AutoModerator on trwiki (T362622)]], [[gerrit:1038388|InitaliseSettings-labs: Deploy Automoderator patroller workstream survey to cawiki (T362969)]] |
[production] |
20:02 |
<ryankemper@cumin2002> |
START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 |
[production] |
19:31 |
<xcollazo@deploy1002> |
Finished deploy [airflow-dags/analytics@a8843e6]: Deploying latest DAGs to the analytics Airflow instance. T358707. (duration: 00m 26s) |
[production] |
19:30 |
<xcollazo@deploy1002> |
Started deploy [airflow-dags/analytics@a8843e6]: Deploying latest DAGs to the analytics Airflow instance. T358707. |
[production] |
18:29 |
<dduvall@deploy1002> |
rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.8 refs T361402 |
[production] |
18:17 |
<thcipriani@deploy1002> |
Finished deploy [releng/jenkins-deploy@3be9893] (releasing): (no justification provided) (duration: 00m 43s) |
[production] |
18:17 |
<thcipriani@deploy1002> |
Started deploy [releng/jenkins-deploy@3be9893] (releasing): (no justification provided) |
[production] |
17:57 |
<kamila@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye |
[production] |
17:57 |
<kamila@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - kamila@cumin1002" |
[production] |
17:56 |
<kamila@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - kamila@cumin1002" |
[production] |
17:48 |
<topranks> |
re-enabling pybal on lvs1017 after cable move T366361 |
[production] |
17:31 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1247 (T364069)', diff saved to https://phabricator.wikimedia.org/P64211 and previous config saved to /var/cache/conftool/dbconfig/20240606-173121-marostegui.json |
[production] |
17:31 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance |
[production] |
17:31 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance |
[production] |
17:26 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on lvs1017.eqiad.wmnet with reason: moving lvs1017 link back to ssw1-e1-codfw |
[production] |
17:26 |
<topranks> |
disabling pybal on lvs1017 to move traffic to lvs1020 in advance of cable move T366361 |
[production] |
17:26 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:20:00 on lvs1017.eqiad.wmnet with reason: moving lvs1017 link back to ssw1-e1-codfw |
[production] |
17:23 |
<topranks> |
re-enabling pybal on lvs1018 after cable move T366361 |
[production] |
17:15 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on lvs1018.eqiad.wmnet with reason: moving lvs1018 link back to ssw1-e1-codfw |
[production] |
17:15 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:20:00 on lvs1018.eqiad.wmnet with reason: moving lvs1018 link back to ssw1-e1-codfw |
[production] |
17:15 |
<cmooney@cumin1002> |
END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 0:20:00 on lvs1019.eqiad.wmnet with reason: moving lvs1018 link back to ssw1-e1-codfw |
[production] |
17:14 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:20:00 on lvs1019.eqiad.wmnet with reason: moving lvs1018 link back to ssw1-e1-codfw |
[production] |
17:14 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db1186 (T352010)', diff saved to https://phabricator.wikimedia.org/P64210 and previous config saved to /var/cache/conftool/dbconfig/20240606-171359-ladsgroup.json |
[production] |
17:13 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance |
[production] |
17:13 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance |
[production] |
17:13 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1184 (T352010)', diff saved to https://phabricator.wikimedia.org/P64209 and previous config saved to /var/cache/conftool/dbconfig/20240606-171336-ladsgroup.json |
[production] |
17:11 |
<topranks> |
disabling pybal on lvs1018 to move traffic to lvs1020 in advance of cable move T366361 |
[production] |
17:11 |
<topranks> |
re-enabling pybal on lvs1019 after cable move T366361 |
[production] |
16:58 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P64208 and previous config saved to /var/cache/conftool/dbconfig/20240606-165828-ladsgroup.json |
[production] |
16:52 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on lvs1019.eqiad.wmnet with reason: moving lvs1019 link back to ssw1-f1-codfw |
[production] |
16:51 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:20:00 on lvs1019.eqiad.wmnet with reason: moving lvs1019 link back to ssw1-f1-codfw |
[production] |
16:50 |
<topranks> |
disabling pybal on lvs1019 to move traffic to lvs1020 in advance of cable move T366361 |
[production] |
16:43 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P64207 and previous config saved to /var/cache/conftool/dbconfig/20240606-164320-ladsgroup.json |
[production] |
16:28 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1184 (T352010)', diff saved to https://phabricator.wikimedia.org/P64206 and previous config saved to /var/cache/conftool/dbconfig/20240606-162812-ladsgroup.json |
[production] |
16:28 |
<hashar@deploy1002> |
Finished deploy [integration/docroot@eee90e6]: (no justification provided) (duration: 00m 05s) |
[production] |
16:28 |
<hashar@deploy1002> |
Started deploy [integration/docroot@eee90e6]: (no justification provided) |
[production] |
16:25 |
<dancy@deploy1002> |
Installation of scap version "4.86.1" completed for 285 hosts |
[production] |
16:25 |
<dancy@deploy1002> |
Installing scap version "4.86.1" for 285 hosts |
[production] |
16:24 |
<dancy@deploy1002> |
Installing scap version "4.86.1" for 286 hosts |
[production] |
16:13 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db2130 (T352010)', diff saved to https://phabricator.wikimedia.org/P64205 and previous config saved to /var/cache/conftool/dbconfig/20240606-161338-ladsgroup.json |
[production] |
16:13 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance |
[production] |
16:13 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance |
[production] |
16:13 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2116 (T352010)', diff saved to https://phabricator.wikimedia.org/P64204 and previous config saved to /var/cache/conftool/dbconfig/20240606-161312-ladsgroup.json |
[production] |
16:10 |
<kamila@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: reimage still running |
[production] |