2024-06-25
|
11:56 |
<vgutierrez> |
disable puppet on A:cp-esams before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1049529 - T364383 |
[production] |
11:55 |
<cgoubert@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply |
[production] |
11:53 |
<cgoubert@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-debug: apply |
[production] |
11:45 |
<cgoubert@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mw-debug: apply |
[production] |
11:45 |
<cgoubert@deploy1002> |
helmfile [codfw] START helmfile.d/services/mw-debug: apply |
[production] |
10:40 |
<marostegui> |
m2 dbmaint eqiad Stop db1217:3322 to clone db1228 T368374 |
[production] |
10:12 |
<jmm@deploy1002> |
Finished scap: (no justification provided) (duration: 03m 30s) |
[production] |
10:11 |
<jmm@deploy1002> |
Started scap: (no justification provided) |
[production] |
09:53 |
<cgoubert@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 21 days, 0:00:00 on 11 hosts with reason: Turning down appserver clusters |
[production] |
09:53 |
<cgoubert@cumin1002> |
START - Cookbook sre.hosts.downtime for 21 days, 0:00:00 on 11 hosts with reason: Turning down appserver clusters |
[production] |
09:50 |
<cgoubert@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 21 days, 0:00:00 on 25 hosts with reason: Turning down appserver clusters |
[production] |
09:49 |
<cgoubert@cumin1002> |
START - Cookbook sre.hosts.downtime for 21 days, 0:00:00 on 25 hosts with reason: Turning down appserver clusters |
[production] |
09:44 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db[1217,1228].eqiad.wmnet with reason: Cloning |
[production] |
09:44 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db[1217,1228].eqiad.wmnet with reason: Cloning |
[production] |
09:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Remove db1228 from dbctl T368374', diff saved to https://phabricator.wikimedia.org/P65409 and previous config saved to /var/cache/conftool/dbconfig/20240625-093454-marostegui.json |
[production] |
09:34 |
<slyngs> |
Switching idp-test.wikimedia.org to CAS 7 |
[production] |
09:32 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db1228 T368374', diff saved to https://phabricator.wikimedia.org/P65408 and previous config saved to /var/cache/conftool/dbconfig/20240625-093221-root.json |
[production] |
08:45 |
<jynus@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2025.codfw.wmnet with reason: full dump |
[production] |
08:45 |
<jynus@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2025.codfw.wmnet with reason: full dump |
[production] |
08:32 |
<jynus@cumin1002> |
dbctl commit (dc=all): 'Depool es2025', diff saved to https://phabricator.wikimedia.org/P65407 and previous config saved to /var/cache/conftool/dbconfig/20240625-083216-jynus.json |
[production] |
08:31 |
<jynus@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2022.codfw.wmnet with reason: full dump |
[production] |
08:31 |
<jynus@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2022.codfw.wmnet with reason: full dump |
[production] |
08:26 |
<jynus@cumin1002> |
dbctl commit (dc=all): 'Depool es2022', diff saved to https://phabricator.wikimedia.org/P65406 and previous config saved to /var/cache/conftool/dbconfig/20240625-082649-jynus.json |
[production] |
07:36 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dborch1001.wikimedia.org |
[production] |
07:32 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host dborch1001.wikimedia.org |
[production] |
07:19 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance |
[production] |
07:19 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance |
[production] |
07:18 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364069)', diff saved to https://phabricator.wikimedia.org/P65405 and previous config saved to /var/cache/conftool/dbconfig/20240625-071855-marostegui.json |
[production] |
07:14 |
<marostegui> |
Optimize pagelinks on old s8 codfw master db2165 dbmaint T364069 |
[production] |
07:06 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Long schema change |
[production] |
07:06 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Long schema change |
[production] |
07:03 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P65404 and previous config saved to /var/cache/conftool/dbconfig/20240625-070348-marostegui.json |
[production] |
07:02 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2165 T368355', diff saved to https://phabricator.wikimedia.org/P65403 and previous config saved to /var/cache/conftool/dbconfig/20240625-070252-marostegui.json |
[production] |
07:01 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Promote db2161 to s8 primary T368355', diff saved to https://phabricator.wikimedia.org/P65402 and previous config saved to /var/cache/conftool/dbconfig/20240625-070127-marostegui.json |
[production] |
07:01 |
<marostegui> |
Starting s8 codfw failover from db2165 to db2161 - T368355 |
[production] |
07:00 |
<arnaudb@deploy1002> |
Finished scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] (duration: 07m 47s) |
[production] |
06:55 |
<arnaudb@deploy1002> |
arnaudb: Continuing with sync |
[production] |
06:55 |
<arnaudb@deploy1002> |
arnaudb: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
06:54 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
06:52 |
<arnaudb@deploy1002> |
Started scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] |
[production] |
06:48 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P65401 and previous config saved to /var/cache/conftool/dbconfig/20240625-064841-marostegui.json |
[production] |
06:45 |
<arnaudb@deploy1002> |
Sync cancelled. |
[production] |
06:45 |
<arnaudb@deploy1002> |
arnaudb: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
06:42 |
<arnaudb@deploy1002> |
Started scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] |
[production] |
06:40 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'T368020', diff saved to https://phabricator.wikimedia.org/P65400 and previous config saved to /var/cache/conftool/dbconfig/20240625-064000-arnaudb.json |
[production] |
06:39 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T368355 |
[production] |
06:39 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set db2161 with weight 0 T368355', diff saved to https://phabricator.wikimedia.org/P65399 and previous config saved to /var/cache/conftool/dbconfig/20240625-063908-root.json |
[production] |
06:38 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T368355 |
[production] |
06:34 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Promote es1039 to es7 primary T368020', diff saved to https://phabricator.wikimedia.org/P65398 and previous config saved to /var/cache/conftool/dbconfig/20240625-063453-arnaudb.json |
[production] |
06:33 |
<arnaudb> |
Starting es7 eqiad failover from es1035 to es1039 - T368020 |
[production] |