2024-07-16
§
|
15:07 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1158 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P66638 and previous config saved to /var/cache/conftool/dbconfig/20240716-150704-root.json |
[production] |
15:06 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-f2-eqiad,lsw1-f2-eqiad IPv6,ssw1-e1-eqiad.mgmt,ssw1-f1-eqiad.mgmt with reason: JunOS upgrade lsw1-f2-eqiad |
[production] |
15:06 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@7335128]: deploy phab1004 for T370109 (duration: 00m 52s) |
[production] |
15:05 |
<godog> |
silence OtelCollectorRefusedSpans in codfw for 7d - T370043 |
[production] |
15:05 |
<godog> |
silence OtelCollectorRefusedSpans in codfw for 7d |
[production] |
15:05 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@7335128]: deploy phab1004 for T370109 |
[production] |
15:04 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@7335128]: test deploy phab2002 for T370109 (duration: 00m 34s) |
[production] |
15:04 |
<jelto@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator/Phorge update |
[production] |
15:04 |
<jelto@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on phab.wmfusercontent.org with reason: Phabricator/Phorge update |
[production] |
15:04 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@7335128]: test deploy phab2002 for T370109 |
[production] |
15:02 |
<jelto@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update |
[production] |
15:02 |
<jelto@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update |
[production] |
15:02 |
<jelto@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update |
[production] |
15:02 |
<jelto@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update |
[production] |
15:01 |
<urbanecm@deploy1002> |
Finished scap: Backport for [[gerrit:1054572|Introduce Vanish Request Flow (T367329 T367726 T367728 T367729 T367744 T368177 T368285 T368368 T368372 T368611 T369489)]], [[gerrit:1054573|Pass wiki id to actor store for cross-db hasPublicLogs query (T370059)]], [[gerrit:1054574|Properly set automatic vanish performer on GlobalRenameUser (T368177)]], [[gerrit:1053373|Enable account vanishing in Centra |
[production] |
15:00 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P66637 and previous config saved to /var/cache/conftool/dbconfig/20240716-150007-arnaudb.json |
[production] |
14:53 |
<urbanecm@deploy1002> |
dbrant, urbanecm: Continuing with sync |
[production] |
14:53 |
<urbanecm@deploy1002> |
dbrant, urbanecm: Backport for [[gerrit:1054572|Introduce Vanish Request Flow (T367329 T367726 T367728 T367729 T367744 T368177 T368285 T368368 T368372 T368611 T369489)]], [[gerrit:1054573|Pass wiki id to actor store for cross-db hasPublicLogs query (T370059)]], [[gerrit:1054574|Properly set automatic vanish performer on GlobalRenameUser (T368177)]], [[gerrit:1053373|Enable account vanishing in Cen |
[production] |
14:53 |
<filippo@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on centrallog2002.codfw.wmnet with reason: network upgrade |
[production] |
14:53 |
<filippo@cumin1002> |
START - Cookbook sre.hosts.downtime for 3:00:00 on centrallog2002.codfw.wmnet with reason: network upgrade |
[production] |
14:51 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1158 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P66636 and previous config saved to /var/cache/conftool/dbconfig/20240716-145159-root.json |
[production] |
14:49 |
<sukhe> |
[durum1001] upgrade anycast-healthchecker to 0.9.8-1+wmf12u1: T370068 |
[production] |
14:46 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:50:00 on lsw1-f2-eqiad.mgmt with reason: prep JunOS upgrade lsw1-f2-eqiad |
[production] |
14:46 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:50:00 on lsw1-f2-eqiad.mgmt with reason: prep JunOS upgrade lsw1-f2-eqiad |
[production] |
14:45 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P66635 and previous config saved to /var/cache/conftool/dbconfig/20240716-144500-arnaudb.json |
[production] |
14:44 |
<sukhe> |
reprepro -C main include bookworm-wikimedia anycast-healthchecker_0.9.8-1+wmf12u1_amd64.changes: T370068 |
[production] |
14:36 |
<cgoubert@cumin1002> |
conftool action : set/pooled=inactive; selector: name=(kubernetes1062.eqiad.wmnet|mw1494.eqiad.wmnet|mw1495.eqiad.wmnet),cluster=kubernetes,service=kubesvc |
[production] |
14:34 |
<claime> |
Cordoning kubernetes1062.eqiad.wmnet mw1494.eqiad.wmnet mw1495.eqiad.wmnet - T365997 |
[production] |
14:33 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[1194,1200-1201].eqiad.wmnet,dbstore1009.eqiad.wmnet with reason: T365997 |
[production] |
14:33 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db[1194,1200-1201].eqiad.wmnet,dbstore1009.eqiad.wmnet with reason: T365997 |
[production] |
14:33 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'T365997 - depool db1194-s7,db1200-s5,db1201-s6', diff saved to https://phabricator.wikimedia.org/P66634 and previous config saved to /var/cache/conftool/dbconfig/20240716-143306-arnaudb.json |
[production] |
14:29 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1212 (T367781)', diff saved to https://phabricator.wikimedia.org/P66633 and previous config saved to /var/cache/conftool/dbconfig/20240716-142953-arnaudb.json |
[production] |
14:25 |
<urbanecm@deploy1002> |
Started scap sync-world: Backport for [[gerrit:1054572|Introduce Vanish Request Flow (T367329 T367726 T367728 T367729 T367744 T368177 T368285 T368368 T368372 T368611 T369489)]], [[gerrit:1054573|Pass wiki id to actor store for cross-db hasPublicLogs query (T370059)]], [[gerrit:1054574|Properly set automatic vanish performer on GlobalRenameUser (T368177)]], [[gerrit:1053373|Enable account vanishing |
[production] |
14:23 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db1212 (T367781)', diff saved to https://phabricator.wikimedia.org/P66632 and previous config saved to /var/cache/conftool/dbconfig/20240716-142321-arnaudb.json |
[production] |
14:23 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
14:22 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
14:22 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1212.eqiad.wmnet with reason: Maintenance |
[production] |
14:22 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1212.eqiad.wmnet with reason: Maintenance |
[production] |
14:20 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1198 (T367781)', diff saved to https://phabricator.wikimedia.org/P66631 and previous config saved to /var/cache/conftool/dbconfig/20240716-142029-arnaudb.json |
[production] |
14:12 |
<jiji@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply |
[production] |
14:11 |
<jiji@deploy1002> |
helmfile [codfw] START helmfile.d/services/mw-api-int: apply |
[production] |
14:10 |
<jiji@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply |
[production] |
14:08 |
<jiji@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-api-int: apply |
[production] |
14:07 |
<jiji@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply |
[production] |
14:07 |
<jiji@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-debug: apply |
[production] |
14:05 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P66630 and previous config saved to /var/cache/conftool/dbconfig/20240716-140522-arnaudb.json |
[production] |
14:03 |
<cgoubert@cumin1002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2432.codfw.wmnet |
[production] |
13:53 |
<cgoubert@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2432.codfw.wmnet |
[production] |
13:50 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P66629 and previous config saved to /var/cache/conftool/dbconfig/20240716-135015-arnaudb.json |
[production] |
13:40 |
<tgr|away> |
UTC afternoon deploys done |
[production] |