2024-06-25
17:02 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt2004-dev.codfw.wmnet with reason: host reimage |
[production] |
17:01 |
<eevans@cumin1002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Apply Cassandra upgrade to 4.1.5 — T354970 - eevans@cumin1002 |
[production] |
16:54 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P65421 and previous config saved to /var/cache/conftool/dbconfig/20240625-165426-marostegui.json |
[production] |
16:49 |
<eevans@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Apply Cassandra upgrade to 4.1.5 — T354970 - eevans@cumin1002 |
[production] |
16:43 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudvirt2004-dev.codfw.wmnet with OS bookworm |
[production] |
16:39 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1172 (T364069)', diff saved to https://phabricator.wikimedia.org/P65420 and previous config saved to /var/cache/conftool/dbconfig/20240625-163919-marostegui.json |
[production] |
16:37 |
<eevans@cumin1002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Apply Cassandra upgrade to 4.1.5 — T354970 - eevans@cumin1002 |
[production] |
16:33 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1035 (re)pooling @ 100%: post T365986 repool', diff saved to https://phabricator.wikimedia.org/P65419 and previous config saved to /var/cache/conftool/dbconfig/20240625-163330-arnaudb.json |
[production] |
16:31 |
<cgoubert@cumin1002> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw1437.eqiad.wmnet |
[production] |
16:31 |
<cgoubert@cumin1002> |
START - Cookbook sre.hosts.remove-downtime for mw1437.eqiad.wmnet |
[production] |
16:27 |
<cgoubert@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mw1437.eqiad.wmnet with reason: Resizing disk |
[production] |
16:27 |
<cgoubert@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on mw1437.eqiad.wmnet with reason: Resizing disk |
[production] |
16:23 |
<bvibber> |
running requeueTranscodes for missing audio files on commons (mwmaint1002) cf T368364 |
[production] |
16:23 |
<claime> |
depooling mw1437 |
[production] |
16:19 |
<claime> |
cleaning up shellbox leftover files on mw1437.eqiad.wmnet |
[production] |
16:19 |
<eevans@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Apply Cassandra upgrade to 4.1.5 — T354970 - eevans@cumin1002 |
[production] |
16:18 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1035 (re)pooling @ 75%: post T365986 repool', diff saved to https://phabricator.wikimedia.org/P65418 and previous config saved to /var/cache/conftool/dbconfig/20240625-161824-arnaudb.json |
[production] |
16:15 |
<claime> |
Extending vg-srv on mw1437 |
[production] |
16:10 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@72ad841]: deploy phab1004 for T368392 - followup T364728 (duration: 00m 39s) |
[production] |
16:10 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@72ad841]: deploy phab1004 for T368392 - followup T364728 |
[production] |
16:09 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@72ad841]: deploy phab2002 for T368392 - followup T364728 (duration: 00m 33s) |
[production] |
16:08 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@72ad841]: deploy phab2002 for T368392 - followup T364728 |
[production] |
16:05 |
<brennen> |
silencing phabricator hosts prior to deploy |
[production] |
16:03 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1035 (re)pooling @ 50%: post T365986 repool', diff saved to https://phabricator.wikimedia.org/P65417 and previous config saved to /var/cache/conftool/dbconfig/20240625-160318-arnaudb.json |
[production] |
15:33 |
<eevans@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Apply Cassandra upgrade to 4.1.5 — T354970 - eevans@cumin1002 |
[production] |
15:33 |
<eevans@cumin1002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching aqs[1011-1021].eqiad.wmnet: Apply Cassandra upgrade to 4.1.5 — T354970 - eevans@cumin1002 |
[production] |
15:33 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1035 (re)pooling @ 10%: post T365986 repool', diff saved to https://phabricator.wikimedia.org/P65415 and previous config saved to /var/cache/conftool/dbconfig/20240625-153307-arnaudb.json |
[production] |
15:31 |
<Dreamy_Jazz> |
Ran `mwscript extensions/CheckUser/maintenance/deleteReadOldRowsInCuChanges.php --wiki=testwiki` for T366781 |
[production] |
15:22 |
<cgoubert@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply |
[production] |
15:21 |
<cgoubert@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply |
[production] |
15:21 |
<cgoubert@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply |
[production] |
15:20 |
<claime> |
Deploying statsd to mw-api-ext - T365265 |
[production] |
15:19 |
<cgoubert@deploy1002> |
helmfile [codfw] START helmfile.d/services/mw-api-ext: apply |
[production] |
15:18 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1035 (re)pooling @ 5%: post T365986 repool', diff saved to https://phabricator.wikimedia.org/P65414 and previous config saved to /var/cache/conftool/dbconfig/20240625-151802-arnaudb.json |
[production] |
15:06 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@f58dd50]: deploy phab1004 for T368392 (duration: 00m 50s) |
[production] |
15:05 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@f58dd50]: deploy phab1004 for T368392 |
[production] |
15:05 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@f58dd50]: deploy phab2002 for T368392 (duration: 00m 33s) |
[production] |
15:04 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@f58dd50]: deploy phab2002 for T368392 |
[production] |
15:03 |
<jelto@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update |
[production] |
15:03 |
<jelto@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator/Phorge update |
[production] |
15:03 |
<jelto@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update |
[production] |
15:02 |
<jelto@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator/Phorge update |
[production] |
15:00 |
<topranks> |
rebooting lsw1-e5-eqiad to upgrade JunOS on switch T365986 |
[production] |
14:58 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on 7 hosts with reason: JunOS upgrade lsw1-e5-eqiad |
[production] |
14:58 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:40:00 on 7 hosts with reason: JunOS upgrade lsw1-e5-eqiad |
[production] |
14:57 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lsw1-e5-eqiad,lsw1-e5-eqiad IPv6,ssw1-e1-eqiad.mgmt,ssw1-f1-eqiad.mgmt with reason: JunOS upgrade lsw1-e5-eqiad |
[production] |
14:57 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:40:00 on lsw1-e5-eqiad,lsw1-e5-eqiad IPv6,ssw1-e1-eqiad.mgmt,ssw1-f1-eqiad.mgmt with reason: JunOS upgrade lsw1-e5-eqiad |
[production] |
14:56 |
<cdanis@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-web: apply |
[production] |
14:56 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:45:00 on es1035.eqiad.wmnet with reason: T365986 |
[production] |
14:56 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:45:00 on es1035.eqiad.wmnet with reason: T365986 |
[production] |