2024-05-06
|
06:30 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance |
[production] |
06:30 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS bookworm |
[production] |
06:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db1193', diff saved to https://phabricator.wikimedia.org/P61871 and previous config saved to /var/cache/conftool/dbconfig/20240506-062814-root.json |
[production] |
06:17 |
<sfaci@deploy1002> |
helmfile [staging] DONE helmfile.d/services/editor-analytics: apply |
[production] |
06:17 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS bookworm |
[production] |
06:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2165 T363977', diff saved to https://phabricator.wikimedia.org/P61870 and previous config saved to /var/cache/conftool/dbconfig/20240506-061416-root.json |
[production] |
06:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Promote db2161 to s8 primary T363977', diff saved to https://phabricator.wikimedia.org/P61869 and previous config saved to /var/cache/conftool/dbconfig/20240506-061311-marostegui.json |
[production] |
06:12 |
<marostegui> |
Starting s8 codfw failover from db2165 to db2161 - T363977 |
[production] |
06:07 |
<sfaci@deploy1002> |
helmfile [staging] START helmfile.d/services/editor-analytics: apply |
[production] |
05:50 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977 |
[production] |
05:50 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set db2161 with weight 0 T363977', diff saved to https://phabricator.wikimedia.org/P61868 and previous config saved to /var/cache/conftool/dbconfig/20240506-055013-root.json |
[production] |
05:50 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977 |
[production] |
05:26 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance |
[production] |
05:26 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance |
[production] |
2024-05-05
|
11:09 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271 (duration: 00m 32s) |
[production] |
11:08 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271 |
[production] |
11:08 |
<brennen@deploy1002> |
Finished deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271 (duration: 00m 32s) |
[production] |
11:07 |
<brennen@deploy1002> |
Started deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271 |
[production] |
11:04 |
<taavi@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things |
[production] |
11:03 |
<taavi@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things |
[production] |
11:03 |
<taavi@cumin1002> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things |
[production] |
11:03 |
<taavi@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things |
[production] |
11:03 |
<taavi@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things |
[production] |
11:03 |
<taavi@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things |
[production] |
08:42 |
<taavi> |
taavi@gerrit1003 ~ $ sudo systemctl restart apache2 |
[production] |
2024-05-03
|
21:38 |
<ryankemper@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920 |
[production] |
21:38 |
<ryankemper@cumin2002> |
START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920 |
[production] |
21:27 |
<ryankemper> |
T362920 [wdqs] Depooled `wdqs2023` in preparation to switch it to a graph split host |
[production] |
19:02 |
<sukhe> |
cleaning up stale confd template files for magru related reimaging |
[production] |
18:44 |
<brett@cumin2002> |
conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx |
[production] |
18:43 |
<brett@cumin2002> |
conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx |
[production] |
18:38 |
<brett@cumin2002> |
conftool action : set/pooled=no; selector: name=ncredir7001.magru.wmnet,service=nginx |
[production] |
18:38 |
<brett@cumin2002> |
conftool action : set/pooled=no; selector: name=ncredir7002.magru.wmnet,service=nginx |
[production] |
18:29 |
<brett@cumin2002> |
conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx |
[production] |
18:29 |
<brett@cumin2002> |
conftool action : set/weight=1; selector: name=ncredir7002.magru.wmnet,service=nginx |
[production] |
18:29 |
<brett@cumin2002> |
conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx |
[production] |
18:28 |
<brett@cumin2002> |
conftool action : set/weight=1; selector: name=ncredir7001.magru.wmnet,service=nginx |
[production] |
17:45 |
<dcausse> |
repooling wdqs1012 |
[production] |
17:27 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance |
[production] |
17:27 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance |
[production] |
17:14 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir7002.magru.wmnet |
[production] |
17:14 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7002.magru.wmnet with OS bookworm |
[production] |