2024-01-29
§
|
13:26 |
<claime> |
Restarting ferm.service on k8s node kubernetes2055 - T354855 |
[production] |
13:25 |
<hnowlan@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage |
[production] |
13:23 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reimage for host an-airflow1006.eqiad.wmnet with OS bullseye |
[production] |
13:23 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage |
[production] |
13:20 |
<hnowlan@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage |
[production] |
13:18 |
<hnowlan@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage |
[production] |
13:17 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2445.codfw.wmnet with reason: host reimage |
[production] |
13:16 |
<brouberol@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-tool1009.eqiad.wmnet with reason: host reimage |
[production] |
13:16 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2429.codfw.wmnet with reason: host reimage |
[production] |
13:16 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55797 and previous config saved to /var/cache/conftool/dbconfig/20240129-131623-marostegui.json |
[production] |
13:15 |
<hnowlan@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage |
[production] |
13:14 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2381.codfw.wmnet with reason: host reimage |
[production] |
13:13 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2355.codfw.wmnet with reason: host reimage |
[production] |
13:12 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2260.codfw.wmnet with reason: host reimage |
[production] |
13:07 |
<brouberol@cumin1002> |
START - Cookbook sre.hosts.reimage for host an-tool1009.eqiad.wmnet with OS bullseye |
[production] |
13:07 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2117 (T355609)', diff saved to https://phabricator.wikimedia.org/P55796 and previous config saved to /var/cache/conftool/dbconfig/20240129-130724-marostegui.json |
[production] |
13:07 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance |
[production] |
13:07 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance |
[production] |
13:06 |
<brouberol> |
I'm starting the reimaging process of an-tool1009.eqiad.wmnet, which will cause unavailability of hue.wikimedia.org while it runs - T349400 |
[analytics] |
13:06 |
<wmbot~taavi@runko> |
END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on toolsbeta-mail-2.toolsbeta.eqiad1.wikimedia.cloud |
[toolsbeta] |
13:04 |
<wmbot~taavi@runko> |
START - Cookbook wmcs.vps.refresh_puppet_certs on toolsbeta-mail-2.toolsbeta.eqiad1.wikimedia.cloud |
[toolsbeta] |
13:00 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2445.codfw.wmnet with OS bullseye |
[production] |
12:59 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2429.codfw.wmnet with OS bullseye |
[production] |
12:59 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance |
[production] |
12:58 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance |
[production] |
12:58 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2381.codfw.wmnet with OS bullseye |
[production] |
12:57 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance |
[production] |
12:57 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance |
[production] |
12:57 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55795 and previous config saved to /var/cache/conftool/dbconfig/20240129-125726-marostegui.json |
[production] |
12:57 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2355.codfw.wmnet with OS bullseye |
[production] |
12:56 |
<hnowlan@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw2260.codfw.wmnet with OS bullseye |
[production] |
12:54 |
<taavi@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) |
[admin] |
12:54 |
<taavi@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack |
[admin] |
12:46 |
<Rook> |
update jupyterlab T355890 |
[paws] |
12:42 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55794 and previous config saved to /var/cache/conftool/dbconfig/20240129-124220-marostegui.json |
[production] |
12:33 |
<moritzm> |
installing openssh security updates |
[production] |
12:27 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P55793 and previous config saved to /var/cache/conftool/dbconfig/20240129-122713-marostegui.json |
[production] |
12:25 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1007.eqiad.wmnet |
[production] |
12:21 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host an-airflow1007.eqiad.wmnet |
[production] |
12:14 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::airflow::wmde |
[production] |
12:12 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55792 and previous config saved to /var/cache/conftool/dbconfig/20240129-121205-marostegui.json |
[production] |
12:06 |
<wmbot~taavi@runko> |
END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker-nfs role in the tools cluster |
[tools] |
12:06 |
<wmbot~taavi@runko> |
Added a new k8s worker-nfs tools-k8s-worker-nfs-6.tools.eqiad1.wikimedia.cloud to the cluster |
[tools] |
12:06 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1231 (T355609)', diff saved to https://phabricator.wikimedia.org/P55791 and previous config saved to /var/cache/conftool/dbconfig/20240129-120628-marostegui.json |
[production] |
12:06 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance |
[production] |
12:06 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1231.eqiad.wmnet with reason: Maintenance |
[production] |
12:00 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::airflow::wmde |
[production] |
12:00 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance |
[production] |
11:59 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance |
[production] |
11:59 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1224 (T355609)', diff saved to https://phabricator.wikimedia.org/P55790 and previous config saved to /var/cache/conftool/dbconfig/20240129-115953-marostegui.json |
[production] |