2025-09-08
13:21
<jmm@cumin2002>
START - Cookbook sre.hosts.decommission for hosts durum3004.esams.wmnet
[production] |
13:20 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1207 (T401906)', diff saved to https://phabricator.wikimedia.org/P82710 and previous config saved to /var/cache/conftool/dbconfig/20250908-132044-fceratto.json |
[production] |
13:20 |
<sukhe@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4001.ulsfo.wmnet |
[production] |
13:19 |
<sukhe@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum2001.codfw.wmnet |
[production] |
13:19 |
<sukhe@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum1001.eqiad.wmnet |
[production] |
13:18 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1207 (T401906)', diff saved to https://phabricator.wikimedia.org/P82709 and previous config saved to /var/cache/conftool/dbconfig/20250908-131818-fceratto.json |
[production] |
13:18 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance |
[production] |
13:17 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1200 (T401906)', diff saved to https://phabricator.wikimedia.org/P82708 and previous config saved to /var/cache/conftool/dbconfig/20250908-131755-fceratto.json |
[production] |
13:15 |
<sukhe@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host durum4001.ulsfo.wmnet |
[production] |
13:15 |
<sukhe@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host durum2001.codfw.wmnet |
[production] |
13:15 |
<sukhe@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host durum1001.eqiad.wmnet |
[production] |
13:14 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P82707 and previous config saved to /var/cache/conftool/dbconfig/20250908-131451-ladsgroup.json |
[production] |
13:02 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P82706 and previous config saved to /var/cache/conftool/dbconfig/20250908-130247-fceratto.json |
[production] |
12:59 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P82705 and previous config saved to /var/cache/conftool/dbconfig/20250908-125943-ladsgroup.json |
[production] |
12:51 |
<btullis@cumin1003> |
START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P{cephosd100*.eqiad.wmnet} and (A:cephosd) |
[production] |
12:47 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P82703 and previous config saved to /var/cache/conftool/dbconfig/20250908-124739-fceratto.json |
[production] |
12:44 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1186 (T402925)', diff saved to https://phabricator.wikimedia.org/P82702 and previous config saved to /var/cache/conftool/dbconfig/20250908-124436-ladsgroup.json |
[production] |
12:43 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply |
[production] |
12:42 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply |
[production] |
12:41 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
12:41 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
12:40 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh3005.wikimedia.org to drbd |
[production] |
12:32 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1200 (T401906)', diff saved to https://phabricator.wikimedia.org/P82701 and previous config saved to /var/cache/conftool/dbconfig/20250908-123232-fceratto.json |
[production] |
12:30 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.changedisk for changing disk type of doh3005.wikimedia.org to drbd |
[production] |
12:30 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1200 (T401906)', diff saved to https://phabricator.wikimedia.org/P82700 and previous config saved to /var/cache/conftool/dbconfig/20250908-123007-fceratto.json |
[production] |
12:30 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1200.eqiad.wmnet with reason: Maintenance |
[production] |
12:29 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1185 (T401906)', diff saved to https://phabricator.wikimedia.org/P82699 and previous config saved to /var/cache/conftool/dbconfig/20250908-122952-fceratto.json |
[production] |
12:16 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto an-presto cluster: Roll restart of all Presto's jvm daemons. |
[production] |
12:16 |
<dcaro@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.toolforge.run_tests (exit_code=97) |
[tools] |
12:14 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P82698 and previous config saved to /var/cache/conftool/dbconfig/20250908-121444-fceratto.json |
[production] |
12:12 |
<dcausse@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
12:12 |
<dcausse@deploy1003> |
helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
12:10 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Depooling db1186 (T402925)', diff saved to https://phabricator.wikimedia.org/P82697 and previous config saved to /var/cache/conftool/dbconfig/20250908-121000-ladsgroup.json |
[production] |
12:09 |
<ladsgroup@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance |
[production] |
12:09 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1169 (T402925)', diff saved to https://phabricator.wikimedia.org/P82696 and previous config saved to /var/cache/conftool/dbconfig/20250908-120937-ladsgroup.json |
[production] |
12:07 |
<dcaro@cloudcumin1001> |
START - Cookbook wmcs.toolforge.run_tests |
[tools] |
12:07 |
<dcaro@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.run_tests (exit_code=0) |
[tools] |
11:59 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P82695 and previous config saved to /var/cache/conftool/dbconfig/20250908-115937-fceratto.json |
[production] |
11:57 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch |
[production] |
11:55 |
<moritzm> |
Upgrading trixie installer image to 13.1 T403815 |
[production] |
11:54 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P82694 and previous config saved to /var/cache/conftool/dbconfig/20250908-115429-ladsgroup.json |
[production] |
11:49 |
<btullis@cumin1003> |
START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch |
[production] |
11:46 |
<dcaro@cloudcumin1001> |
START - Cookbook wmcs.toolforge.run_tests |
[tools] |
11:46 |
<dcaro@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.run_tests (exit_code=0) |
[tools] |
11:45 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of durum3005.esams.wmnet to drbd |
[production] |
11:45 |
<btullis@cumin1003> |
START - Cookbook sre.presto.roll-restart-workers for Presto an-presto cluster: Roll restart of all Presto's jvm daemons. |
[production] |
11:44 |
<topranks> |
restart netbox service on netbox-dev2003 (netbox-next) to update db from live server dump |
[production] |
11:44 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1185 (T401906)', diff saved to https://phabricator.wikimedia.org/P82693 and previous config saved to /var/cache/conftool/dbconfig/20250908-114429-fceratto.json |
[production] |
11:43 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reimage for host an-worker1233.eqiad.wmnet with OS bullseye |
[production] |
11:42 |
<btullis@cumin1003> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1233.eqiad.wmnet with OS bullseye |
[production] |