2025-06-13 §
14:39 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2198.codfw.wmnet with reason: Maintenance [production]
14:38 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2195 (T396130)', diff saved to https://phabricator.wikimedia.org/P77952 and previous config saved to /var/cache/conftool/dbconfig/20250613-143859-marostegui.json [production]
14:25 <mfossati@deploy1003> Finished deploy [airflow-dags/platform_eng@cab8d81]: hotfix-bump SEAL to v0.9.0 (duration: 02m 26s) [production]
14:23 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P77951 and previous config saved to /var/cache/conftool/dbconfig/20250613-142351-marostegui.json [production]
14:23 <mfossati@deploy1003> Started deploy [airflow-dags/platform_eng@cab8d81]: hotfix-bump SEAL to v0.9.0 [production]
14:17 <btullis@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply [production]
14:17 <btullis@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply [production]
14:08 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P77950 and previous config saved to /var/cache/conftool/dbconfig/20250613-140844-marostegui.json [production]
13:57 <damilare> SmashPig upgraded from 84c0668b to 4eef974d [production]
13:53 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2195 (T396130)', diff saved to https://phabricator.wikimedia.org/P77949 and previous config saved to /var/cache/conftool/dbconfig/20250613-135336-marostegui.json [production]
13:39 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2195 (T396130)', diff saved to https://phabricator.wikimedia.org/P77948 and previous config saved to /var/cache/conftool/dbconfig/20250613-133900-marostegui.json [production]
13:38 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2195.codfw.wmnet with reason: Maintenance [production]
13:38 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2181 (T396130)', diff saved to https://phabricator.wikimedia.org/P77947 and previous config saved to /var/cache/conftool/dbconfig/20250613-133837-marostegui.json [production]
13:23 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P77944 and previous config saved to /var/cache/conftool/dbconfig/20250613-132329-marostegui.json [production]
13:08 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P77942 and previous config saved to /var/cache/conftool/dbconfig/20250613-130822-marostegui.json [production]
13:05 <andrew@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1018.eqiad.wmnet with OS bullseye [production]
12:53 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2181 (T396130)', diff saved to https://phabricator.wikimedia.org/P77941 and previous config saved to /var/cache/conftool/dbconfig/20250613-125314-marostegui.json [production]
12:48 <andrew@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1018.eqiad.wmnet with reason: host reimage [production]
12:44 <andrew@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1018.eqiad.wmnet with reason: host reimage [production]
12:39 <marostegui@cumin1002> dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P77940 and previous config saved to /var/cache/conftool/dbconfig/20250613-123955-root.json [production]
12:36 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2181 (T396130)', diff saved to https://phabricator.wikimedia.org/P77939 and previous config saved to /var/cache/conftool/dbconfig/20250613-123635-marostegui.json [production]
12:36 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2181.codfw.wmnet with reason: Maintenance [production]
12:36 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2167 (T396130)', diff saved to https://phabricator.wikimedia.org/P77938 and previous config saved to /var/cache/conftool/dbconfig/20250613-123612-marostegui.json [production]
12:28 <andrew@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephosd1018.eqiad.wmnet with OS bullseye [production]
12:27 <andrew@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1018.eqiad.wmnet with OS bullseye [production]
12:24 <marostegui@cumin1002> dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P77937 and previous config saved to /var/cache/conftool/dbconfig/20250613-122449-root.json [production]
12:21 <akosiaris> T390251 re-enable puppet on all registries. [production]
12:21 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P77936 and previous config saved to /var/cache/conftool/dbconfig/20250613-122104-marostegui.json [production]
12:17 <andrew@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephosd1018.eqiad.wmnet with OS bullseye [production]
12:15 <andrew@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cloudcephosd1018.eqiad.wmnet [production]
12:15 <andrew@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1018.eqiad.wmnet [production]
12:09 <marostegui@cumin1002> dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P77935 and previous config saved to /var/cache/conftool/dbconfig/20250613-120944-root.json [production]
12:05 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P77934 and previous config saved to /var/cache/conftool/dbconfig/20250613-120557-marostegui.json [production]
12:05 <andrew@cumin1002> START - Cookbook sre.hosts.reboot-single for host cloudcephosd1018.eqiad.wmnet [production]
12:02 <jmm@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir7004.magru.wmnet with OS bookworm [production]
11:55 <andrew@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cloudcephosd1018.eqiad.wmnet [production]
11:55 <andrew@cumin1002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1018.eqiad.wmnet'] [production]
11:54 <andrew@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1018.eqiad.wmnet'] [production]
11:54 <marostegui@cumin1002> dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P77933 and previous config saved to /var/cache/conftool/dbconfig/20250613-115438-root.json [production]
11:54 <andrew@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1018.eqiad.wmnet'] [production]
11:50 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2167 (T396130)', diff saved to https://phabricator.wikimedia.org/P77932 and previous config saved to /var/cache/conftool/dbconfig/20250613-115049-marostegui.json [production]
11:49 <marostegui@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1182.eqiad.wmnet with reason: Maintenance [production]
11:49 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db1182', diff saved to https://phabricator.wikimedia.org/P77931 and previous config saved to /var/cache/conftool/dbconfig/20250613-114917-marostegui.json [production]
11:47 <andrew@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1018.eqiad.wmnet'] [production]
11:46 <jmm@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir7004.magru.wmnet with reason: host reimage [production]
11:45 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
11:45 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
11:43 <jmm@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir7004.magru.wmnet with reason: host reimage [production]
11:41 <akosiaris> T390251 re-enable puppet on registry1004 after merging puppet refactoring changes. [production]
11:34 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2167 (T396130)', diff saved to https://phabricator.wikimedia.org/P77930 and previous config saved to /var/cache/conftool/dbconfig/20250613-113402-marostegui.json [production]