2025-11-19
17:14 <kamila@deploy2002> helmfile [codfw] DONE helmfile.d/services/mobileapps: sync [production]
17:14 <robh@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on dbstore1007.eqiad.wmnet with reason: C/D Migration [production]
17:13 <kamila@deploy2002> helmfile [codfw] START helmfile.d/services/mobileapps: sync [production]
17:12 <bking@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply [production]
17:12 <bking@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply [production]
17:10 <kamila@deploy2002> helmfile [codfw] DONE helmfile.d/services/mobileapps: apply [production]
17:10 <kamila@deploy2002> helmfile [codfw] START helmfile.d/services/mobileapps: apply [production]
17:10 <bking@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply [production]
17:09 <bking@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply [production]
17:08 <bking@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply [production]
17:08 <bking@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply [production]
17:08 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Testing all optimize (T410401)', diff saved to https://phabricator.wikimedia.org/P85394 and previous config saved to /var/cache/conftool/dbconfig/20251119-170814-ladsgroup.json [production]
17:05 <filippo@cumin1003> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
17:03 <robh@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on pc1014.eqiad.wmnet with reason: C/D Migration [production]
17:03 <sukhe@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts [production]
17:03 <sukhe@cumin1003> START - Cookbook sre.hosts.remove-downtime for 39 hosts [production]
17:01 <kamila@cumin1003> END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site drmrs [reason: no reason specified, ] [production]
17:00 <filippo@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage [production]
17:00 <kamila@cumin1003> START - Cookbook sre.dns.admin DNS admin: pool site drmrs [reason: no reason specified, ] [production]
17:00 <andrew@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
17:00 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr[1-2]-drmrs IPv6,cr[1-2]-drmrs.mgmt [production]
17:00 <pt1979@cumin2002> START - Cookbook sre.hosts.remove-downtime for cr[1-2]-drmrs IPv6,cr[1-2]-drmrs.mgmt [production]
16:59 <andrew@cumin2002> START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
16:58 <andrew@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
16:58 <andrew@cumin2002> START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
16:57 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1008-dev.eqiad.wmnet with reason: host reimage [production]
16:56 <filippo@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol2010-dev.codfw.wmnet with reason: host reimage [production]
16:52 <andrew@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1008-dev.eqiad.wmnet with reason: host reimage [production]
16:39 <sukhe@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 39 hosts with reason: site depool [production]
16:36 <filippo@cumin1003> START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
16:35 <filippo@cumin1003> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol2010-dev.codfw.wmnet with OS trixie [production]
16:34 <andrew@cumin2002> START - Cookbook sre.hosts.reimage for host cloudcontrol1008-dev.eqiad.wmnet with OS bookworm [production]
16:29 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply [production]
16:29 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply [production]
16:28 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply [production]
16:27 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply [production]
16:27 <bking@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply [production]
16:27 <bking@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply [production]
16:27 <bking@deploy2002> helmfile [default] DONE helmfile.d/dse-k8s-services/opensearch-test: apply [production]
16:27 <bking@deploy2002> helmfile [default] START helmfile.d/dse-k8s-services/opensearch-test: apply [production]
16:06 <pt1979@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: router upgrade [production]
16:06 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply [production]
16:05 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply [production]
16:03 <moritzm> installing libvirt bugfix updates on trixie hosts [production]
15:59 <moritzm> installing brltty bugfix updates on trixie hosts [production]
15:53 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
15:52 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
15:52 <pt1979@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-b[12-13]-drmrs,cr[1-2]-drmrs,mr1-drmrs with reason: router upgrade [production]
15:48 <ladsgroup@deploy2002> Finished scap sync-world: Backport for [[gerrit:1207181|Revert "rdbms: Dismantle concept of groups"]] (duration: 09m 14s) [production]
15:44 <ladsgroup@deploy2002> trainbranchbot, ladsgroup: Continuing with sync [production]