2025-05-16 §
13:54 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Remove db2166 and db1177 from x3 (T351820)', diff saved to https://phabricator.wikimedia.org/P76270 and previous config saved to /var/cache/conftool/dbconfig/20250516-135438-ladsgroup.json [production]
13:52 <fceratto@cumin1002> START - Cookbook sre.mysql.clone of db1238.eqiad.wmnet onto db1247.eqiad.wmnet [production]
13:50 <fceratto@cumin1002> END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1188 gradually with 4 steps - Pooling back in [production]
13:50 <fceratto@cumin1002> END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db1238.eqiad.wmnet onto db1247.eqiad.wmnet [production]
13:47 <fceratto@cumin1002> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1238 - Depool db1238.eqiad.wmnet to then clone it to db1247.eqiad.wmnet - fceratto@cumin1002 [production]
13:47 <fceratto@cumin1002> START - Cookbook sre.mysql.depool db1238 - Depool db1238.eqiad.wmnet to then clone it to db1247.eqiad.wmnet - fceratto@cumin1002 [production]
13:47 <fceratto@cumin1002> START - Cookbook sre.mysql.clone of db1238.eqiad.wmnet onto db1247.eqiad.wmnet [production]
13:40 <jclark@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1177.eqiad.wmnet with OS bullseye [production]
13:35 <jclark@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1177.eqiad.wmnet with OS bullseye [production]
13:21 <hashar@deploy1003> Finished deploy [gerrit/gerrit@fcb893c]: wm-zuul-status: do not popup when navigating changes - T394485 (duration: 00m 12s) [production]
13:21 <mvernon@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1004.eqiad.wmnet with reason: host reimage [production]
13:21 <hashar@deploy1003> Started deploy [gerrit/gerrit@fcb893c]: wm-zuul-status: do not popup when navigating changes - T394485 [production]
13:17 <mvernon@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1004.eqiad.wmnet with reason: host reimage [production]
13:13 <jclark@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1177.eqiad.wmnet with OS bullseye [production]
13:05 <fceratto@cumin1002> START - Cookbook sre.mysql.pool db1188 gradually with 4 steps - Pooling back in [production]
13:03 <joal@deploy1003> Finished deploy [airflow-dags/analytics@4ebb376]: Fix gobblin artifacts (after pulling code...) (duration: 01m 01s) [production]
13:02 <joal@deploy1003> Started deploy [airflow-dags/analytics@4ebb376]: Fix gobblin artifacts (after pulling code...) [production]
13:02 <joal@deploy1003> Finished deploy [airflow-dags/analytics_test@4ebb376]: Fix gobblin artifacts (duration: 00m 16s) [production]
13:01 <joal@deploy1003> Started deploy [airflow-dags/analytics_test@4ebb376]: Fix gobblin artifacts [production]
13:00 <joal@deploy1003> Finished deploy [airflow-dags/analytics@4351188]: Fix gobblin artifacts (duration: 00m 07s) [production]
13:00 <joal@deploy1003> Started deploy [airflow-dags/analytics@4351188]: Fix gobblin artifacts [production]
12:52 <mvernon@cumin1002> START - Cookbook sre.hosts.reimage for host apus-be1004.eqiad.wmnet with OS bookworm [production]
12:46 <mvernon@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host apus-be1004.eqiad.wmnet with OS bookworm [production]
12:43 <aqu@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
12:42 <aqu@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
12:35 <fceratto@cumin1002> END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1188 gradually with 4 steps - Pooling back in [production]
12:33 <fceratto@cumin1002> START - Cookbook sre.mysql.pool db1188 gradually with 4 steps - Pooling back in [production]
12:32 <mvernon@cumin1002> START - Cookbook sre.hosts.reimage for host apus-be1004.eqiad.wmnet with OS bookworm [production]
12:28 <mvernon@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host apus-be1004.eqiad.wmnet with OS bookworm [production]
12:20 <mvernon@cumin1002> START - Cookbook sre.hosts.reimage for host apus-be1004.eqiad.wmnet with OS bookworm [production]
12:01 <kamila@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply [production]
12:01 <kamila@deploy1003> helmfile [eqiad] START helmfile.d/services/mw-cron: apply [production]
11:59 <fceratto@cumin1002> END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db1188.eqiad.wmnet onto db1246.eqiad.wmnet [production]
11:42 <fceratto@cumin1002> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1188 - Depool db1188.eqiad.wmnet to then clone it to db1246.eqiad.wmnet - fceratto@cumin1002 [production]
11:42 <fceratto@cumin1002> START - Cookbook sre.mysql.depool db1188 - Depool db1188.eqiad.wmnet to then clone it to db1246.eqiad.wmnet - fceratto@cumin1002 [production]
11:42 <fceratto@cumin1002> START - Cookbook sre.mysql.clone of db1188.eqiad.wmnet onto db1246.eqiad.wmnet [production]
11:23 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Remove db2242 from x3, remove db2154 from s8 (T351820)', diff saved to https://phabricator.wikimedia.org/P76262 and previous config saved to /var/cache/conftool/dbconfig/20250516-112345-ladsgroup.json [production]
11:19 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Remove db1214 from x3, remove db1257 from s8 (T351820)', diff saved to https://phabricator.wikimedia.org/P76261 and previous config saved to /var/cache/conftool/dbconfig/20250516-111952-ladsgroup.json [production]
10:44 <joal@deploy1003> Finished deploy [airflow-dags/analytics@4351188]: Deploying analytics with artifact-cache warming using main folder (duration: 00m 49s) [production]
10:43 <joal@deploy1003> Started deploy [airflow-dags/analytics@4351188]: Deploying analytics with artifact-cache warming using main folder [production]
10:28 <joal@deploy1003> Finished deploy [airflow-dags/main@4351188]: Deploying main instead of analytics subfolder (duration: 01m 51s) [production]
10:26 <joal@deploy1003> Started deploy [airflow-dags/main@4351188]: Deploying main instead of analytics subfolder [production]
10:22 <jynus> upgrading db1239 MariaDB server T394487 [production]
10:16 <jynus@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet,ms-backup1002.eqiad.wmnet with reason: Upgrade and test [production]
09:51 <joal@deploy1003> Finished deploy [airflow-dags/analytics_test@4351188]: Fix slf4j artifact sync (duration: 00m 12s) [production]
09:51 <joal@deploy1003> Started deploy [airflow-dags/analytics_test@4351188]: Fix slf4j artifact sync [production]
09:49 <btullis@deploy1003> Finished deploy [airflow-dags/analytics_test@c2d660e]: Test (duration: 24m 55s) [production]
09:27 <cgoubert@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply [production]
09:27 <cgoubert@deploy1003> helmfile [eqiad] START helmfile.d/services/mw-cron: apply [production]
09:24 <btullis@deploy1003> Started deploy [airflow-dags/analytics_test@c2d660e]: Test [production]