2024-08-14
17:30 <sukhe@cumin1002> START - Cookbook sre.dns.admin DNS admin: show site None [reason: no reason specified, no task ID specified] [production]
17:17 <otto@deploy1003> Finished deploy [airflow-dags/analytics_product@6d50458]: (no justification provided) (duration: 00m 08s) [production]
17:17 <otto@deploy1003> Started deploy [airflow-dags/analytics_product@6d50458]: (no justification provided) [production]
17:16 <SandraEbele_> reran geoeditors_public_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot. [production]
17:13 <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1062736|Avoid primary DB query for non-talk page edits (T370304)]], [[gerrit:1062737|Avoid primary DB query for non-talk page edits (T370304)]] (duration: 07m 54s) [production]
17:12 <SandraEbele_> reran geoeditors_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot. [production]
17:09 <SandraEbele_> reran geoeditors_edits_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot. [production]
17:08 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
17:07 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1062736|Avoid primary DB query for non-talk page edits (T370304)]], [[gerrit:1062737|Avoid primary DB query for non-talk page edits (T370304)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
17:05 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1062736|Avoid primary DB query for non-talk page edits (T370304)]], [[gerrit:1062737|Avoid primary DB query for non-talk page edits (T370304)]] [production]
16:59 <otto@deploy1003> Finished deploy [analytics/refinery@f033576]: Regular analytics weekly train [analytics/refinery@f0335766] (duration: 06m 48s) [production]
16:55 <SandraEbele_> reran unique_editors_by_country_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot. [production]
16:52 <SandraEbele_> reran edit_hourly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of down stream tasks after rerunning mediawiki_history_denormalize for 2024-06 snapshot. [production]
16:52 <otto@deploy1003> Started deploy [analytics/refinery@f033576]: Regular analytics weekly train [analytics/refinery@f0335766] [production]
16:52 <otto@deploy1003> Finished deploy [analytics/refinery@f033576] (thin): Regular analytics weekly train THIN [analytics/refinery@f0335766] (duration: 04m 13s) [production]
16:48 <SandraEbele_> reran editors_daily_monthly airflow dag with run_id scheduled__2024-06-01T00:00:00+00:00 as part of downstream tasks after rerunning mediawiki_history_denormalize dag [production]
16:48 <otto@deploy1003> Started deploy [analytics/refinery@f033576] (thin): Regular analytics weekly train THIN [analytics/refinery@f0335766] [production]
16:45 <otto@deploy1003> Finished deploy [analytics/refinery@f033576] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@f0335766] (duration: 03m 06s) [production]
16:45 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
16:43 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1062736|Avoid primary DB query for non-talk page edits (T370304)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
16:42 <otto@deploy1003> Started deploy [analytics/refinery@f033576] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@f0335766] [production]
16:41 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1062736|Avoid primary DB query for non-talk page edits (T370304)]] [production]
16:28 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 100%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67318 and previous config saved to /var/cache/conftool/dbconfig/20240814-162854-arnaudb.json [production]
16:24 <jayme@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye [production]
16:13 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 75%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67317 and previous config saved to /var/cache/conftool/dbconfig/20240814-161350-arnaudb.json [production]
16:04 <klausman@deploy1003> helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. [production]
16:04 <klausman@deploy1003> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. [production]
16:03 <klausman@deploy1003> helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. [production]
16:01 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye [production]
15:58 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 50%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67316 and previous config saved to /var/cache/conftool/dbconfig/20240814-155844-arnaudb.json [production]
15:48 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
15:47 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
15:43 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 25%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67315 and previous config saved to /var/cache/conftool/dbconfig/20240814-154338-arnaudb.json [production]
15:40 <dani@deploy1003> helmfile [codfw] DONE helmfile.d/services/miscweb: apply [production]
15:39 <dani@deploy1003> helmfile [codfw] START helmfile.d/services/miscweb: apply [production]
15:39 <dani@deploy1003> helmfile [eqiad] DONE helmfile.d/services/miscweb: apply [production]
15:39 <dani@deploy1003> helmfile [eqiad] START helmfile.d/services/miscweb: apply [production]
15:39 <dani@deploy1003> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
15:39 <dani@deploy1003> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
15:34 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye [production]
15:28 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 16%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67314 and previous config saved to /var/cache/conftool/dbconfig/20240814-152833-arnaudb.json [production]
15:13 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 8%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67312 and previous config saved to /var/cache/conftool/dbconfig/20240814-151328-arnaudb.json [production]
14:59 <klausman@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2010.codfw.wmnet [production]
14:58 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 4%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67307 and previous config saved to /var/cache/conftool/dbconfig/20240814-145819-arnaudb.json [production]
14:53 <klausman@cumin2002> START - Cookbook sre.hosts.reboot-single for host ml-serve2010.codfw.wmnet [production]
14:49 <jayme@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2010.codfw.wmnet with OS bookworm [production]
14:43 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bookworm [production]
14:43 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 2%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67305 and previous config saved to /var/cache/conftool/dbconfig/20240814-144314-arnaudb.json [production]
14:32 <elukey@deploy1003> helmfile [eqiad] DONE helmfile.d/services/thumbor: sync [production]
14:28 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 1%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67304 and previous config saved to /var/cache/conftool/dbconfig/20240814-142808-arnaudb.json [production]