2023-10-26
12:26 <stevemunene@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-airflow1007.eqiad.wmnet with reason: Downtime as we setup the new WMDE Airflow instance [production]
11:04 <kevinbazira@deploy2002> helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . [production]
11:03 <kevinbazira@deploy2002> helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . [production]
10:58 <kevinbazira@deploy2002> helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . [production]
10:51 <isaranto@deploy2002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
10:51 <isaranto@deploy2002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
10:51 <isaranto@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
10:40 <elukey@deploy2002> helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . [production]
10:30 <mvolz@deploy2002> helmfile [eqiad] DONE helmfile.d/services/citoid: apply [production]
10:29 <mvolz@deploy2002> helmfile [eqiad] START helmfile.d/services/citoid: apply [production]
10:25 <mvolz@deploy2002> helmfile [codfw] DONE helmfile.d/services/citoid: apply [production]
10:25 <mvolz@deploy2002> helmfile [codfw] START helmfile.d/services/citoid: apply [production]
10:20 <mvolz@deploy2002> helmfile [staging] DONE helmfile.d/services/citoid: apply [production]
10:20 <mvolz@deploy2002> helmfile [staging] START helmfile.d/services/citoid: apply [production]
10:10 <mvolz@deploy2002> helmfile [staging] DONE helmfile.d/services/citoid: apply [production]
10:10 <mvolz@deploy2002> helmfile [staging] START helmfile.d/services/citoid: apply [production]
09:29 <dcausse> erratum (replace wdqs1009 with wdqs2009 in the above msg): depooling and restarting blazegraph on wdqs2009 (stuck since 2023-10-12) [production]
09:28 <dcausse> depooling and restarting blazegraph on wdqs1009 (stuck since 2023-10-12) [production]
09:23 <brouberol@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1009.eqiad.wmnet with OS bullseye [production]
09:14 <ayounsi@cumin1001> END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox [production]
09:14 <ayounsi@cumin1001> START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox [production]
09:06 <brouberol@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage [production]
09:03 <brouberol@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage [production]
08:50 <brouberol@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-jumbo1009.eqiad.wmnet with OS bullseye [production]
08:49 <urbanecm> mwmaint2002: `foreachwikiindblist /srv/mediawiki/dblists/growthexperiments.dblist extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=1second --verbose --use-job-queue` (testing T344428; after enabling backend on all Wikipedias) [production]
08:48 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] (duration: 09m 29s) [production]
08:43 <urbanecm@deploy2002> urbanecm: Continuing with sync [production]
08:40 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
08:40 <kevinbazira@deploy2002> helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . [production]
08:40 <brouberol@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1008.eqiad.wmnet with OS bullseye [production]
08:39 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] [production]
08:32 <kevinbazira@deploy2002> helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . [production]
08:32 <urbanecm@deploy2002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
08:31 <urbanecm@deploy2002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
08:29 <urbanecm@deploy2002> helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
08:28 <urbanecm@deploy2002> helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply [production]
08:28 <urbanecm@deploy2002> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
08:27 <urbanecm@deploy2002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
08:24 <brouberol@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1008.eqiad.wmnet with reason: host reimage [production]
08:21 <brouberol@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1008.eqiad.wmnet with reason: host reimage [production]
08:07 <brouberol@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-jumbo1008.eqiad.wmnet with OS bullseye [production]
08:02 <godog> restart prometheus k8s k8s-aux - T343529 [production]
07:55 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133 [production]
07:54 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'configure' for AS: 15133 [production]
07:36 <jelto@deploy2002> helmfile [codfw] DONE helmfile.d/services/miscweb: apply [production]
07:32 <jelto@deploy2002> helmfile [codfw] START helmfile.d/services/miscweb: apply [production]
07:31 <jelto@deploy2002> helmfile [eqiad] DONE helmfile.d/services/miscweb: apply [production]
07:23 <jelto@deploy2002> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
07:21 <apergos> UTC morning backport and config window closed [production]
07:19 <kartik@deploy2002> Finished scap: Backport for [[gerrit:968649|testwiki: Enable Section translation on some Wikipedias with potential to be supported with MinT (T345267)]] (duration: 13m 11s) [production]