2023-10-26
§
|
12:26 |
<stevemunene@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-airflow1007.eqiad.wmnet with reason: Downtime as we setup the new WMDE Airflow instance |
[production] |
11:04 |
<kevinbazira@deploy2002> |
helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
11:03 |
<kevinbazira@deploy2002> |
helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
10:58 |
<kevinbazira@deploy2002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
10:51 |
<isaranto@deploy2002> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
10:51 |
<isaranto@deploy2002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
10:51 |
<isaranto@deploy2002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
10:40 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
10:30 |
<mvolz@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/citoid: apply |
[production] |
10:29 |
<mvolz@deploy2002> |
helmfile [eqiad] START helmfile.d/services/citoid: apply |
[production] |
10:25 |
<mvolz@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/citoid: apply |
[production] |
10:25 |
<mvolz@deploy2002> |
helmfile [codfw] START helmfile.d/services/citoid: apply |
[production] |
10:20 |
<mvolz@deploy2002> |
helmfile [staging] DONE helmfile.d/services/citoid: apply |
[production] |
10:20 |
<mvolz@deploy2002> |
helmfile [staging] START helmfile.d/services/citoid: apply |
[production] |
10:10 |
<mvolz@deploy2002> |
helmfile [staging] DONE helmfile.d/services/citoid: apply |
[production] |
10:10 |
<mvolz@deploy2002> |
helmfile [staging] START helmfile.d/services/citoid: apply |
[production] |
09:29 |
<dcausse> |
erratum (replace wdqs1009 with wdqs2009 in the above msg): depooling and restarting blazegraph on wdqs2009 (stuck since 2023-10-12) |
[production] |
09:28 |
<dcausse> |
depooling and restarting blazegraph on wdqs1009 (stuck since 2023-10-12) |
[production] |
09:23 |
<brouberol@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1009.eqiad.wmnet with OS bullseye |
[production] |
09:14 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox |
[production] |
09:14 |
<ayounsi@cumin1001> |
START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox |
[production] |
09:06 |
<brouberol@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage |
[production] |
09:03 |
<brouberol@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage |
[production] |
08:50 |
<brouberol@cumin1001> |
START - Cookbook sre.hosts.reimage for host kafka-jumbo1009.eqiad.wmnet with OS bullseye |
[production] |
08:49 |
<urbanecm> |
mwmaint2002: `foreachwikiindblist /srv/mediawiki/dblists/growthexperiments.dblist extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=1second --verbose --use-job-queue` (testing T344428; after enabling backend on all Wikipedias) |
[production] |
08:48 |
<urbanecm@deploy2002> |
Finished scap: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] (duration: 09m 29s) |
[production] |
08:43 |
<urbanecm@deploy2002> |
urbanecm: Continuing with sync |
[production] |
08:40 |
<urbanecm@deploy2002> |
urbanecm: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
08:40 |
<kevinbazira@deploy2002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
08:40 |
<brouberol@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1008.eqiad.wmnet with OS bullseye |
[production] |
08:39 |
<urbanecm@deploy2002> |
Started scap: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] |
[production] |
08:32 |
<kevinbazira@deploy2002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
08:32 |
<urbanecm@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
08:31 |
<urbanecm@deploy2002> |
helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
08:29 |
<urbanecm@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
08:28 |
<urbanecm@deploy2002> |
helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
08:28 |
<urbanecm@deploy2002> |
helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
08:27 |
<urbanecm@deploy2002> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
08:24 |
<brouberol@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1008.eqiad.wmnet with reason: host reimage |
[production] |
08:21 |
<brouberol@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1008.eqiad.wmnet with reason: host reimage |
[production] |
08:07 |
<brouberol@cumin1001> |
START - Cookbook sre.hosts.reimage for host kafka-jumbo1008.eqiad.wmnet with OS bullseye |
[production] |
08:02 |
<godog> |
restart prometheus k8s k8s-aux - T343529 |
[production] |
07:55 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15133 |
[production] |
07:54 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.peering with action 'configure' for AS: 15133 |
[production] |
07:36 |
<jelto@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/miscweb: apply |
[production] |
07:32 |
<jelto@deploy2002> |
helmfile [codfw] START helmfile.d/services/miscweb: apply |
[production] |
07:31 |
<jelto@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/miscweb: apply |
[production] |
07:23 |
<jelto@deploy2002> |
helmfile [staging] START helmfile.d/services/miscweb: apply |
[production] |
07:21 |
<apergos> |
UTC morning backport and config window closed |
[production] |
07:19 |
<kartik@deploy2002> |
Finished scap: Backport for [[gerrit:968649|testwiki: Enable Section translation on some Wikipedias with potential to be supported with MinT (T345267)]] (duration: 13m 11s) |
[production] |