2023-10-26
§
|
13:50 |
<lucaswerkmeister-wmde@deploy2002> |
dcausse and lucaswerkmeister-wmde: Backport for [[gerrit:969064|cirrus: disable canary events for update & error streams]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:49 |
<lucaswerkmeister-wmde@deploy2002> |
Started scap: Backport for [[gerrit:969064|cirrus: disable canary events for update & error streams]] |
[production] |
13:46 |
<moritzm> |
installing cpio security updates |
[production] |
13:45 |
<lucaswerkmeister-wmde@deploy2002> |
Finished scap: Backport for [[gerrit:968777|CX3 Build 0.2.0+20231026 (T348563 T308836)]] (duration: 14m 48s) |
[production] |
13:40 |
<lucaswerkmeister-wmde@deploy2002> |
lucaswerkmeister-wmde and kartik: Continuing with sync |
[production] |
13:32 |
<lucaswerkmeister-wmde@deploy2002> |
lucaswerkmeister-wmde and kartik: Backport for [[gerrit:968777|CX3 Build 0.2.0+20231026 (T348563 T308836)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:32 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad |
[production] |
13:31 |
<moritzm> |
installing curl security updates on buster |
[production] |
13:31 |
<lucaswerkmeister-wmde@deploy2002> |
Started scap: Backport for [[gerrit:968777|CX3 Build 0.2.0+20231026 (T348563 T308836)]] |
[production] |
13:30 |
<lucaswerkmeister-wmde@deploy2002> |
Finished scap: Backport for [[gerrit:968737|Add throttle rule for Edit-a-Thon on 2023-11-03 (T349234)]] (duration: 06m 43s) |
[production] |
13:27 |
<jmm@cumin2002> |
START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad |
[production] |
13:26 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw |
[production] |
13:25 |
<lucaswerkmeister-wmde@deploy2002> |
zoranzoki21 and lucaswerkmeister-wmde: Continuing with sync |
[production] |
13:24 |
<lucaswerkmeister-wmde@deploy2002> |
zoranzoki21 and lucaswerkmeister-wmde: Backport for [[gerrit:968737|Add throttle rule for Edit-a-Thon on 2023-11-03 (T349234)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:23 |
<lucaswerkmeister-wmde@deploy2002> |
Started scap: Backport for [[gerrit:968737|Add throttle rule for Edit-a-Thon on 2023-11-03 (T349234)]] |
[production] |
13:21 |
<jmm@cumin2002> |
START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw |
[production] |
13:20 |
<lucaswerkmeister-wmde@deploy2002> |
Finished scap: Backport for [[gerrit:968713|Enable block feature for AbuseFilter on srwiki (T349727)]] (duration: 10m 23s) |
[production] |
13:20 |
<bking@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
13:20 |
<bking@cumin1001> |
END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) |
[production] |
13:15 |
<lucaswerkmeister-wmde@deploy2002> |
zoranzoki21 and lucaswerkmeister-wmde: Continuing with sync |
[production] |
13:15 |
<moritzm> |
installing poppler security updates |
[production] |
13:11 |
<lucaswerkmeister-wmde@deploy2002> |
zoranzoki21 and lucaswerkmeister-wmde: Backport for [[gerrit:968713|Enable block feature for AbuseFilter on srwiki (T349727)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:10 |
<lucaswerkmeister-wmde@deploy2002> |
Started scap: Backport for [[gerrit:968713|Enable block feature for AbuseFilter on srwiki (T349727)]] |
[production] |
13:04 |
<bking@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
12:27 |
<stevemunene@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-airflow1007.eqiad.wmnet with reason: Downtime as we setup the new WMDE Airflow instance |
[production] |
12:26 |
<stevemunene@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-airflow1007.eqiad.wmnet with reason: Downtime as we setup the new WMDE Airflow instance |
[production] |
11:04 |
<kevinbazira@deploy2002> |
helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
11:03 |
<kevinbazira@deploy2002> |
helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
10:58 |
<kevinbazira@deploy2002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
10:51 |
<isaranto@deploy2002> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
10:51 |
<isaranto@deploy2002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
10:51 |
<isaranto@deploy2002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
10:40 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
10:30 |
<mvolz@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/citoid: apply |
[production] |
10:29 |
<mvolz@deploy2002> |
helmfile [eqiad] START helmfile.d/services/citoid: apply |
[production] |
10:25 |
<mvolz@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/citoid: apply |
[production] |
10:25 |
<mvolz@deploy2002> |
helmfile [codfw] START helmfile.d/services/citoid: apply |
[production] |
10:20 |
<mvolz@deploy2002> |
helmfile [staging] DONE helmfile.d/services/citoid: apply |
[production] |
10:20 |
<mvolz@deploy2002> |
helmfile [staging] START helmfile.d/services/citoid: apply |
[production] |
10:10 |
<mvolz@deploy2002> |
helmfile [staging] DONE helmfile.d/services/citoid: apply |
[production] |
10:10 |
<mvolz@deploy2002> |
helmfile [staging] START helmfile.d/services/citoid: apply |
[production] |
09:29 |
<dcausse> |
erratum (replace wdqs1009 with wdqs2009 in the above msg): depooling and restarting blazegraph on wdqs2009 (stuck since 2023-10-12) |
[production] |
09:28 |
<dcausse> |
depooling and restarting blazegraph on wdqs1009 (stuck since 2023-10-12) |
[production] |
09:23 |
<brouberol@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1009.eqiad.wmnet with OS bullseye |
[production] |
09:14 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox |
[production] |
09:14 |
<ayounsi@cumin1001> |
START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox |
[production] |
09:06 |
<brouberol@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage |
[production] |
09:03 |
<brouberol@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage |
[production] |
08:50 |
<brouberol@cumin1001> |
START - Cookbook sre.hosts.reimage for host kafka-jumbo1009.eqiad.wmnet with OS bullseye |
[production] |
08:49 |
<urbanecm> |
mwmaint2002: `foreachwikiindblist /srv/mediawiki/dblists/growthexperiments.dblist extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=1second --verbose --use-job-queue` (testing T344428; after enabling backend on all Wikipedias) |
[production] |