2025-10-14
§
|
17:19 |
<swfrench@deploy2002> |
Finished scap sync-world: Non-image-build scap run to scale 8.3 deployments - T405955 (duration: 05m 41s) |
[production] |
17:15 |
<swfrench@deploy2002> |
Started scap sync-world: Non-image-build scap run to scale 8.3 deployments - T405955 |
[production] |
16:55 |
<swfrench@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply |
[production] |
16:55 |
<swfrench@deploy2002> |
helmfile [codfw] START helmfile.d/services/rest-gateway: apply |
[production] |
16:43 |
<swfrench@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply |
[production] |
16:43 |
<swfrench@deploy2002> |
helmfile [eqiad] START helmfile.d/services/rest-gateway: apply |
[production] |
16:36 |
<swfrench@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
16:36 |
<swfrench@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
16:32 |
<swfrench@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
16:32 |
<swfrench@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
16:28 |
<swfrench@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/api-gateway: apply |
[production] |
16:27 |
<swfrench@deploy2002> |
helmfile [codfw] START helmfile.d/services/api-gateway: apply |
[production] |
16:27 |
<eevans@cumin1003> |
END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-codfw |
[production] |
16:21 |
<swfrench@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply |
[production] |
16:20 |
<swfrench@deploy2002> |
helmfile [eqiad] START helmfile.d/services/api-gateway: apply |
[production] |
16:19 |
<mutante> |
rebooting backend of releases.wikimedia.org |
[production] |
16:19 |
<dzahn@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases1003.eqiad.wmnet with reason: reboot |
[production] |
16:18 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host db-test1003.eqiad.wmnet |
[production] |
16:18 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db-test1003.eqiad.wmnet with OS trixie |
[production] |
16:17 |
<swfrench@deploy2002> |
helmfile [staging] DONE helmfile.d/services/api-gateway: apply |
[production] |
16:16 |
<swfrench@deploy2002> |
helmfile [staging] START helmfile.d/services/api-gateway: apply |
[production] |
16:12 |
<mutante> |
rebooting phab2002 |
[production] |
16:11 |
<dzahn@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2002.codfw.wmnet with reason: reboot |
[production] |
16:04 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db-test1003.eqiad.wmnet with reason: host reimage |
[production] |
16:03 |
<mutante> |
CI should be back in operation as normal |
[production] |
15:57 |
<mutante> |
rebooting main CI server - integration.wikimedia.org will be down for a minute |
[production] |
15:57 |
<fceratto@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db-test1003.eqiad.wmnet with reason: host reimage |
[production] |
15:56 |
<dzahn@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on contint1002.wikimedia.org with reason: reboot |
[production] |
15:50 |
<dzahn@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on contint2002.wikimedia.org with reason: reboot |
[production] |
15:50 |
<mutante> |
contint2002 - rebooting - (not the manager host) |
[production] |
15:47 |
<fceratto@cumin1002> |
START - Cookbook sre.hosts.reimage for host db-test1003.eqiad.wmnet with OS trixie |
[production] |
15:46 |
<swfrench-wmf> |
rolling run-puppet-agent on A:cp hosts - T405955 |
[production] |
15:33 |
<swfrench-wmf> |
disable-puppet on A:cp hosts - T405955 |
[production] |
15:30 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1003.eqiad.wmnet - fceratto@cumin1002" |
[production] |
15:30 |
<fceratto@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM db-test1003.eqiad.wmnet - fceratto@cumin1002" |
[production] |
15:30 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db-test1003.eqiad.wmnet on all recursors |
[production] |
15:30 |
<fceratto@cumin1002> |
START - Cookbook sre.dns.wipe-cache db-test1003.eqiad.wmnet on all recursors |
[production] |
15:30 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:30 |
<fceratto@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1003.eqiad.wmnet - fceratto@cumin1002" |
[production] |
15:21 |
<fceratto@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM db-test1003.eqiad.wmnet - fceratto@cumin1002" |
[production] |
15:20 |
<moritzm> |
installing jq security updates |
[production] |
15:17 |
<herron@cumin1002> |
END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad |
[production] |
15:05 |
<fceratto@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
15:05 |
<fceratto@cumin1002> |
START - Cookbook sre.ganeti.makevm for new host db-test1003.eqiad.wmnet |
[production] |
15:04 |
<brennen@deploy2002> |
Finished deploy [phabricator/deployment@16c9739]: deploy phab1004 for T407244 (duration: 00m 58s) |
[production] |
15:03 |
<brennen@deploy2002> |
Started deploy [phabricator/deployment@16c9739]: deploy phab1004 for T407244 |
[production] |
15:03 |
<brennen@deploy2002> |
Finished deploy [phabricator/deployment@16c9739]: deploy phab2002 for T407244 (duration: 00m 31s) |
[production] |
15:02 |
<brennen@deploy2002> |
Started deploy [phabricator/deployment@16c9739]: deploy phab2002 for T407244 |
[production] |
14:58 |
<arnaudb@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:45:00 on phab2002.codfw.wmnet,phab[1004-1005].eqiad.wmnet with reason: T407244 |
[production] |
14:51 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply |
[production] |