2025-05-21
§
|
13:50 |
<elukey@puppetserver1001> |
conftool action : set/pooled=yes:weight=1; selector: name=ml-serve1001.eqiad.wmnet,dc=eqiad,cluster=maps,service=inference |
[production] |
13:49 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS bullseye |
[production] |
13:49 |
<btullis@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1018.eqiad.wmnet with OS bullseye |
[production] |
13:48 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet |
[production] |
13:48 |
<elukey@cumin1002> |
START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet |
[production] |
13:48 |
<sukhe@cumin1002> |
START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and not P{dns7001*} and A:dnsbox |
[production] |
13:47 |
<sukhe@cumin1002> |
START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough |
[production] |
13:43 |
<sukhe> |
updating dns-root-data on A:wikidough |
[production] |
13:42 |
<akosiaris@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply |
[production] |
13:42 |
<akosiaris@deploy1003> |
helmfile [codfw] START helmfile.d/services/eventgate-main: apply |
[production] |
13:41 |
<akosiaris@deploy1003> |
helmfile [staging] DONE helmfile.d/services/eventgate-main: apply |
[production] |
13:41 |
<akosiaris@deploy1003> |
helmfile [staging] START helmfile.d/services/eventgate-main: apply |
[production] |
13:41 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1001.eqiad.wmnet with OS bookworm |
[production] |
13:41 |
<sukhe> |
updating dns-root-data on A:dnsbox |
[production] |
13:40 |
<akosiaris@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply |
[production] |
13:39 |
<akosiaris@deploy1003> |
helmfile [eqiad] START helmfile.d/services/eventgate-main: apply |
[production] |
13:37 |
<akosiaris> |
deploy eventgate-main to pickup the CPU change as well as the change in envoy histogram buckets |
[production] |
13:37 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet |
[production] |
13:36 |
<vgutierrez> |
enabling edge uniques on cp4045 - T391411 |
[production] |
13:26 |
<reedy@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1148398|Revert^2 "extension-list: Add ConfirmEdit/hCaptcha/extension.json" (T382148 T394814)]] |
[production] |
13:25 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet |
[production] |
13:24 |
<reedy@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1148859|Stop setting $wgCaptchaClass in extension.json files (T394814)]], [[gerrit:1148860|Stop setting $wgCaptchaClass in extension.json files (T394814)]], [[gerrit:1148830|Add mediawiki.ForeignApi.core as a dependency (T387720)]] (duration: 10m 52s) |
[production] |
13:24 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage |
[production] |
13:21 |
<elukey@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage |
[production] |
13:17 |
<reedy@deploy1003> |
reedy, stran: Continuing with sync |
[production] |
13:17 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "import new switches from netbox to hiera now they are status active - cmooney@cumin1003 - T394021" |
[production] |
13:16 |
<cmooney@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "import new switches from netbox to hiera now they are status active - cmooney@cumin1003 - T394021" |
[production] |
13:15 |
<reedy@deploy1003> |
reedy, stran: Backport for [[gerrit:1148859|Stop setting $wgCaptchaClass in extension.json files (T394814)]], [[gerrit:1148860|Stop setting $wgCaptchaClass in extension.json files (T394814)]], [[gerrit:1148830|Add mediawiki.ForeignApi.core as a dependency (T387720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
13:13 |
<reedy@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1148859|Stop setting $wgCaptchaClass in extension.json files (T394814)]], [[gerrit:1148860|Stop setting $wgCaptchaClass in extension.json files (T394814)]], [[gerrit:1148830|Add mediawiki.ForeignApi.core as a dependency (T387720)]] |
[production] |
13:04 |
<elukey@cumin1002> |
START - Cookbook sre.hosts.reimage for host ml-serve1001.eqiad.wmnet with OS bookworm |
[production] |
12:50 |
<jmm@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mc-misc2002.codfw.wmnet |
[production] |
12:49 |
<topranks> |
test new core_out bgp policy on asw1-bw27-esams (T394530) |
[production] |
12:48 |
<pmiazga> |
Ran fixStuckGlobalRename.php for T394905 |
[production] |
12:34 |
<brouberol@cumin2002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1018.eqiad.wmnet with OS bullseye |
[production] |
12:30 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS bullseye |
[production] |
11:59 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2036 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P76367 and previous config saved to /var/cache/conftool/dbconfig/20250521-115950-root.json |
[production] |
11:56 |
<cmooney@dns2005> |
END - running authdns-update |
[production] |
11:55 |
<cmooney@dns2005> |
START - running authdns-update |
[production] |
11:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2036 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P76366 and previous config saved to /var/cache/conftool/dbconfig/20250521-114444-root.json |
[production] |
11:37 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
11:37 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: push IPv6 address changes for codfw expansion link networks - cmooney@cumin1002" |
[production] |
11:37 |
<cmooney@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: push IPv6 address changes for codfw expansion link networks - cmooney@cumin1002" |
[production] |
11:35 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-eqiad |
[production] |
11:33 |
<jmm@cumin2002> |
START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-eqiad |
[production] |
11:33 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet |
[production] |
11:33 |
<cmooney@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
11:29 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2036 (re)pooling @ 60%: Repooling', diff saved to https://phabricator.wikimedia.org/P76365 and previous config saved to /var/cache/conftool/dbconfig/20250521-112939-root.json |
[production] |
11:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2036 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P76364 and previous config saved to /var/cache/conftool/dbconfig/20250521-111433-root.json |
[production] |
11:02 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet |
[production] |
11:01 |
<jgiannelos@deploy1003> |
helmfile [staging] DONE helmfile.d/services/mobileapps: sync |
[production] |