2025-09-21
19:58 |
<wmbot~jeanfred@tools-bastion-15> |
Reloaded SQL table configuration for 0fe6c07 (T346681) |
[tools.heritage] |
18:40 |
<ryankemper> |
T395772 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/1189979 to fix puppet failures on deploy servers |
[production] |
18:20 |
<ryankemper> |
[WDQS] Restarted `wdqs-blazegraph` on `wdqs2009` to restore service to https://query-legacy-full.wikidata.org/ |
[production] |
18:15 |
<ryankemper@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on wdqs[2009,2016].codfw.wmnet,wdqs[1018-1020].eqiad.wmnet with reason: T395772 |
[production] |
13:49 |
<wmbot~peterbowman@tools-bastion-14> |
Fix NKJP servlet and legacy SGE-era links, bump MySQL Connector/J |
[tools.pbbot] |
09:17 |
<wmbot~dcaro@acme> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-37, tools-k8s-worker-nfs-2 |
[tools] |
09:02 |
<wmbot~dcaro@acme> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-37, tools-k8s-worker-nfs-2 |
[tools] |
03:16 |
<dcaro> |
acking and silencing CPU capacity alerts, to be handled on Monday; they should not page |
[tools] |
01:46 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=0) for a worker role in the tools cluster |
[tools] |
01:46 |
<andrew@cloudcumin1001> |
Added a new k8s worker tools-k8s-worker-113.tools.eqiad1.wikimedia.cloud to the cluster |
[tools] |
01:36 |
<andrewbogott> |
adding additional worker node in response to repeated capacity alerts |
[tools] |
01:35 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the tools cluster |
[tools] |
01:01 |
<mwpresync@deploy1003> |
Finished scap build-images: Publishing wmf/next image (duration: 01m 02s) |
[production] |
01:00 |
<mwpresync@deploy1003> |
Started scap build-images: Publishing wmf/next image |
[production] |
2025-09-19
21:38 |
<wmbot~jeanfred@tools-bastion-15> |
Load altered jobs.yml so that update-monuments runs on py39 |
[tools.heritage] |
21:36 |
<wmbot~jeanfred@tools-bastion-15> |
Recreate check-emailable-users job for WLM 2025 with py39 image |
[tools.heritage] |
21:36 |
<wmbot~jeanfred@tools-bastion-15> |
Wiped out the py37 venv and recreated a py39 one |
[tools.heritage] |
18:35 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'. |
[production] |
18:07 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "remove sretest2009 - cmooney@cumin1003" |
[production] |
18:07 |
<cmooney@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "remove sretest2009 - cmooney@cumin1003" |
[production] |
17:59 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:57 |
<cmooney@cumin1003> |
START - Cookbook sre.dns.netbox |
[production] |
17:56 |
<cmooney@cumin1003> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2009.codfw.wmnet |
[production] |
17:56 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:56 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1003" |
[production] |
17:56 |
<cmooney@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1003" |
[production] |
17:51 |
<cmooney@cumin1003> |
START - Cookbook sre.dns.netbox |
[production] |
17:48 |
<cmooney@cumin1003> |
START - Cookbook sre.hosts.decommission for hosts sretest2009.codfw.wmnet |
[production] |
17:36 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "force sync to remove sretest2009 - cmooney@cumin1003" |
[production] |
17:34 |
<cmooney@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "force sync to remove sretest2009 - cmooney@cumin1003" |
[production] |
17:16 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Set s1 to RW', diff saved to https://phabricator.wikimedia.org/P83443 and previous config saved to /var/cache/conftool/dbconfig/20250919-171624-ladsgroup.json |
[production] |
17:12 |
<jhathaway@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie |
[production] |
17:12 |
<jhathaway@cumin2002> |
START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie |
[production] |
17:09 |
<cmooney@cumin1003> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS bookworm |
[production] |
17:04 |
<taavi@cumin1003> |
dbctl commit (dc=all): 'set s1 ro', diff saved to https://phabricator.wikimedia.org/P83441 and previous config saved to /var/cache/conftool/dbconfig/20250919-170402-taavi.json |
[production] |
17:02 |
<cmooney@cumin1003> |
START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS bookworm |
[production] |
16:56 |
<jhathaway@cumin2002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART |
[production] |
16:54 |
<cmooney@cumin1003> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2009.codfw.wmnet with OS bookworm |
[production] |