2026-07-01
11:36 <mvolz@deploy1003> helmfile [eqiad] DONE helmfile.d/services/zotero: apply [production]
11:36 <mvolz@deploy1003> helmfile [eqiad] START helmfile.d/services/zotero: apply [production]
11:31 <cwilliams@cumin1003> START - Cookbook sre.mysql.pool pool db2240: Migration of db2240.codfw.wmnet completed [production]
11:30 <mvolz@deploy1003> helmfile [staging] DONE helmfile.d/services/zotero: apply [production]
11:28 <mvolz@deploy1003> helmfile [staging] START helmfile.d/services/zotero: apply [production]
11:27 <mvolz@deploy1003> helmfile [eqiad] DONE helmfile.d/services/citoid: apply [production]
11:27 <cmooney@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
11:27 <cmooney@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link IP dns for trasnport circuits to magru - cmooney@cumin1003" [production]
11:27 <cmooney@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link IP dns for trasnport circuits to magru - cmooney@cumin1003" [production]
11:27 <mvolz@deploy1003> helmfile [eqiad] START helmfile.d/services/citoid: apply [production]
11:23 <cmooney@cumin1003> START - Cookbook sre.dns.netbox [production]
11:21 <cwilliams@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2240.codfw.wmnet with OS trixie [production]
11:20 <cmooney@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
11:20 <cmooney@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link IP dns for trasnport circuits to magru - cmooney@cumin1003" [production]
11:17 <cmooney@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new link IP dns for trasnport circuits to magru - cmooney@cumin1003" [production]
11:16 <mvolz@deploy1003> helmfile [codfw] DONE helmfile.d/services/citoid: apply [production]
11:16 <mvolz@deploy1003> helmfile [codfw] START helmfile.d/services/citoid: apply [production]
11:15 <atsuko@cumin2003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch2086.codfw.wmnet with OS trixie [production]
11:14 <atsuko@cumin2003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch2106.codfw.wmnet with OS trixie [production]
11:14 <mvolz@deploy1003> helmfile [staging] DONE helmfile.d/services/citoid: apply [production]
11:13 <mvolz@deploy1003> helmfile [staging] START helmfile.d/services/citoid: apply [production]
11:12 <cmooney@cumin1003> START - Cookbook sre.dns.netbox [production]
11:09 <atsuko@cumin2003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch2115.codfw.wmnet with OS trixie [production]
11:04 <cwilliams@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2240.codfw.wmnet with reason: host reimage [production]
11:00 <cwilliams@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on db2240.codfw.wmnet with reason: host reimage [production]
10:53 <atsuko@cumin2003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch2106.codfw.wmnet with reason: host reimage [production]
10:49 <atsuko@cumin2003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch2086.codfw.wmnet with reason: host reimage [production]
10:44 <atsuko@cumin2003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch2115.codfw.wmnet with reason: host reimage [production]
10:44 <cwilliams@cumin1003> START - Cookbook sre.hosts.reimage for host db2240.codfw.wmnet with OS trixie [production]
10:44 <atsuko@cumin2003> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2086.codfw.wmnet with reason: host reimage [production]
10:42 <atsuko@cumin2003> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2106.codfw.wmnet with reason: host reimage [production]
10:41 <cwilliams@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2240: Upgrading db2240.codfw.wmnet [production]
10:41 <cwilliams@cumin1003> START - Cookbook sre.mysql.depool depool db2240: Upgrading db2240.codfw.wmnet [production]
10:40 <cwilliams@cumin1003> START - Cookbook sre.mysql.major-upgrade [production]
10:40 <cwilliams@cumin1003> dbmaint on s4@codfw T429893 [production]
10:39 <atsuko@cumin2003> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2115.codfw.wmnet with reason: host reimage [production]
10:26 <cwilliams@cumin1003> dbctl commit (dc=all): 'Depool db2240 T430127', diff saved to https://phabricator.wikimedia.org/P94653 and previous config saved to /var/cache/conftool/dbconfig/20260701-102658-cwilliams.json [production]
10:26 <moritzm> installing nginx security updates [production]
10:26 <atsuko@cumin2003> START - Cookbook sre.hosts.reimage for host cirrussearch2086.codfw.wmnet with OS trixie [production]
10:23 <cwilliams@cumin1003> dbctl commit (dc=all): 'Promote db2179 to s4 primary T430127', diff saved to https://phabricator.wikimedia.org/P94652 and previous config saved to /var/cache/conftool/dbconfig/20260701-102356-cwilliams.json [production]
10:23 <cezmunsta> Starting s4 codfw failover from db2240 to db2179 - T430127 [production]
10:23 <atsuko@cumin2003> START - Cookbook sre.hosts.reimage for host cirrussearch2106.codfw.wmnet with OS trixie [production]
10:20 <atsuko@cumin2003> START - Cookbook sre.hosts.reimage for host cirrussearch2115.codfw.wmnet with OS trixie [production]
10:15 <cwilliams@cumin1003> dbctl commit (dc=all): 'Set db2179 with weight 0 T430127', diff saved to https://phabricator.wikimedia.org/P94651 and previous config saved to /var/cache/conftool/dbconfig/20260701-101531-cwilliams.json [production]
10:15 <cwilliams@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 T430127 [production]
09:56 <oblivian@cumin1003> END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix template (take 2) - oblivian@cumin1003" [production]
09:56 <oblivian@cumin1003> END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix template (take 2) - oblivian@cumin1003 [production]
09:55 <oblivian@cumin1003> START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix template (take 2) - oblivian@cumin1003 [production]
09:55 <oblivian@cumin1003> START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix template (take 2) - oblivian@cumin1003" [production]
09:51 <bwojtowicz@deploy1003> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' . [production]