2025-03-27 §
16:44 <brett@cumin2002> START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P{cp2028.codfw.wmnet} and A:cp [production]
16:44 <brett@cumin2002> START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P{cp2027.codfw.wmnet} and A:cp [production]
16:44 <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1131768|LoginAttemptCounter: Add extra hardening for long period too]], [[gerrit:1131769|LoginAttemptCounter: Add extra hardening for long period too]] (duration: 16m 33s) [production]
16:43 <dcaro@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20 days, 0:00:00 on cloudcephosd1029.eqiad.wmnet with reason: Installing a disk for testing [production]
16:42 <jclark@cumin1002> START - Cookbook sre.hosts.provision for host elastic1124.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART [production]
16:39 <marostegui@cumin1002> END (ERROR) - Cookbook sre.mysql.clone (exit_code=97) of db1211.eqiad.wmnet onto db1255.eqiad.wmnet [production]
16:37 <jclark@cumin1002> START - Cookbook sre.hosts.reimage for host elastic1123.eqiad.wmnet with OS bullseye [production]
16:37 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
16:34 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1131768|LoginAttemptCounter: Add extra hardening for long period too]], [[gerrit:1131769|LoginAttemptCounter: Add extra hardening for long period too]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
16:27 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1131768|LoginAttemptCounter: Add extra hardening for long period too]], [[gerrit:1131769|LoginAttemptCounter: Add extra hardening for long period too]] [production]
16:25 <jgiannelos@deploy1003> Finished deploy [restbase/deploy@3349f02]: Deprecate unused RB codebase (duration: 19m 23s) [production]
16:24 <dancy@deploy1003> Sync cancelled. [production]
16:24 <dancy@deploy1003> dancy: Testing T389830 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
16:23 <dancy@deploy1003> Started scap sync-world: Testing T389830 [production]
16:22 <sergi0> Run `foreachwikiindblist growthexperiments CommunityConfiguration:migrateConfig CommunityUpdates 2.0.3` # T387737 [production]
16:17 <dancy@deploy1003> sync-world aborted: Testing T389830 (duration: 01m 48s) [production]
16:16 <dancy@deploy1003> Started scap sync-world: Testing T389830 [production]
16:14 <sergi0> Run `foreachwikiindblist growthexperiments CommunityConfiguration:setVersionData CommunityUpdates 2.0.2` # T387737 [production]
16:09 <jhancock@cumin2002> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2331 [production]
16:09 <jhancock@cumin2002> START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2331 [production]
16:09 <jhancock@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:09 <jhancock@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2331 to codfw - jhancock@cumin2002" [production]
16:09 <jhancock@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2331 to codfw - jhancock@cumin2002" [production]
16:06 <jgiannelos@deploy1003> Started deploy [restbase/deploy@3349f02]: Deprecate unused RB codebase [production]
16:06 <marostegui@cumin1002> dbctl commit (dc=all): 'db2181 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P74476 and previous config saved to /var/cache/conftool/dbconfig/20250327-160623-root.json [production]
16:06 <dcausse@deploy1003> helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
16:05 <dcausse@deploy1003> helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply [production]
16:05 <jhancock@cumin2002> START - Cookbook sre.dns.netbox [production]
16:00 <elukey> `sudo systemctl restart burrow-jumbo-eqiad.service prometheus-burrow-exporter@jumbo-eqiad.service` on kafkamon1003 - attempt to check if the new kafka lag for benthos-webrequest_live is due to burrow - T390029 [production]
15:57 <ebernhardson@deploy1003> Finished scap sync-world: Backport for [[gerrit:1131359|Move cirrus traffic to eqiad for platform upgrade (T388610)]] (duration: 12m 49s) [production]
15:51 <marostegui@cumin1002> dbctl commit (dc=all): 'db2181 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P74475 and previous config saved to /var/cache/conftool/dbconfig/20250327-155117-root.json [production]
15:50 <ebernhardson@deploy1003> ebernhardson: Continuing with sync [production]
15:49 <ebernhardson@deploy1003> ebernhardson: Backport for [[gerrit:1131359|Move cirrus traffic to eqiad for platform upgrade (T388610)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
15:44 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw [production]
15:44 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw [production]
15:44 <ebernhardson@deploy1003> Started scap sync-world: Backport for [[gerrit:1131359|Move cirrus traffic to eqiad for platform upgrade (T388610)]] [production]
15:44 <otto@deploy1003> helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply [production]
15:44 <otto@deploy1003> helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply [production]
15:36 <marostegui@cumin1002> dbctl commit (dc=all): 'db2181 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P74474 and previous config saved to /var/cache/conftool/dbconfig/20250327-153612-root.json [production]
15:28 <hashar> Restarting Gerrit to raise heap from 32G to 64G (T387223) and to enable pushing notifications to browsers (T389327) [production]
15:28 <ottomata> upgrading eventgate-logging-external to node20 (using new per stream header enrich setting), first testing in staging. - T383814, T387908 [production]
15:26 <dcausse@deploy1003> helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
15:26 <dcausse@deploy1003> helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply [production]
15:21 <marostegui@cumin1002> dbctl commit (dc=all): 'db2181 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P74473 and previous config saved to /var/cache/conftool/dbconfig/20250327-152106-root.json [production]
15:19 <hashar@deploy1003> Finished scap sync-world: Sync patch to PrivateSettings.php and removal of unused configs (Gerrit: 1127930 1127889 1127890 1127886 1125095 1127900 1127898 1127887 1127897 1127888 1127929) (duration: 11m 52s) [production]
15:15 <elukey> update benthos@webrequest-live's config on centrallog nodes to new Kafka topics (haproxy vs varnishkafka) - T390029 [production]
15:07 <hashar@deploy1003> Started scap sync-world: Sync patch to PrivateSettings.php and removal of unused configs (Gerrit: 1127930 1127889 1127890 1127886 1125095 1127900 1127898 1127887 1127897 1127888 1127929) [production]
15:06 <hashar@deploy1003> sync-world aborted: Sync patch to PrivateSettings.php and removal of unused configs (Gerrit: 1127930 1127889 1127890 1127886 1125095 1127900 1127898 1127887 1127897 1127888 1127929) (duration: 00m 16s) [production]
15:06 <btullis@cumin1002> START - Cookbook sre.dns.netbox [production]
15:06 <hashar@deploy1003> Started scap sync-world: Sync patch to PrivateSettings.php and removal of unused configs (Gerrit: 1127930 1127889 1127890 1127886 1125095 1127900 1127898 1127887 1127897 1127888 1127929) [production]