2025-08-14 §
12:12 <moritzm> installing PHP 7.4 security updates [production]
12:03 <fceratto@cumin1002> END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2142.codfw.wmnet [production]
11:55 <fceratto@cumin1002> START - Cookbook sre.mysql.upgrade for db2142.codfw.wmnet [production]
11:45 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-backup-datanode1001.eqiad.wmnet with OS bookworm [production]
11:45 <btullis@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003" [production]
11:41 <btullis@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003" [production]
11:33 <logmsgbot> mszabo Deployed security patch for T280413 [production]
11:28 <fceratto@cumin1002> END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for db2142.codfw.wmnet [production]
11:28 <fceratto@cumin1002> END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) db2142 - Upgrading db2142.codfw.wmnet [production]
11:28 <fceratto@cumin1002> START - Cookbook sre.mysql.depool db2142 - Upgrading db2142.codfw.wmnet [production]
11:27 <fceratto@cumin1002> START - Cookbook sre.mysql.upgrade for db2142.codfw.wmnet [production]
11:23 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-backup-datanode1001.eqiad.wmnet with reason: host reimage [production]
11:23 <taavi> copy thanos package to trixie-wikimedia T401813 [production]
11:19 <btullis@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on an-backup-datanode1001.eqiad.wmnet with reason: host reimage [production]
11:13 <moritzm> installing openssl security updates [production]
11:08 <fceratto@cumin1002> END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) [production]
11:08 <fceratto@cumin1002> START - Cookbook sre.mysql.parsercache [production]
10:55 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db2219 (T399249)', diff saved to https://phabricator.wikimedia.org/P81350 and previous config saved to /var/cache/conftool/dbconfig/20250814-105514-fceratto.json [production]
10:55 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2219.codfw.wmnet with reason: Maintenance [production]
10:54 <btullis@cumin1003> START - Cookbook sre.hosts.reimage for host an-backup-datanode1001.eqiad.wmnet with OS bookworm [production]
10:49 <btullis@cumin1003> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-backup-datanode1001.eqiad.wmnet with OS bookworm [production]
10:37 <btullis@cumin1003> START - Cookbook sre.hosts.reimage for host an-backup-datanode1001.eqiad.wmnet with OS bookworm [production]
10:04 <btullis@cumin1003> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-backup-datanode1001.eqiad.wmnet with OS bookworm [production]
10:00 <mvernon@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1005.eqiad.wmnet with OS bullseye [production]
09:43 <mvernon@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage [production]
09:43 <moritzm> installing Java 17 security updates [production]
09:41 <cgoubert@deploy1003> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
09:39 <cgoubert@deploy1003> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
09:39 <cgoubert@deploy1003> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
09:39 <mvernon@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage [production]
09:38 <cgoubert@deploy1003> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
09:17 <mvernon@cumin1003> START - Cookbook sre.hosts.reimage for host thanos-be1005.eqiad.wmnet with OS bullseye [production]
09:07 <moritzm> installing Java 8 security updates on Bullseye [production]
08:59 <moritzm> uploaded openjdk-8 8u462-ga-1 to bullseye-wikimedia (backport of latest Java 8 security fixes) [production]
08:44 <jmm@cumin2002> END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad [production]
08:34 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production [production]
08:32 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production [production]
08:31 <XioNoX> lsw1-d2-codfw> restart jsd gracefully [production]
08:30 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging [production]
08:27 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging [production]
08:26 <jmm@cumin2002> START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad [production]
08:26 <moritzm> installing Java 8 security updates on kafka-test* [production]
07:50 <gkyziridis@deploy1003> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . [production]
07:49 <gkyziridis@deploy1003> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' . [production]
07:19 <gkyziridis@deploy1003> Finished scap sync-world: Backport for [[gerrit:1177446|ores-extension: add threshold for revertrisk in enwiki (T400590)]] (duration: 12m 07s) [production]
07:14 <gkyziridis@deploy1003> gkyziridis, isaranto: Continuing with sync [production]
07:09 <ladsgroup@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [production]
07:09 <gkyziridis@deploy1003> gkyziridis, isaranto: Backport for [[gerrit:1177446|ores-extension: add threshold for revertrisk in enwiki (T400590)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
07:09 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1252 (T400854)', diff saved to https://phabricator.wikimedia.org/P81344 and previous config saved to /var/cache/conftool/dbconfig/20250814-070932-ladsgroup.json [production]
07:07 <gkyziridis@deploy1003> Started scap sync-world: Backport for [[gerrit:1177446|ores-extension: add threshold for revertrisk in enwiki (T400590)]] [production]