2025-02-24
10:34 <hashar@deploy2002> Started deploy [integration/docroot@59d9e3f]: update links to microsites source code - T300171 [production]
10:12 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply [production]
10:12 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply [production]
10:07 <godog> set grafana thanos datasource interval to 30s - T371102 [production]
09:57 <cmooney@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on cr2-magru with reason: IBGP instability from cr1 to cr2 in magru causing ping failures from alert1002 [production]
09:39 <brouberol@dns1004> END - running authdns-update [production]
09:38 <jynus@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2184.codfw.wmnet with reason: Upgrade and rebuild tables [production]
09:38 <brouberol@dns1004> START - running authdns-update [production]
09:37 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
09:37 <jynus@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2183.codfw.wmnet with reason: Upgrade and rebuild tables [production]
09:36 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
09:32 <urbanecm> Start GrowthExperiments:revalidateLinkRecommendations.php for frwiki, eswiki, ptwiki and idwiki (T385780) [production]
09:24 <XioNoX> cloudsw2-d5-eqiad> restart analytics-agent gracefully - T387018 [production]
09:12 <jynus@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2199.codfw.wmnet with reason: Upgrade and rebuild tables [production]
09:08 <urbanecm> Deployed security patch for T386963 [production]
08:52 <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1121600|revalidateLinkRecommendations: Initialize $allowedChecksums (T387001)]] (duration: 22m 02s) [production]
08:30 <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1121600|revalidateLinkRecommendations: Initialize $allowedChecksums (T387001)]] [production]
01:02 <jeena> Updating development images on contint primary for https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/73 [releng]
00:24 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch [admin]
00:22 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch [admin]
00:08 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan+apply for main branch [admin]
00:07 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch [admin]
00:06 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for main branch [admin]
00:06 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan for main branch [admin]
00:06 <andrew@cloudcumin1001> END (ERROR) - Cookbook wmcs.openstack.tofu (exit_code=97) running tofu plan for main branch [admin]
00:06 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan for main branch [admin]
2025-02-23
22:00 <andrewbogott> testing [wikiqlever]
21:47 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) [admin]
21:44 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.restart_openstack [admin]
11:49 <elukey@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dse-k8s-ctrl1002.eqiad.wmnet with reason: Avoid extra pages over the weekend [production]
11:47 <elukey@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dse-k8s-ctrl1001.eqiad.wmnet with reason: Avoid extra pages over the weekend [production]
11:39 <elukey> restart kube-apiserver on dse-k8s-ctrl1002 - unit up but errors in the logs [production]
11:29 <elukey> restart kube-apiserver on dse-k8s-ctrl1001 - errors in the logs but unit up and running [production]
2025-02-22
15:31 <wmbot~bsadowski1@tools-bastion-13> Restarted StewardBot/SULWatcher because of a connection loss [tools.stewardbots]
12:27 <wmbot~maurelio@tools-bastion-13> Update mabot to latest pywikibot-core stable 62fbca0 [tools.mabot]
11:27 <taavi> rebooting integration-agent-docker-1047 which thinks it is gerrit [releng]
2025-02-21
22:57 <mutante> puppetmaster-1004 - apt-get remove --purge puppetserver; run-puppet-agent [devtools]
22:54 <brennen> gitlab: removing expiration date for 14 tokens expiring in 2025 (T385930) [releng]
22:36 <brennen> gitlab: set require_personal_access_token_expiry and service_access_tokens_expiration_enforced to false [releng]
22:31 <jhathaway@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2088.codfw.wmnet with reason: T381919 [production]
21:15 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in relforge [production]
21:15 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in relforge [production]
20:31 <jhathaway@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2088.codfw.wmnet with reason: T381919 [production]
20:01 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2075.codfw.wmnet with OS bullseye [production]
20:00 <wmbot~lucaswerkmeister@tools-bastion-13> deployed 81611bc5dc (l10n updates: pa, tr) [tools.lexeme-forms]
19:17 <jhancock@cumin2002> START - Cookbook sre.hosts.reimage for host ms-be2075.codfw.wmnet with OS bullseye [production]
19:01 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2075.codfw.wmnet with OS bullseye [production]
18:51 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: relforge1004* for test ability to ban opensearch node - bking@cumin2002 - T387030 [production]
18:51 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Banning hosts: relforge1004* for test ability to ban opensearch node - bking@cumin2002 - T387030 [production]
18:34 <xcollazo> Deployed latest DAGs for the analytics Airflow instance. T387033. [analytics]