2025-02-24 §
00:07 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan+apply for main branch [admin]
00:06 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.tofu (exit_code=0) running tofu plan for main branch [admin]
00:06 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan for main branch [admin]
00:06 <andrew@cloudcumin1001> END (ERROR) - Cookbook wmcs.openstack.tofu (exit_code=97) running tofu plan for main branch [admin]
00:06 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.tofu running tofu plan for main branch [admin]
2025-02-23 §
22:00 <andrewbogott> testing [wikiqlever]
21:47 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) [admin]
21:44 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.restart_openstack [admin]
11:49 <elukey@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dse-k8s-ctrl1002.eqiad.wmnet with reason: Avoid extra pages over the weekend [production]
11:47 <elukey@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dse-k8s-ctrl1001.eqiad.wmnet with reason: Avoid extra pages over the weekend [production]
11:39 <elukey> restart kube-apiserver on dse-k8s-ctrl1002 - unit up but errors in the logs [production]
11:29 <elukey> restart kube-apiserver on dse-k8s-ctrl1001 - errors in the logs but unit up and running [production]
2025-02-22 §
15:31 <wmbot~bsadowski1@tools-bastion-13> Restarted StewardBot/SULWatcher because of a connection loss [tools.stewardbots]
12:27 <wmbot~maurelio@tools-bastion-13> Update mabot to latest pywikibot-core stable 62fbca0 [tools.mabot]
11:27 <taavi> rebooting integration-agent-docker-1047 which thinks it is gerrit [releng]
2025-02-21 §
22:57 <mutante> puppetmaster-1004 - apt-get remove --purge puppetserver; run-puppet-agent [devtools]
22:54 <brennen> gitlab: removing expiration date for 14 tokens expiring in 2025 (T385930) [releng]
22:36 <brennen> gitlab: set require_personal_access_token_expiry and service_access_tokens_expiration_enforced to false [releng]
22:31 <jhathaway@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2088.codfw.wmnet with reason: T381919 [production]
21:15 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in relforge [production]
21:15 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Unbanning all hosts in relforge [production]
20:31 <jhathaway@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2088.codfw.wmnet with reason: T381919 [production]
20:01 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2075.codfw.wmnet with OS bullseye [production]
20:00 <wmbot~lucaswerkmeister@tools-bastion-13> deployed 81611bc5dc (l10n updates: pa, tr) [tools.lexeme-forms]
19:17 <jhancock@cumin2002> START - Cookbook sre.hosts.reimage for host ms-be2075.codfw.wmnet with OS bullseye [production]
19:01 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2075.codfw.wmnet with OS bullseye [production]
18:51 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: relforge1004* for test ability to ban opensearch node - bking@cumin2002 - T387030 [production]
18:51 <bking@cumin2002> START - Cookbook sre.elasticsearch.ban Banning hosts: relforge1004* for test ability to ban opensearch node - bking@cumin2002 - T387030 [production]
18:34 <xcollazo> Deployed latest DAGs for the analytics Airflow instance. T387033. [analytics]
18:34 <xcollazo@deploy2002> Finished deploy [airflow-dags/analytics@60223e2]: Deploying latest DAGs for the analytics Airflow instance. T387033. (duration: 00m 45s) [production]
18:33 <xcollazo@deploy2002> Started deploy [airflow-dags/analytics@60223e2]: Deploying latest DAGs for the analytics Airflow instance. T387033. [production]
18:11 <inflatador> bking@apt1002:~$ sudo -E reprepro --ignore=wrongdistribution -C component/opensearch13 include bullseye-wikimedia $HOME/madvise-pkg/opensearch-madvise_0.1_amd64.changes T387030 [production]
17:41 <jhancock@cumin2002> START - Cookbook sre.hosts.reimage for host ms-be2075.codfw.wmnet with OS bullseye [production]
17:25 <jhathaway@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2088.codfw.wmnet with reason: T381919 [production]
17:13 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be2075'] [production]
17:12 <jhancock@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2075'] [production]
17:12 <jhancock@cumin2002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2075.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
16:47 <jhancock@cumin2002> START - Cookbook sre.hosts.provision for host ms-be2075.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
16:34 <Amir1> started replication on pc2017 and pc2015 (T387032) [production]
16:01 <jynus@cumin1002> dbctl commit (dc=all): 'Repool pc5 & pc7', diff saved to https://phabricator.wikimedia.org/P73504 and previous config saved to /var/cache/conftool/dbconfig/20250221-160158-jynus.json [production]
15:36 <jynus@cumin1002> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc[2015,2017].codfw.wmnet [production]
15:36 <jynus@cumin1002> START - Cookbook sre.hosts.remove-downtime for pc[2015,2017].codfw.wmnet [production]
15:27 <jynus> restarted (kill -9) mariadb @ pc2015,pc2017 T387032 [production]
15:19 <jynus> restarting mariadb @ pc2015 [production]
15:15 <jynus@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on pc[2015,2017].codfw.wmnet with reason: processes stuck [production]
14:51 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depool pc5', diff saved to https://phabricator.wikimedia.org/P73503 and previous config saved to /var/cache/conftool/dbconfig/20250221-145110-ladsgroup.json [production]
14:48 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depool pc7', diff saved to https://phabricator.wikimedia.org/P73502 and previous config saved to /var/cache/conftool/dbconfig/20250221-144852-ladsgroup.json [production]
14:21 <dcausse@deploy2002> helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
14:21 <dcausse@deploy2002> helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply [production]
14:18 <dcausse@deploy2002> helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]