2024-08-07
16:02 |
<dcaro@cloudcumin1001> |
START - Cookbook wmcs.ceph.osd.undrain_node |
[admin] |
16:01 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Openjdk upgrade - elukey@cumin1002 |
[production] |
15:57 |
<andrew@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1038.eqiad.wmnet with reason: host reimage |
[production] |
15:54 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1038.eqiad.wmnet with reason: host reimage |
[production] |
15:41 |
<wmbot~bd808@tools-bastion-12> |
Installed a hacked up version of mwclient from git+https://github.com/bd808/mwclient@2e9cd61d90738fbf9ee64ca2c1766a9095c24699 to work around T371977 |
[tools.stashbot] |
15:41 |
<dcaro@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=99) |
[admin] |
15:41 |
<dcaro@cloudcumin1001> |
START - Cookbook wmcs.ceph.osd.undrain_node |
[admin] |
15:40 |
<bd808> |
hello world |
[tools.stashbot] |
15:40 |
<brett> |
stop pybal on lvs2013 for server reboot |
[production] |
15:39 |
<bd808> |
hello world |
[tools.stashbot] |
15:39 |
<dcaro@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=97) |
[admin] |
15:39 |
<bd808> |
hello world |
[tools.stashbot] |
15:37 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcephosd1038.eqiad.wmnet with OS bullseye |
[production] |
15:36 |
<andrew@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1038.eqiad.wmnet with OS bullseye |
[production] |
15:30 |
<dcaro@cloudcumin1001> |
START - Cookbook wmcs.ceph.osd.depool_and_destroy |
[admin] |
15:25 |
<kevinbazira@deploy1003> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' . |
[production] |
15:21 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcephosd1038.eqiad.wmnet with OS bullseye |
[production] |
15:15 |
<kevinbazira@deploy1003> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
15:12 |
<wmbot~dcaro@urcuchillay> |
START - Cookbook wmcs.ceph.wait_for_rebalance |
[admin] |
15:10 |
<hashar> |
deployment-prep: fix permissions on puppet server: sudo chown gitpuppet:gitpuppet /srv/git/operations/puppet/modules/karapace # T371982 |
[releng] |
15:06 |
<wmbot~dcaro@urcuchillay> |
END (FAIL) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=99) |
[admin] |
15:04 |
<wmbot~dcaro@urcuchillay> |
START - Cookbook wmcs.ceph.osd.undrain_node |
[admin] |
15:02 |
<wmbot~dcaro@urcuchillay> |
END (FAIL) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=99) |
[admin] |
15:02 |
<wmbot~dcaro@urcuchillay> |
START - Cookbook wmcs.ceph.osd.undrain_node |
[admin] |
14:58 |
<sukhe> |
start pybal on lvs3008 |
[production] |
14:53 |
<sukhe@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs3008.esams.wmnet |
[production] |
14:50 |
<sukhe@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host lvs3008.esams.wmnet |
[production] |
14:48 |
<bd808> |
hello world |
[tools.stashbot] |
14:46 |
<hashar> |
deployment-prep: removed git tags from 2021, 2022, 2023 in /srv/git/operations/puppet # T371982 |
[releng] |
14:33 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Openjdk upgrade - elukey@cumin1002 |
[production] |
14:26 |
<jnuche@deploy1003> |
Finished deploy [releng/jenkins-deploy@9b733de] (releasing): (no justification provided) (duration: 01m 12s) |
[production] |
14:25 |
<jnuche@deploy1003> |
Started deploy [releng/jenkins-deploy@9b733de] (releasing): (no justification provided) |
[production] |
14:24 |
<sukhe> |
sudo cumin "lvs3008*" 'disable-puppet "rebooting" && systemctl stop pybal.service' |
[production] |
14:22 |
<jnuche@deploy1003> |
Finished deploy [releng/jenkins-deploy@9b733de] (releasing): (no justification provided) (duration: 00m 53s) |
[production] |
14:21 |
<jnuche@deploy1003> |
Started deploy [releng/jenkins-deploy@9b733de] (releasing): (no justification provided) |
[production] |
14:18 |
<wmbot~dcaro@urcuchillay> |
END (FAIL) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=99) (T363344) |
[admin] |
14:07 |
<wmbot~dcaro@urcuchillay> |
START - Cookbook wmcs.ceph.osd.bootstrap_and_add (T363344) |
[admin] |
14:05 |
<wmbot~dcaro@urcuchillay> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-15, tools-k8s-worker-nfs-42, tools-k8s-worker-nfs-8, tools-k8s-worker-nfs-12, tools-k8s-worker-nfs-21, tools-k8s-worker-nfs-38, tools-k8s-worker-nfs-47, tools-k8s-worker-nfs-55, tools-k8s-worker-nfs-43 |
[tools] |
14:04 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
14:03 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
14:01 |
<elukey> |
import Jenkins 2.462.1 on bullseye-wikimedia:thirdparty/ci |
[production] |
13:55 |
<sukhe> |
start pybal on lvs3009 |
[production] |
13:54 |
<sukhe@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs3009.esams.wmnet |
[production] |
13:51 |
<sukhe@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host lvs3009.esams.wmnet |
[production] |
13:46 |
<dcaro@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1037.eqiad.wmnet with OS bullseye |
[production] |
13:43 |
<hnowlan@deploy1003> |
Finished scap: sync to test mw-jobrunner resource increase (duration: 02m 22s) |
[production] |
13:42 |
<hnowlan@deploy1003> |
Started scap sync-world: sync to test mw-jobrunner resource increase |
[production] |
13:39 |
<filippo@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply |
[production] |
13:39 |
<filippo@deploy1003> |
helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply |
[production] |
13:39 |
<filippo@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply |
[production] |