2023-01-20
§
|
23:24 |
<andrewbogott> |
truncating logfiles with find . -name '*.err' -size +1G -exec truncate --size=100M {} \; |
[tools] |
21:24 |
<andrewbogott> |
truncating logfiles with find . -name '*.out' -size +1G -exec truncate --size=100M {} \; |
[tools] |
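For readers who want to preview what the two `find ... -exec truncate` commands above would touch before running them, here is a rough Python equivalent. It is only a sketch: the function name `truncate_large_logs` and its parameters are hypothetical and not part of any tool used in the log. It defaults to a dry run, and it skips symlinks and non-regular files, which the original `find` invocation does not do. As with `truncate --size=100M`, the first 100 MiB of each file are kept and everything after that is discarded.

```python
import os
from pathlib import Path

GIB = 1024 ** 3
MIB = 1024 ** 2


def truncate_large_logs(root, pattern, min_size=GIB, keep=100 * MIB, dry_run=True):
    """Find files under root matching pattern that are larger than min_size bytes.

    Returns the matching paths in sorted order. When dry_run is False, each
    match is also cut down to its first `keep` bytes.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        # Assumption: only plain files are of interest; symlinks are skipped.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_size > min_size:
            hits.append(path)
            if not dry_run:
                os.truncate(path, keep)
    return hits


if __name__ == "__main__":
    # Dry run over the current directory, mirroring the '*.err' command.
    for hit in truncate_large_logs(".", "*.err"):
        print(hit)
```

Calling it with `dry_run=False` performs the truncation; passing `"*.out"` as the pattern mirrors the second command.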
18:22 |
<jynus> |
deploying new grants for backups on m1 T327155 |
[production] |
16:15 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . |
[production] |
16:15 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . |
[production] |
16:15 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . |
[production] |
16:14 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . |
[production] |
16:14 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . |
[production] |
16:14 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . |
[production] |
16:14 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . |
[production] |
15:58 |
<wm-bot2> |
renewed kubeadm certs on toolsbeta-test-k8s-control-6 - cookbook ran by arturo@nostromo |
[toolsbeta] |
15:56 |
<wm-bot2> |
renewed kubeadm certs on toolsbeta-test-k8s-control-5 - cookbook ran by arturo@nostromo |
[toolsbeta] |
15:54 |
<wm-bot2> |
renewed kubeadm certs on toolsbeta-test-k8s-control-4 - cookbook ran by arturo@nostromo |
[toolsbeta] |
15:26 |
<wm-bot2> |
Removed cloudweb hosts (cloudweb2002-dev.wikimedia.org) from maintenance mode. - cookbook ran by andrew@bullseye |
[admin] |
15:26 |
<wm-bot2> |
Put cloudweb hosts (cloudweb2002-dev.wikimedia.org) into maintenance mode (downtime id: ['f47a3d91-b270-4c90-acc8-d85075a6bf8e'], use this to unset) - cookbook ran by andrew@bullseye |
[admin] |
14:28 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
14:27 |
<elukey@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
14:24 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
14:24 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
13:15 |
<arturo> |
reinstall python3-neutron (to reset manual patching) on all cloudnet nodes and patch it via puppet, then restart neutron-l3-agent by hand (T327463) |
[admin] |
13:08 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new ping host - jmm@cumin2002" |
[production] |
13:08 |
<moritzm> |
installing node-minimatch security updates |
[production] |
13:01 |
<moritzm> |
installing libxstream-java security updates |
[production] |
13:00 |
<sukhe> |
reprepro --ignore=wrongdistribution -C main include bullseye-wikimedia cadvisor_0.44.0+ds1-1~wmf1_amd64.changes: T325557 |
[production] |
12:45 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new ping host - jmm@cumin2002" |
[production] |
12:38 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS bullseye |
[production] |
12:23 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2040.codfw.wmnet with reason: host reimage |
[production] |
12:20 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2040.codfw.wmnet with reason: host reimage |
[production] |
12:17 |
<moritzm> |
installing ping1003 T273509 |
[production] |
12:04 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS bullseye |
[production] |
12:03 |
<jiji@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply |
[production] |
12:02 |
<jiji@deploy1002> |
helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply |
[production] |
10:50 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new ping host - jmm@cumin2002" |
[production] |
10:49 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new ping host - jmm@cumin2002" |
[production] |
10:32 |
<elukey> |
restart kubelet on ml-staging200* nodes (some fs-inotify-related issues with the istio-proxy of newly created containers) |
[production] |
10:27 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . |
[production] |
10:13 |
<moritzm> |
installing emacs security updates on bullseye |
[production] |
10:13 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . |
[production] |
10:12 |
<moritzm> |
imported jenkins 2.375-2 to thirdparty/ci T326531 |
[production] |
10:12 |
<arturo> |
[codfw1dev] failover neutron-l3-agent between cloudnet2005-dev/cloudnet2006-dev a couple of times T327463 |
[admin] |
10:00 |
<jnuche@deploy1002> |
Installation of scap version "4.33.1" completed for 1 hosts |
[production] |
10:00 |
<jnuche@deploy1002> |
Installing scap version "4.33.1" for 1 hosts |
[production] |
08:59 |
<moritzm> |
installing ping2003 T273509 |
[production] |
08:10 |
<elukey> |
restart kubelet on kubernetes2007 - node reported issues with it, marked as "notready" by the control plane |
[production] |
07:58 |
<elukey> |
`apt-get clean` on doh4001 to free space (root partition almost filled) |
[production] |
06:19 |
<hashar> |
Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/881665 | T327301 |
[releng] |
02:58 |
<AmandaNP> |
add WM Header debug manually into vendor files on server T325692 |
[utrs] |
02:17 |
<andrewbogott> |
stopping neutron-l3-agent on cloudnet1005 because it's logging at a furious rate and about to fill the drive |
[admin] |
01:55 |
<ejegg> |
payments-wiki upgraded from 3cf03933 to 3d882ac7 |
[production] |
01:12 |
<ejegg> |
payments-wiki upgraded from fcb9ab60 to 3cf03933 |
[production] |