2024-03-20
12:48 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply [production]
12:43 <claime> Depooled swift-rw from codfw [production]
11:24 <cgoubert@deploy2002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
11:23 <cgoubert@deploy2002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
11:22 <cgoubert@deploy2002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
11:22 <claime> deploying new namespace limits for changeprop [production]
11:22 <cgoubert@deploy2002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
11:10 <godog> bounce apache2 on logstash1031 - T337818 [production]
11:10 <cgoubert@deploy2002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
11:10 <cgoubert@deploy2002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
10:54 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop: apply [production]
10:51 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop: apply [production]
10:50 <brouberol> superset.wikimedia.org is now migrated to the DSE k8s cluster, CAS errors have receded [production]
10:31 <claime> rolling back changeprop to previous version [production]
10:26 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:25 <jmm@cumin2002> START - Cookbook sre.puppet.renew-cert for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:22 <jmm@cumin2002> END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:22 <jmm@cumin2002> START - Cookbook sre.puppet.renew-cert for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:22 <brouberol> migrating superset to Kubernetes. Some CAS errors are expected for ~15 minutes [production]
10:17 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2003.codfw.wmnet with OS bookworm [production]
10:17 <cgoubert@deploy2002> helmfile [eqiad] DONE helmfile.d/services/changeprop: sync [production]
10:16 <cgoubert@deploy2002> helmfile [eqiad] START helmfile.d/services/changeprop: sync [production]
10:16 <claime> roll-restarting changeprop in eqiad [production]
10:11 <cgoubert@cumin2002> conftool action : set/weight=10:pooled=yes; selector: name=(mw1368.eqiad.wmnet|mw1369.eqiad.wmnet|mw1370.eqiad.wmnet|mw1478.eqiad.wmnet|mw1479.eqiad.wmnet),cluster=kubernetes,service=kubesvc [production]
10:08 <taavi> revoke labweb.discovery.wmnet cergen cert, migrated to cfssl [production]
10:07 <slyngshede@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test1003.wikimedia.org with OS bullseye [production]
10:03 <Lucas_WMDE> STOP persistRevisionThreadItems on viwiki for T315510, will restart after DC switch is done (resume at: --start '["17099868"]') [production]
10:01 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage [production]
09:59 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage [production]
09:57 <claime> running homer 'cr*eqiad*' commit 'T351074' [production]
09:56 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1369.eqiad.wmnet with OS bullseye [production]
09:54 <slyngshede@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage [production]
09:54 <ayounsi@cumin1002> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 200132 [production]
09:54 <ayounsi@cumin1002> START - Cookbook sre.network.peering with action 'configure' for AS: 200132 [production]
09:53 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1478.eqiad.wmnet with OS bullseye [production]
09:52 <slyngshede@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage [production]
09:52 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1479.eqiad.wmnet with OS bullseye [production]
09:50 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1370.eqiad.wmnet with OS bullseye [production]
09:48 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1368.eqiad.wmnet with OS bullseye [production]
09:43 <ayounsi@cumin1002> START - Cookbook sre.hosts.reimage for host sretest2003.codfw.wmnet with OS bookworm [production]
09:39 <slyngshede@cumin1002> START - Cookbook sre.hosts.reimage for host idp-test1003.wikimedia.org with OS bullseye [production]
09:37 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1369.eqiad.wmnet with reason: host reimage [production]
09:35 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1478.eqiad.wmnet with reason: host reimage [production]
09:32 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1479.eqiad.wmnet with reason: host reimage [production]
09:30 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1370.eqiad.wmnet with reason: host reimage [production]
09:28 <cgoubert@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1368.eqiad.wmnet with reason: host reimage [production]
09:27 <cgoubert@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1370.eqiad.wmnet with reason: host reimage [production]
09:26 <cgoubert@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1479.eqiad.wmnet with reason: host reimage [production]
09:26 <cgoubert@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1369.eqiad.wmnet with reason: host reimage [production]
09:26 <cgoubert@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1478.eqiad.wmnet with reason: host reimage [production]