2024-03-20
13:55 <jiji@cumin1002> END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) [production]
13:55 <jiji@cumin1002> START - Cookbook sre.switchdc.mediawiki.00-disable-puppet [production]
13:48 <jiji@deploy2002> Locking from deployment [ALL REPOSITORIES]: Datacenter Switchover - T357547 [production]
13:43 <akosiaris@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
13:42 <akosiaris@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
13:42 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
13:42 <akosiaris> update changeprop-jobqueue to include rdb101{1,2} IPv6 related netpols [production]
13:41 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply [production]
13:32 <Lucas_WMDE> <moritzm> 13:16 UTC: installing libuv1 security updates on bullseye [re-log, original message wasn’t logged] [production]
13:20 <jayme> manually scaled up changeprop replicas in eqiad from 12 to 15 [production]
13:10 <logmsgbot> @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
13:10 <logmsgbot> @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply [production]
13:08 <moritzm> installing imagemagick security updates [production]
13:02 <jiji@deploy2002> helmfile [eqiad] DONE helmfile.d/services/thumbor: apply [production]
12:59 <slyngshede@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test2002.wikimedia.org with OS bookworm [production]
12:57 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply [production]
12:57 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply [production]
12:57 <moritzm> installing tiff security updates [production]
12:52 <jiji@deploy2002> helmfile [eqiad] START helmfile.d/services/thumbor: apply [production]
12:50 <jiji@deploy2002> helmfile [eqiad] DONE helmfile.d/services/thumbor: apply [production]
12:48 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply [production]
12:48 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply [production]
12:43 <claime> Depooled swift-rw from codfw [production]
11:24 <cgoubert@deploy2002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
11:23 <cgoubert@deploy2002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
11:22 <cgoubert@deploy2002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
11:22 <claime> deploying new namespace limits for changeprop [production]
11:22 <cgoubert@deploy2002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
11:10 <godog> bounce apache2 on logstash1031 - T337818 [production]
11:10 <cgoubert@deploy2002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
11:10 <cgoubert@deploy2002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
10:54 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop: apply [production]
10:51 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop: apply [production]
10:50 <brouberol> superset.wikimedia.org is now migrated to the DSE k8s cluster, CAS errors have receded [production]
10:31 <claime> rolling back changeprop to previous version [production]
10:26 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:25 <jmm@cumin2002> START - Cookbook sre.puppet.renew-cert for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:22 <jmm@cumin2002> END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:22 <jmm@cumin2002> START - Cookbook sre.puppet.renew-cert for idm-test1001.wikimedia.org: Renew puppet certificate - jmm@cumin2002 [production]
10:22 <brouberol> migrating superset to Kubernetes. Some CAS errors are expected during ~15 minutes [production]
10:17 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2003.codfw.wmnet with OS bookworm [production]
10:17 <cgoubert@deploy2002> helmfile [eqiad] DONE helmfile.d/services/changeprop: sync [production]
10:16 <cgoubert@deploy2002> helmfile [eqiad] START helmfile.d/services/changeprop: sync [production]
10:16 <claime> roll-restarting changeprop in eqiad [production]
10:11 <cgoubert@cumin2002> conftool action : set/weight=10:pooled=yes; selector: name=(mw1368.eqiad.wmnet|mw1369.eqiad.wmnet|mw1370.eqiad.wmnet|mw1478.eqiad.wmnet|mw1479.eqiad.wmnet),cluster=kubernetes,service=kubesvc [production]
10:08 <taavi> revoke labweb.discovery.wmnet cergen cert, migrated to cfssl [production]
10:07 <slyngshede@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test1003.wikimedia.org with OS bullseye [production]
10:03 <Lucas_WMDE> STOP persistRevisionThreadItems on viwiki for T315510, will restart after DC switch is done (resume at: --start '["17099868"]') [production]
10:01 <ayounsi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage [production]
09:59 <ayounsi@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage [production]