2023-01-18
11:54 <volans> upgraded cumin on cumin1001 to 4.2.0-1+deb11u1 [production]
11:47 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on 10 hosts with reason: Still not ready to add these new presto servers to the cluster - btullis [production]
11:47 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on 10 hosts with reason: Still not ready to add these new presto servers to the cluster - btullis [production]
11:42 <jelto@cumin1001> START - Cookbook sre.gitlab.upgrade [production]
11:27 <jelto@cumin1001> END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) [production]
11:16 <volans@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
11:16 <volans@cumin1001> START - Cookbook sre.network.cf [production]
11:15 <volans@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
11:15 <volans@cumin1001> START - Cookbook sre.network.cf [production]
11:12 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1050.eqiad.wmnet with OS bullseye [production]
11:11 <volans@cumin2002> END (FAIL) - Cookbook sre.network.cf (exit_code=1) [production]
11:11 <volans@cumin2002> START - Cookbook sre.network.cf [production]
11:10 <volans@cumin1001> END (FAIL) - Cookbook sre.network.cf (exit_code=1) [production]
11:10 <volans@cumin1001> START - Cookbook sre.network.cf [production]
11:10 <volans@cumin1001> END (FAIL) - Cookbook sre.network.cf (exit_code=1) [production]
11:10 <volans@cumin1001> START - Cookbook sre.network.cf [production]
11:07 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1176 T326116', diff saved to https://phabricator.wikimedia.org/P43185 and previous config saved to /var/cache/conftool/dbconfig/20230118-110716-marostegui.json [production]
10:59 <volans@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
10:59 <volans@cumin1001> START - Cookbook sre.network.cf [production]
10:57 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1050.eqiad.wmnet with reason: host reimage [production]
10:54 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc1050.eqiad.wmnet with reason: host reimage [production]
10:51 <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1176 to LB with just 1% weight T326116', diff saved to https://phabricator.wikimedia.org/P43184 and previous config saved to /var/cache/conftool/dbconfig/20230118-105106-marostegui.json [production]
10:49 <jelto@cumin1001> START - Cookbook sre.gitlab.upgrade [production]
10:48 <jelto@cumin1001> END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) [production]
10:43 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc1050.eqiad.wmnet with OS bullseye [production]
10:21 <zabe@deploy1002> Finished scap: Backport for [[gerrit:881361|Start reading from cuc_comment_id from a few wikis (T233004)]] (duration: 09m 17s) [production]
10:14 <zabe@deploy1002> zabe and zabe: Backport for [[gerrit:881361|Start reading from cuc_comment_id from a few wikis (T233004)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet [production]
10:12 <jelto@cumin1001> START - Cookbook sre.gitlab.upgrade [production]
10:12 <zabe@deploy1002> Started scap: Backport for [[gerrit:881361|Start reading from cuc_comment_id from a few wikis (T233004)]] [production]
09:51 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
09:51 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
09:49 <godog> start migration from webperf1004 to arclamp1001 - T319434 [production]
09:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet [production]
09:39 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet [production]
09:35 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet [production]
09:33 <jelto@cumin1001> END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) [production]
09:32 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet [production]
09:24 <jnuche@deploy1002> Synchronized php: group1 wikis to 1.40.0-wmf.19 refs T325582 (duration: 08m 20s) [production]
09:15 <jnuche@deploy1002> rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.19 refs T325582 [production]
08:54 <jelto@cumin1001> START - Cookbook sre.gitlab.upgrade [production]
08:34 <mvernon@cumin1001> conftool action : set/pooled=yes; selector: name=thanos-fe2002.codfw.wmnet [production]
08:34 <mvernon@cumin1001> conftool action : set/pooled=yes; selector: name=ms-fe2010.codfw.wmnet [production]
08:32 <mvernon@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=codfw [production]
08:32 <mvernon@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=thanos-swift,name=codfw [production]
08:32 <mvernon@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw [production]
08:30 <jelto@cumin1001> END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) [production]
07:56 <jelto@cumin1001> START - Cookbook sre.gitlab.upgrade [production]
02:37 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2031.codfw.wmnet with reason: downtimed, host unreachable [production]
02:37 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp2031.codfw.wmnet with reason: downtimed, host unreachable [production]
02:36 <sukhe@puppetmaster1001> conftool action : set/pooled=no; selector: name=cp2031.codfw.wmnet,service=ats-be [production]