2024-04-23
14:14 <jmm@cumin2002> START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling restart_daemons on A:ncredir [production]
14:13 <effie> upload prometheus-memcached-exporter_0.14.2-2~wmf1_amd64 to bookworm-wikimedia - T350807 [production]
14:10 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2147 (re)pooling @ 25%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61107 and previous config saved to /var/cache/conftool/dbconfig/20240423-141045-arnaudb.json [production]
14:10 <jmm@cumin2002> END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling restart_daemons on A:ncredir-ulsfo [production]
14:09 <jmm@cumin2002> START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling restart_daemons on A:ncredir-ulsfo [production]
14:04 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2140.codfw.wmnet with reason: host reimage [production]
14:03 <zabe@deploy1002> zabe: Continuing with sync [production]
14:03 <zabe@deploy1002> zabe: Backport for [[gerrit:1022532|Update interwiki cache (T363093)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
14:01 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2140.codfw.wmnet with reason: host reimage [production]
14:00 <zabe@deploy1002> Started scap: Backport for [[gerrit:1022532|Update interwiki cache (T363093)]] [production]
13:55 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2147 (re)pooling @ 10%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61106 and previous config saved to /var/cache/conftool/dbconfig/20240423-135540-arnaudb.json [production]
13:43 <arnaudb@cumin1002> START - Cookbook sre.hosts.reimage for host db2140.codfw.wmnet with OS bookworm [production]
13:41 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2140.codfw.wmnet with reason: T362746 [production]
13:41 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2140.codfw.wmnet with reason: T362746 [production]
13:40 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2147 (re)pooling @ 5%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61105 and previous config saved to /var/cache/conftool/dbconfig/20240423-134034-arnaudb.json [production]
13:39 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2147.codfw.wmnet with OS bookworm [production]
13:38 <jgiannelos@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply [production]
13:37 <jgiannelos@deploy1002> helmfile [eqiad] START helmfile.d/services/mobileapps: apply [production]
13:36 <jgiannelos@deploy1002> helmfile [codfw] DONE helmfile.d/services/mobileapps: apply [production]
13:36 <jgiannelos@deploy1002> helmfile [codfw] START helmfile.d/services/mobileapps: apply [production]
13:35 <jgiannelos@deploy1002> helmfile [staging] DONE helmfile.d/services/mobileapps: apply [production]
13:35 <jgiannelos@deploy1002> helmfile [staging] START helmfile.d/services/mobileapps: apply [production]
13:35 <moritzm> installing glibc security updates [production]
13:34 <elukey@cumin1002> START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Deploy new TLS Keystore - PKI - elukey@cumin1002 [production]
13:26 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2155 (re)pooling @ 100%: Sanitarium master', diff saved to https://phabricator.wikimedia.org/P61103 and previous config saved to /var/cache/conftool/dbconfig/20240423-132633-arnaudb.json [production]
13:18 <sukhe> running authdns-update for T362921 [production]
13:17 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2147.codfw.wmnet with reason: host reimage [production]
13:15 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2147.codfw.wmnet with reason: host reimage [production]
13:11 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2155 (re)pooling @ 75%: Sanitarium master', diff saved to https://phabricator.wikimedia.org/P61102 and previous config saved to /var/cache/conftool/dbconfig/20240423-131128-arnaudb.json [production]
12:58 <arnaudb@cumin1002> START - Cookbook sre.hosts.reimage for host db2147.codfw.wmnet with OS bookworm [production]
12:57 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2147.codfw.wmnet with reason: T362746 [production]
12:57 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2147.codfw.wmnet with reason: T362746 [production]
12:57 <arnaudb@cumin1002> dbctl commit (dc=all): 'Depool db2147', diff saved to https://phabricator.wikimedia.org/P61101 and previous config saved to /var/cache/conftool/dbconfig/20240423-125703-arnaudb.json [production]
12:56 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2155 (re)pooling @ 50%: Sanitarium master', diff saved to https://phabricator.wikimedia.org/P61100 and previous config saved to /var/cache/conftool/dbconfig/20240423-125622-arnaudb.json [production]
12:55 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db2155.codfw.wmnet with reason: Reimage db2155 [production]
12:55 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 3:00:00 on db2155.codfw.wmnet with reason: Reimage db2155 [production]
12:54 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2155 depool', diff saved to https://phabricator.wikimedia.org/P61099 and previous config saved to /var/cache/conftool/dbconfig/20240423-125430-arnaudb.json [production]
12:45 <hashar@deploy1002> Finished deploy [gerrit/gerrit@ff51759]: Remove registerStyleModule() for Gerrit 3.8 - T354886 (duration: 00m 07s) [production]
12:17 <taavi@deploy1002> taavi: Continuing with sync [production]
12:17 <taavi@deploy1002> taavi: Backport for [[gerrit:1023046|Add cawiki 750k logo (T363057)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
12:13 <taavi@deploy1002> Started scap: Backport for [[gerrit:1023046|Add cawiki 750k logo (T363057)]] [production]
11:47 <cgoubert@cumin1002> conftool action : set/weight=10:pooled=yes; selector: name=(mw1414.eqiad.wmnet|mw1415.eqiad.wmnet|mw1416.eqiad.wmnet|mw1448.eqiad.wmnet|mw1449.eqiad.wmnet),cluster=kubernetes,service=kubesvc [production]
11:47 <claime> Pooling and uncordoning mw1414.eqiad.wmnet,mw1415.eqiad.wmnet,mw1416.eqiad.wmnet,mw1448.eqiad.wmnet,mw1449.eqiad.wmnet - T351074 [production]
11:39 <claime> Running homer 'cr*eqiad*' commit 'T351074' [production]
11:39 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Host has hardware issues [production]
11:38 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1246.eqiad.wmnet with reason: Host has hardware issues [production]
11:38 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1415.eqiad.wmnet with OS bullseye [production]
11:35 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1448.eqiad.wmnet with OS bullseye [production]
11:33 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1416.eqiad.wmnet with OS bullseye [production]
11:30 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1449.eqiad.wmnet with OS bullseye [production]