2024-04-24
|
15:00 |
<SandraEbele_> |
starting refinery deployment |
[production] |
14:57 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1199 (re)pooling @ 50%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61164 and previous config saved to /var/cache/conftool/dbconfig/20240424-145758-arnaudb.json |
[production] |
14:55 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1190 (re)pooling @ 5%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61163 and previous config saved to /var/cache/conftool/dbconfig/20240424-145545-arnaudb.json |
[production] |
14:55 |
<dancy@deploy1002> |
Installation of scap version "4.79.0" completed for 325 hosts |
[production] |
14:54 |
<dancy@deploy1002> |
Installing scap version "4.79.0" for 325 hosts |
[production] |
14:53 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1190.eqiad.wmnet with OS bookworm |
[production] |
14:52 |
<moritzm> |
installing exim4/spamassassin on MXes |
[production] |
14:45 |
<moritzm> |
installing php7.4 security updates (as shipped in Debian, not our internal component) |
[production] |
14:42 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1199 (re)pooling @ 25%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61162 and previous config saved to /var/cache/conftool/dbconfig/20240424-144252-arnaudb.json |
[production] |
14:38 |
<sukhe> |
rolling restart of haproxy, pdns-rec and ntp on A:dnsbox |
[production] |
14:32 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1190.eqiad.wmnet with reason: host reimage |
[production] |
14:29 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 100%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61160 and previous config saved to /var/cache/conftool/dbconfig/20240424-142905-arnaudb.json |
[production] |
14:27 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1199 (re)pooling @ 10%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61159 and previous config saved to /var/cache/conftool/dbconfig/20240424-142747-arnaudb.json |
[production] |
14:27 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1190.eqiad.wmnet with reason: host reimage |
[production] |
14:26 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
14:21 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org |
[production] |
14:20 |
<sukhe> |
restarting pdns-rec on dns6001 |
[production] |
14:19 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org |
[production] |
14:19 |
<moritzm> |
import djangorestframework 3.14.0-2+wmf12u1 to apt.wikimedia.org (bug fix needed for Bitu 0.7.0, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1068747) |
[production] |
14:14 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.reimage for host db1190.eqiad.wmnet with OS bookworm |
[production] |
14:14 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 75%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61158 and previous config saved to /var/cache/conftool/dbconfig/20240424-141400-arnaudb.json |
[production] |
14:13 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1190.eqiad.wmnet with reason: T362746 |
[production] |
14:13 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1190.eqiad.wmnet with reason: T362746 |
[production] |
14:13 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depool db1190', diff saved to https://phabricator.wikimedia.org/P61157 and previous config saved to /var/cache/conftool/dbconfig/20240424-141305-arnaudb.json |
[production] |
14:12 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1199 (re)pooling @ 5%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61156 and previous config saved to /var/cache/conftool/dbconfig/20240424-141241-arnaudb.json |
[production] |
14:11 |
<elukey@cumin1002> |
END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching A:restbase-codfw: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:59 |
<urbanecm@deploy1002> |
Finished scap: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] (duration: 14m 08s) |
[production] |
13:58 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 50%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61155 and previous config saved to /var/cache/conftool/dbconfig/20240424-135854-arnaudb.json |
[production] |
13:56 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:55 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1199.eqiad.wmnet with OS bookworm |
[production] |
13:49 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:48 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Continuing with sync |
[production] |
13:48 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdeb |
[production] |
13:47 |
<sukhe@cumin1002> |
END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough |
[production] |
13:45 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] |
[production] |
13:44 |
<urbanecm@deploy1002> |
Sync cancelled. |
[production] |
13:44 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdeb |
[production] |
13:43 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 25%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61153 and previous config saved to /var/cache/conftool/dbconfig/20240424-134349-arnaudb.json |
[production] |
13:41 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] |
[production] |
13:40 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:39 |
<urbanecm@deploy1002> |
Sync cancelled. |
[production] |
13:38 |
<elukey@cumin1002> |
END (ERROR) - Cookbook sre.cassandra.roll-restart (exit_code=97) for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:37 |
<cgoubert@deploy1002> |
helmfile [eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
13:37 |
<cgoubert@deploy1002> |
helmfile [eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:37 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:36 |
<cgoubert@deploy1002> |
helmfile [codfw] DONE helmfile.d/admin 'apply'. |
[production] |
13:36 |
<cgoubert@deploy1002> |
helmfile [codfw] START helmfile.d/admin 'apply'. |
[production] |
13:35 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:34 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1199.eqiad.wmnet with reason: host reimage |
[production] |
13:34 |
<sukhe@cumin1002> |
END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum |
[production] |