2024-04-24
|
13:55 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1199.eqiad.wmnet with OS bookworm |
[production] |
13:49 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:48 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Continuing with sync |
[production] |
13:48 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdeb |
[production] |
13:47 |
<sukhe@cumin1002> |
END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough and A:wikidough |
[production] |
13:45 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] |
[production] |
13:44 |
<urbanecm@deploy1002> |
Sync cancelled. |
[production] |
13:44 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdeb |
[production] |
13:43 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 25%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61153 and previous config saved to /var/cache/conftool/dbconfig/20240424-134349-arnaudb.json |
[production] |
13:41 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]], [[gerrit:1023798|Revert "Updated uzwiktionary project namespace name and site name to follow" (T362620)]] |
[production] |
13:40 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:39 |
<urbanecm@deploy1002> |
Sync cancelled. |
[production] |
13:38 |
<elukey@cumin1002> |
END (ERROR) - Cookbook sre.cassandra.roll-restart (exit_code=97) for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:37 |
<cgoubert@deploy1002> |
helmfile [eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
13:37 |
<cgoubert@deploy1002> |
helmfile [eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:37 |
<elukey@cumin1002> |
START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2021.codfw.wmnet: Deploy new TLS Truststore for PKI - elukey@cumin1002 |
[production] |
13:36 |
<cgoubert@deploy1002> |
helmfile [codfw] DONE helmfile.d/admin 'apply'. |
[production] |
13:36 |
<cgoubert@deploy1002> |
helmfile [codfw] START helmfile.d/admin 'apply'. |
[production] |
13:35 |
<urbanecm@deploy1002> |
urbanecm and nmw03: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:34 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1199.eqiad.wmnet with reason: host reimage |
[production] |
13:34 |
<sukhe@cumin1002> |
END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum |
[production] |
13:32 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1023530|Enabled subpages for main namespace in ptwikimedia (T362300)]], [[gerrit:1023531|Updated uzwiktionary project namespace name and site name to follow Uzbek grammar (T362620)]] |
[production] |
13:31 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1199.eqiad.wmnet with reason: host reimage |
[production] |
13:28 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 10%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61152 and previous config saved to /var/cache/conftool/dbconfig/20240424-132841-arnaudb.json |
[production] |
13:24 |
<urbanecm@deploy1002> |
Finished scap: Backport for [[gerrit:1023101|Growth: Enable Levelling up features on all wikis (T348086)]], [[gerrit:1023796|WikiEduDashboard: allow removal when course is not synced (T363187)]], [[gerrit:1023797|WikiEduDashboard: allow removal when course is not synced (T363187)]] (duration: 20m 21s) |
[production] |
13:23 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=0) rolling restart_daemons on A:logstash-collector |
[production] |
13:18 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.reimage for host db1199.eqiad.wmnet with OS bookworm |
[production] |
13:17 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1199.eqiad.wmnet with reason: T362746 |
[production] |
13:17 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1199.eqiad.wmnet with reason: T362746 |
[production] |
13:17 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depool db1199', diff saved to https://phabricator.wikimedia.org/P61151 and previous config saved to /var/cache/conftool/dbconfig/20240424-131702-arnaudb.json |
[production] |
13:15 |
<jmm@cumin2002> |
START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector |
[production] |
13:14 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.elasticsearch.restart-nginx (exit_code=0) rolling restart_daemons on A:elastic-eqiad |
[production] |
13:13 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db1242 (re)pooling @ 5%: Post reimage', diff saved to https://phabricator.wikimedia.org/P61150 and previous config saved to /var/cache/conftool/dbconfig/20240424-131336-arnaudb.json |
[production] |
13:12 |
<urbanecm@deploy1002> |
daimona and urbanecm: Continuing with sync |
[production] |
13:09 |
<sukhe@cumin1002> |
START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum |
[production] |
13:07 |
<urbanecm@deploy1002> |
daimona and urbanecm: Backport for [[gerrit:1023101|Growth: Enable Levelling up features on all wikis (T348086)]], [[gerrit:1023796|WikiEduDashboard: allow removal when course is not synced (T363187)]], [[gerrit:1023797|WikiEduDashboard: allow removal when course is not synced (T363187)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:04 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1023101|Growth: Enable Levelling up features on all wikis (T348086)]], [[gerrit:1023796|WikiEduDashboard: allow removal when course is not synced (T363187)]], [[gerrit:1023797|WikiEduDashboard: allow removal when course is not synced (T363187)]] |
[production] |
13:03 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1242.eqiad.wmnet with OS bookworm |
[production] |
12:59 |
<isaranto@deploy1002> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . |
[production] |
12:58 |
<isaranto@deploy1002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . |
[production] |
12:57 |
<sukhe@cumin1002> |
START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough and A:wikidough |
[production] |
12:52 |
<jmm@cumin2002> |
START - Cookbook sre.elasticsearch.restart-nginx rolling restart_daemons on A:elastic-eqiad |
[production] |
12:26 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.reimage for host db1242.eqiad.wmnet with OS bookworm |
[production] |
12:25 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depool db1242', diff saved to https://phabricator.wikimedia.org/P61149 and previous config saved to /var/cache/conftool/dbconfig/20240424-122520-arnaudb.json |
[production] |
12:24 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1242.eqiad.wmnet with reason: T362746 |
[production] |
12:24 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1242.eqiad.wmnet with reason: T362746 |
[production] |
12:23 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on stat1010.eqiad.wmnet with reason: Connecting GPU power cable |
[production] |
12:23 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on stat1010.eqiad.wmnet with reason: Connecting GPU power cable |
[production] |
12:20 |
<jmm@cumin2002> |
START - Cookbook sre.elasticsearch.restart-nginx rolling restart_daemons on A:elastic-codfw |
[production] |
12:10 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.elasticsearch.restart-nginx (exit_code=0) rolling restart_daemons on A:elastic-canary |
[production] |