2023-03-08
12:09 |
<jmm@cumin2002> |
END (ERROR) - Cookbook sre.ganeti.reimage (exit_code=97) for host urldownloader1003.wikimedia.org with OS bullseye |
[production] |
12:08 |
<hnowlan@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
12:04 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2157 (T329260)', diff saved to https://phabricator.wikimedia.org/P45478 and previous config saved to /var/cache/conftool/dbconfig/20230308-120406-marostegui.json |
[production] |
12:01 |
<claime> |
restbase-async back in standard state - T330651 |
[production] |
12:01 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1039.eqiad.wmnet with reason: host reimage |
[production] |
12:00 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool restbase-async in codfw: T330651 |
[production] |
11:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db2157 (T329260)', diff saved to https://phabricator.wikimedia.org/P45477 and previous config saved to /var/cache/conftool/dbconfig/20230308-115935-marostegui.json |
[production] |
11:59 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance |
[production] |
11:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db2146 (T328817)', diff saved to https://phabricator.wikimedia.org/P45476 and previous config saved to /var/cache/conftool/dbconfig/20230308-115924-marostegui.json |
[production] |
11:59 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance |
[production] |
11:59 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance |
[production] |
11:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45475 and previous config saved to /var/cache/conftool/dbconfig/20230308-115913-marostegui.json |
[production] |
11:59 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance |
[production] |
11:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145 (T328817)', diff saved to https://phabricator.wikimedia.org/P45474 and previous config saved to /var/cache/conftool/dbconfig/20230308-115903-marostegui.json |
[production] |
11:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45473 and previous config saved to /var/cache/conftool/dbconfig/20230308-115815-marostegui.json |
[production] |
11:57 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc1039.eqiad.wmnet with reason: host reimage |
[production] |
11:55 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors |
[production] |
11:55 |
<cgoubert@cumin1001> |
START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors |
[production] |
11:55 |
<cgoubert@cumin1001> |
START - Cookbook sre.discovery.service-route depool restbase-async in codfw: T330651 |
[production] |
11:54 |
<claime> |
restbase-async pooled in eqiad, depooling in codfw - T330651 |
[production] |
11:54 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool restbase-async in eqiad: T330651 |
[production] |
11:52 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1113:3315', diff saved to https://phabricator.wikimedia.org/P45472 and previous config saved to /var/cache/conftool/dbconfig/20230308-115252-root.json |
[production] |
11:49 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors |
[production] |
11:49 |
<cgoubert@cumin1001> |
START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors |
[production] |
11:49 |
<cgoubert@cumin1001> |
START - Cookbook sre.discovery.service-route pool restbase-async in eqiad: T330651 |
[production] |
11:49 |
<otto@deploy2002> |
Finished deploy [analytics/refinery@d4aaff9] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d4aaff9] (duration: 01m 30s) |
[production] |
11:48 |
<claime> |
Starting restbase-async switchback - T330651 |
[production] |
11:47 |
<otto@deploy2002> |
Started deploy [analytics/refinery@d4aaff9] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d4aaff9] |
[production] |
11:47 |
<otto@deploy2002> |
Finished deploy [analytics/refinery@d4aaff9] (thin): Regular analytics weekly train THIN [analytics/refinery@d4aaff9] (duration: 00m 07s) |
[production] |
11:47 |
<otto@deploy2002> |
Started deploy [analytics/refinery@d4aaff9] (thin): Regular analytics weekly train THIN [analytics/refinery@d4aaff9] |
[production] |
11:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db2137:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45471 and previous config saved to /var/cache/conftool/dbconfig/20230308-114652-marostegui.json |
[production] |
11:46 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance |
[production] |
11:46 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance |
[production] |
11:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2136 (T329203)', diff saved to https://phabricator.wikimedia.org/P45470 and previous config saved to /var/cache/conftool/dbconfig/20230308-114642-marostegui.json |
[production] |
11:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1113:3315', diff saved to https://phabricator.wikimedia.org/P45469 and previous config saved to /var/cache/conftool/dbconfig/20230308-114553-root.json |
[production] |
11:44 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.reimage for host mc1039.eqiad.wmnet with OS bullseye |
[production] |
11:44 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P45468 and previous config saved to /var/cache/conftool/dbconfig/20230308-114407-marostegui.json |
[production] |
11:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45467 and previous config saved to /var/cache/conftool/dbconfig/20230308-114357-marostegui.json |
[production] |
11:42 |
<otto@deploy2002> |
Finished deploy [analytics/refinery@d4aaff9]: Regular analytics weekly train [analytics/refinery@d4aaff9] (duration: 05m 09s) |
[production] |
11:37 |
<otto@deploy2002> |
Started deploy [analytics/refinery@d4aaff9]: Regular analytics weekly train [analytics/refinery@d4aaff9] |
[production] |
11:37 |
<otto@deploy2002> |
deploy aborted: Regular analytics weekly train [analytics/refinery@d4aaff9] (duration: 09m 38s) |
[production] |
11:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45466 and previous config saved to /var/cache/conftool/dbconfig/20230308-113136-marostegui.json |
[production] |
11:29 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
11:29 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
11:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P45465 and previous config saved to /var/cache/conftool/dbconfig/20230308-112901-marostegui.json |
[production] |
11:28 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45464 and previous config saved to /var/cache/conftool/dbconfig/20230308-112850-marostegui.json |
[production] |
11:27 |
<otto@deploy2002> |
Started deploy [analytics/refinery@d4aaff9]: Regular analytics weekly train [analytics/refinery@d4aaff9] |
[production] |
11:27 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
11:27 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
11:26 |
<akosiaris> |
T307943 upgrade kubernetes-client on deploy1002 deploy2002 |
[production] |