901-950 of 10000 results (76ms)
2023-02-09 ยง
11:52 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc-gp1001.eqiad.wmnet with OS bullseye [production]
11:52 <jiji@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc-gp1001.eqiad.wmnet with OS bullseye [production]
11:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P43989 and previous config saved to /var/cache/conftool/dbconfig/20230209-114632-marostegui.json [production]
11:44 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P43988 and previous config saved to /var/cache/conftool/dbconfig/20230209-114434-marostegui.json [production]
11:40 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host puppetdb1003.eqiad.wmnet with OS bullseye [production]
11:34 <marostegui> Stop mariadb on db1098 (s6 and s7) T329171 [production]
11:31 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T328817)', diff saved to https://phabricator.wikimedia.org/P43986 and previous config saved to /var/cache/conftool/dbconfig/20230209-113125-marostegui.json [production]
11:31 <eoghan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner1003.eqiad.wmnet with OS bullseye [production]
11:29 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2117 (T329203)', diff saved to https://phabricator.wikimedia.org/P43985 and previous config saved to /var/cache/conftool/dbconfig/20230209-112927-marostegui.json [production]
11:27 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db2171:3315 (T328817)', diff saved to https://phabricator.wikimedia.org/P43984 and previous config saved to /var/cache/conftool/dbconfig/20230209-112748-marostegui.json [production]
11:27 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance [production]
11:27 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance [production]
11:27 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2157 (T328817)', diff saved to https://phabricator.wikimedia.org/P43983 and previous config saved to /var/cache/conftool/dbconfig/20230209-112727-marostegui.json [production]
11:24 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db2117 (T329203)', diff saved to https://phabricator.wikimedia.org/P43982 and previous config saved to /var/cache/conftool/dbconfig/20230209-112359-marostegui.json [production]
11:23 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance [production]
11:23 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance [production]
11:23 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2114 (T329203)', diff saved to https://phabricator.wikimedia.org/P43981 and previous config saved to /var/cache/conftool/dbconfig/20230209-112338-marostegui.json [production]
11:20 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc-gp1001.eqiad.wmnet with OS bullseye [production]
11:20 <jiji@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc-gp1001.eqiad.wmnet with OS bullseye [production]
11:12 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P43980 and previous config saved to /var/cache/conftool/dbconfig/20230209-111220-marostegui.json [production]
11:10 <eoghan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner1003.eqiad.wmnet with reason: host reimage [production]
11:08 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS bullseye [production]
11:08 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P43979 and previous config saved to /var/cache/conftool/dbconfig/20230209-110832-marostegui.json [production]
11:07 <eoghan@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner1003.eqiad.wmnet with reason: host reimage [production]
11:02 <effie> powercycle mc-gp1001 [production]
10:59 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on puppetdb2003.codfw.wmnet with reason: master is being reimaged [production]
10:59 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on puppetdb2003.codfw.wmnet with reason: master is being reimaged [production]
10:58 <joal@deploy1002> Finished deploy [airflow-dags/analytics@dff3f3b]: Fix analytics webrequest_actor_metrics_rollup sensor (duration: 00m 13s) [production]
10:58 <joal@deploy1002> Started deploy [airflow-dags/analytics@dff3f3b]: Fix analytics webrequest_actor_metrics_rollup sensor [production]
10:57 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P43978 and previous config saved to /var/cache/conftool/dbconfig/20230209-105714-marostegui.json [production]
10:55 <eoghan@cumin1001> START - Cookbook sre.hosts.reimage for host gitlab-runner1003.eqiad.wmnet with OS bullseye [production]
10:53 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P43977 and previous config saved to /var/cache/conftool/dbconfig/20230209-105325-marostegui.json [production]
10:52 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2052.codfw.wmnet with reason: host reimage [production]
10:50 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc2052.codfw.wmnet with reason: host reimage [production]
10:42 <marostegui@cumin1001> dbctl commit (dc=all): 'db1107 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P43976 and previous config saved to /var/cache/conftool/dbconfig/20230209-104218-root.json [production]
10:42 <marostegui@cumin1001> dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P43975 and previous config saved to /var/cache/conftool/dbconfig/20230209-104214-root.json [production]
10:42 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2157 (T328817)', diff saved to https://phabricator.wikimedia.org/P43974 and previous config saved to /var/cache/conftool/dbconfig/20230209-104208-marostegui.json [production]
10:38 <moritzm> installing containerd security updates [production]
10:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2114 (T329203)', diff saved to https://phabricator.wikimedia.org/P43973 and previous config saved to /var/cache/conftool/dbconfig/20230209-103819-marostegui.json [production]
10:37 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db2157 (T328817)', diff saved to https://phabricator.wikimedia.org/P43972 and previous config saved to /var/cache/conftool/dbconfig/20230209-103733-marostegui.json [production]
10:37 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance [production]
10:37 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance [production]
10:37 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T328817)', diff saved to https://phabricator.wikimedia.org/P43971 and previous config saved to /var/cache/conftool/dbconfig/20230209-103712-marostegui.json [production]
10:36 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db2114 (T329203)', diff saved to https://phabricator.wikimedia.org/P43970 and previous config saved to /var/cache/conftool/dbconfig/20230209-103604-marostegui.json [production]
10:35 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance [production]
10:35 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance [production]
10:34 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc-gp1001.eqiad.wmnet with OS bullseye [production]
10:34 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS bullseye [production]
10:31 <joal@deploy1002> Finished deploy [airflow-dags/analytics@2ab6564]: Analytics deploy for 3 druid jobs and webrequest_actor jobs (duration: 00m 17s) [production]
10:31 <joal@deploy1002> Started deploy [airflow-dags/analytics@2ab6564]: Analytics deploy for 3 druid jobs and webrequest_actor jobs [production]