2021-03-09
§
|
14:28 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1012.eqiad.wmnet with reason: REIMAGE |
[production] |
14:27 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host ms-fe2007.codfw.wmnet |
[production] |
14:27 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1014.eqiad.wmnet with reason: REIMAGE |
[production] |
14:26 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1012.eqiad.wmnet with reason: REIMAGE |
[production] |
14:25 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2006.codfw.wmnet |
[production] |
14:21 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host ms-fe2006.codfw.wmnet |
[production] |
14:17 |
<jakob@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' . |
[production] |
14:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 60%: Repooling after schema change', diff saved to https://phabricator.wikimedia.org/P14696 and previous config saved to /var/cache/conftool/dbconfig/20210309-141529-root.json |
[production] |
14:14 |
<jakob@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' . |
[production] |
14:12 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2005.codfw.wmnet |
[production] |
14:11 |
<jakob@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' . |
[production] |
14:10 |
<moritzm> |
installing intel-microcode updates on stretch |
[production] |
14:09 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host ms-fe2005.codfw.wmnet |
[production] |
14:08 |
<jakob@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' . |
[production] |
14:07 |
<jgleeson> |
updated smashpig from 5a69abd40f to 58b070db1a |
[production] |
14:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 30%: Repooling after schema change', diff saved to https://phabricator.wikimedia.org/P14694 and previous config saved to /var/cache/conftool/dbconfig/20210309-140025-root.json |
[production] |
13:52 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1004.eqiad.wmnet |
[production] |
13:52 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1102.eqiad.wmnet with reason: REIMAGE |
[production] |
13:50 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1080.eqiad.wmnet with reason: REIMAGE |
[production] |
13:49 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1102.eqiad.wmnet with reason: REIMAGE |
[production] |
13:49 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1080.eqiad.wmnet with reason: REIMAGE |
[production] |
13:45 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: Repooling after schema change', diff saved to https://phabricator.wikimedia.org/P14693 and previous config saved to /var/cache/conftool/dbconfig/20210309-134522-root.json |
[production] |
13:37 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host prometheus1004.eqiad.wmnet |
[production] |
13:34 |
<aborrero@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cloudvirt1038.eqiad.wmnet with reason: HW issue |
[production] |
13:34 |
<aborrero@cumin1001> |
START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on cloudvirt1038.eqiad.wmnet with reason: HW issue |
[production] |
13:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P14692 and previous config saved to /var/cache/conftool/dbconfig/20210309-133124-root.json |
[production] |
13:28 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1003.eqiad.wmnet |
[production] |
13:27 |
<elukey> |
reimage an-worker1102 and an-worker1080 (hdfs journal node) to Buster |
[production] |
13:21 |
<jgleeson> |
updated payments-wiki from 65dbf0ed9d to 0e7800027a |
[production] |
13:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1198:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P14691 and previous config saved to /var/cache/conftool/dbconfig/20210309-131652-marostegui.json |
[production] |
13:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1168 (re)pooling @ 60%: 10', diff saved to https://phabricator.wikimedia.org/P14690 and previous config saved to /var/cache/conftool/dbconfig/20210309-131620-root.json |
[production] |
13:10 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host prometheus1003.eqiad.wmnet |
[production] |
13:08 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1103.eqiad.wmnet with reason: REIMAGE |
[production] |
13:06 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1103.eqiad.wmnet with reason: REIMAGE |
[production] |
13:03 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1013.eqiad.wmnet with reason: REIMAGE |
[production] |
13:01 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1013.eqiad.wmnet with reason: REIMAGE |
[production] |
13:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1168 (re)pooling @ 30%: 10', diff saved to https://phabricator.wikimedia.org/P14689 and previous config saved to /var/cache/conftool/dbconfig/20210309-130116-root.json |
[production] |
12:59 |
<elukey> |
drain + reimage an-worker1103 to Buster |
[production] |
12:59 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1011.eqiad.wmnet with reason: REIMAGE |
[production] |
12:57 |
<hnowlan@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1011.eqiad.wmnet with reason: REIMAGE |
[production] |
12:56 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1403.eqiad.wmnet |
[production] |
12:56 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1402.eqiad.wmnet |
[production] |
12:50 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1168 for schema change', diff saved to https://phabricator.wikimedia.org/P14688 and previous config saved to /var/cache/conftool/dbconfig/20210309-125007-marostegui.json |
[production] |
12:49 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P14687 and previous config saved to /var/cache/conftool/dbconfig/20210309-124931-root.json |
[production] |
12:41 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host mw1403.eqiad.wmnet |
[production] |
12:41 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host mw1402.eqiad.wmnet |
[production] |
12:38 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
12:34 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1173 (re)pooling @ 60%: 10', diff saved to https://phabricator.wikimedia.org/P14686 and previous config saved to /var/cache/conftool/dbconfig/20210309-123427-root.json |
[production] |
12:33 |
<aborrero@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host cloudvirt1038.eqiad.wmnet |
[production] |
12:31 |
<hnowlan@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |