2022-06-30
|
06:33 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1103.eqiad.wmnet with reason: host reimage |
[production] |
06:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote db1120 to x1 primary and set section read-write T300472', diff saved to https://phabricator.wikimedia.org/P30633 and previous config saved to /var/cache/conftool/dbconfig/20220630-060601-root.json |
[production] |
06:03 |
<marostegui> |
Starting x1 eqiad failover from db1103 to db1120 - T300472 |
[production] |
05:23 |
<eileen> |
civicrm upgraded from 9e5a5310 to 55bc690b |
[production] |
05:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set db1120 with weight 0 T300472', diff saved to https://phabricator.wikimedia.org/P30632 and previous config saved to /var/cache/conftool/dbconfig/20220630-051730-root.json |
[production] |
05:17 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T300472 |
[production] |
05:17 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T300472 |
[production] |
02:59 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2160.codfw.wmnet with OS bullseye |
[production] |
02:58 |
<eileen> |
civicrm upgraded from f48fe112 to 9e5a5310 |
[production] |
02:50 |
<bmansurov@deploy1002> |
Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) |
[production] |
02:49 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) |
[production] |
02:49 |
<bmansurov@deploy1002> |
Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) |
[production] |
02:48 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) |
[production] |
02:48 |
<bmansurov@deploy1002> |
deploy aborted: (no justification provided) (duration: 00m 02s) |
[production] |
02:48 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) |
[production] |
02:47 |
<bmansurov@deploy1002> |
Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 03s) |
[production] |
02:47 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) |
[production] |
02:18 |
<bmansurov@deploy1002> |
Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) |
[production] |
02:18 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) |
[production] |
02:17 |
<bmansurov@deploy1002> |
deploy aborted: (no justification provided) (duration: 02m 03s) |
[production] |
02:15 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) |
[production] |
02:11 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS bullseye |
[production] |
01:48 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS bullseye |
[production] |
01:36 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bullseye |
[production] |
01:34 |
<eileen> |
civicrm upgraded from 3cb5e6dd to f48fe112 |
[production] |
01:32 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage |
[production] |
01:27 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage |
[production] |
01:20 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage |
[production] |
01:17 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage |
[production] |
01:07 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS bullseye |
[production] |
00:58 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2155.codfw.wmnet with OS bullseye |
[production] |
00:58 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bullseye |
[production] |
00:49 |
<ebernhardson> |
T310924 Cleared eqiad chi->omega cross cluster settings and reapplied |
[production] |
00:32 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2157.codfw.wmnet with OS bullseye |
[production] |
00:18 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: host reimage |
[production] |
00:14 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2157.codfw.wmnet with reason: host reimage |
[production] |
2022-06-29
|
23:56 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2155.codfw.wmnet with OS bullseye |
[production] |
23:55 |
<pt1979@cumin2002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2154.codfw.wmnet with OS bullseye |
[production] |
23:55 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2157.codfw.wmnet with OS bullseye |
[production] |
23:53 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS bullseye |
[production] |
23:50 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host db2155.codfw.wmnet with OS bullseye |
[production] |
23:34 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
23:34 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.provision for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
23:30 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster restart to pickup swift-s3 plugin - bking@cumin1001 - T309648 |
[production] |
23:05 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2155.codfw.wmnet with reason: host reimage |
[production] |
23:01 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2155.codfw.wmnet with reason: host reimage |
[production] |
22:41 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.reimage for host db2155.codfw.wmnet with OS bullseye |
[production] |
22:37 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.provision for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
22:34 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudnet1006.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
22:31 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1477.eqiad.wmnet with OS buster |
[production] |