2022-06-30
06:33 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1103.eqiad.wmnet with reason: host reimage [production]
06:06 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db1120 to x1 primary and set section read-write T300472', diff saved to https://phabricator.wikimedia.org/P30633 and previous config saved to /var/cache/conftool/dbconfig/20220630-060601-root.json [production]
06:03 <marostegui> Starting x1 eqiad failover from db1103 to db1120 - T300472 [production]
05:23 <eileen> civicrm upgraded from 9e5a5310 to 55bc690b [production]
05:17 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1120 with weight 0 T300472', diff saved to https://phabricator.wikimedia.org/P30632 and previous config saved to /var/cache/conftool/dbconfig/20220630-051730-root.json [production]
05:17 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T300472 [production]
05:17 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T300472 [production]
02:59 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2160.codfw.wmnet with OS bullseye [production]
02:58 <eileen> civicrm upgraded from f48fe112 to 9e5a5310 [production]
02:50 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) [production]
02:49 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:49 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) [production]
02:48 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:48 <bmansurov@deploy1002> deploy aborted: (no justification provided) (duration: 00m 02s) [production]
02:48 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:47 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 03s) [production]
02:47 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:18 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) [production]
02:18 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:17 <bmansurov@deploy1002> deploy aborted: (no justification provided) (duration: 02m 03s) [production]
02:15 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:11 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS bullseye [production]
01:48 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS bullseye [production]
01:36 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bullseye [production]
01:34 <eileen> civicrm upgraded from 3cb5e6dd to f48fe112 [production]
01:32 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage [production]
01:27 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage [production]
01:20 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage [production]
01:17 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage [production]
01:07 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS bullseye [production]
00:58 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2155.codfw.wmnet with OS bullseye [production]
00:58 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bullseye [production]
00:49 <ebernhardson> T310924 Cleared eqiad chi->omega cross cluster settings and reapplied [production]
00:32 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2157.codfw.wmnet with OS bullseye [production]
00:18 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: host reimage [production]
00:14 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2157.codfw.wmnet with reason: host reimage [production]
2022-06-29
23:56 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2155.codfw.wmnet with OS bullseye [production]
23:55 <pt1979@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2154.codfw.wmnet with OS bullseye [production]
23:55 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2157.codfw.wmnet with OS bullseye [production]
23:53 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS bullseye [production]
23:50 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host db2155.codfw.wmnet with OS bullseye [production]
23:34 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED [production]
23:34 <cmjohnson@cumin1001> START - Cookbook sre.hosts.provision for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED [production]
23:30 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster restart to pickup swift-s3 plugin - bking@cumin1001 - T309648 [production]
23:05 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2155.codfw.wmnet with reason: host reimage [production]
23:01 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2155.codfw.wmnet with reason: host reimage [production]
22:41 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2155.codfw.wmnet with OS bullseye [production]
22:37 <cmjohnson@cumin1001> START - Cookbook sre.hosts.provision for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED [production]
22:34 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudnet1006.mgmt.eqiad.wmnet with reboot policy FORCED [production]
22:31 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1477.eqiad.wmnet with OS buster [production]