2022-06-30
07:15 <marostegui@cumin1001> dbctl commit (dc=all): 'db1103 (re)pooling @ 1%: After reimage', diff saved to https://phabricator.wikimedia.org/P30641 and previous config saved to /var/cache/conftool/dbconfig/20220630-071526-root.json [production]
07:15 <marostegui@cumin1001> dbctl commit (dc=all): 'db1103 weight', diff saved to https://phabricator.wikimedia.org/P30640 and previous config saved to /var/cache/conftool/dbconfig/20220630-071522-marostegui.json [production]
07:11 <marostegui@cumin1001> dbctl commit (dc=all): 'db1173 (re)pooling @ 2%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P30639 and previous config saved to /var/cache/conftool/dbconfig/20220630-071125-root.json [production]
06:51 <marostegui@cumin1001> dbctl commit (dc=all): 'db1173 (re)pooling @ 2%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P30636 and previous config saved to /var/cache/conftool/dbconfig/20220630-065126-root.json [production]
06:37 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1103.eqiad.wmnet with reason: host reimage [production]
06:36 <marostegui@cumin1001> dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: After on-site maintenance', diff saved to https://phabricator.wikimedia.org/P30635 and previous config saved to /var/cache/conftool/dbconfig/20220630-063622-root.json [production]
06:33 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1103.eqiad.wmnet with reason: host reimage [production]
06:06 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db1120 to x1 primary and set section read-write T300472', diff saved to https://phabricator.wikimedia.org/P30633 and previous config saved to /var/cache/conftool/dbconfig/20220630-060601-root.json [production]
06:03 <marostegui> Starting x1 eqiad failover from db1103 to db1120 - T300472 [production]
05:23 <eileen> civicrm upgraded from 9e5a5310 to 55bc690b [production]
05:17 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1120 with weight 0 T300472', diff saved to https://phabricator.wikimedia.org/P30632 and previous config saved to /var/cache/conftool/dbconfig/20220630-051730-root.json [production]
05:17 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T300472 [production]
05:17 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T300472 [production]
02:59 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2160.codfw.wmnet with OS bullseye [production]
02:58 <eileen> civicrm upgraded from f48fe112 to 9e5a5310 [production]
02:50 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) [production]
02:49 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:49 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) [production]
02:48 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:48 <bmansurov@deploy1002> deploy aborted: (no justification provided) (duration: 00m 02s) [production]
02:48 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:47 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 03s) [production]
02:47 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:18 <bmansurov@deploy1002> Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 08s) [production]
02:18 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:17 <bmansurov@deploy1002> deploy aborted: (no justification provided) (duration: 02m 03s) [production]
02:15 <bmansurov@deploy1002> Started deploy [airflow-dags/research@b3fe77c]: (no justification provided) [production]
02:11 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS bullseye [production]
01:48 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS bullseye [production]
01:36 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bullseye [production]
01:34 <eileen> civicrm upgraded from 3cb5e6dd to f48fe112 [production]
01:32 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage [production]
01:27 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage [production]
01:20 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage [production]
01:17 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage [production]
01:07 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS bullseye [production]
00:58 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2155.codfw.wmnet with OS bullseye [production]
00:58 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS bullseye [production]
00:49 <ebernhardson> T310924 Cleared eqiad chi->omega cross cluster settings and reapplied [production]
00:32 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2157.codfw.wmnet with OS bullseye [production]
00:18 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2157.codfw.wmnet with reason: host reimage [production]
00:14 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2157.codfw.wmnet with reason: host reimage [production]
2022-06-29
23:56 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2155.codfw.wmnet with OS bullseye [production]
23:55 <pt1979@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2154.codfw.wmnet with OS bullseye [production]
23:55 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2157.codfw.wmnet with OS bullseye [production]
23:53 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS bullseye [production]
23:50 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host db2155.codfw.wmnet with OS bullseye [production]
23:34 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED [production]
23:34 <cmjohnson@cumin1001> START - Cookbook sre.hosts.provision for host stat1009.mgmt.eqiad.wmnet with reboot policy FORCED [production]
23:30 <bking@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster restart to pickup swift-s3 plugin - bking@cumin1001 - T309648 [production]