2024-09-12
07:09 |
<jayme@cumin1002> |
START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw |
[production] |
06:58 |
<arnaudb@cumin1002> |
START - Cookbook sre.mysql.clone of db2129.codfw.wmnet onto db2229.codfw.wmnet |
[production] |
06:56 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Cloning db2129 in db2229 for T373579', diff saved to https://phabricator.wikimedia.org/P69015 and previous config saved to /var/cache/conftool/dbconfig/20240912-065641-arnaudb.json |
[production] |
06:55 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 |
[production] |
06:55 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 |
[production] |
06:55 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 |
[production] |
06:55 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 |
[production] |
06:34 |
<jayme@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[2004,2009].codfw.wmnet with reason: Hardware refresh |
[production] |
06:34 |
<jayme@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[2004,2009].codfw.wmnet with reason: Hardware refresh |
[production] |
06:33 |
<jayme> |
evacuating leadership for all partitions assigned to broker id 2004 on kafka-main-codfw - T363210 |
[production] |
06:19 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421 |
[production] |
06:19 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421 |
[production] |
06:16 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance |
[production] |
06:16 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance |
[production] |
06:16 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69014 and previous config saved to /var/cache/conftool/dbconfig/20240912-061639-ladsgroup.json |
[production] |
06:05 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'T374592', diff saved to https://phabricator.wikimedia.org/P69013 and previous config saved to /var/cache/conftool/dbconfig/20240912-060550-arnaudb.json |
[production] |
06:03 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Promote es2038 to es7 primary and set section read-write T374592', diff saved to https://phabricator.wikimedia.org/P69012 and previous config saved to /var/cache/conftool/dbconfig/20240912-060308-arnaudb.json |
[production] |
06:02 |
<arnaudb> |
Starting es7 codfw failover from es2039 to es2038 - T374592 |
[production] |
06:01 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P69011 and previous config saved to /var/cache/conftool/dbconfig/20240912-060131-ladsgroup.json |
[production] |
05:59 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Set es2038 with weight 0 T374592', diff saved to https://phabricator.wikimedia.org/P69010 and previous config saved to /var/cache/conftool/dbconfig/20240912-055903-arnaudb.json |
[production] |
05:58 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es7 T374592 |
[production] |
05:58 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es7 T374592 |
[production] |
05:46 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P69009 and previous config saved to /var/cache/conftool/dbconfig/20240912-054624-ladsgroup.json |
[production] |
05:44 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm |
[production] |
05:31 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69008 and previous config saved to /var/cache/conftool/dbconfig/20240912-053116-ladsgroup.json |
[production] |
04:37 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69007 and previous config saved to /var/cache/conftool/dbconfig/20240912-043701-ladsgroup.json |
[production] |
04:36 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance |
[production] |
04:36 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance |
[production] |
04:36 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69006 and previous config saved to /var/cache/conftool/dbconfig/20240912-043628-ladsgroup.json |
[production] |
04:21 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69005 and previous config saved to /var/cache/conftool/dbconfig/20240912-042121-ladsgroup.json |
[production] |
04:06 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69004 and previous config saved to /var/cache/conftool/dbconfig/20240912-040613-ladsgroup.json |
[production] |
03:51 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69003 and previous config saved to /var/cache/conftool/dbconfig/20240912-035105-ladsgroup.json |
[production] |
02:46 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69002 and previous config saved to /var/cache/conftool/dbconfig/20240912-024635-ladsgroup.json |
[production] |
02:46 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance |
[production] |
02:46 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance |
[production] |
02:46 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P69001 and previous config saved to /var/cache/conftool/dbconfig/20240912-024612-ladsgroup.json |
[production] |
02:31 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P69000 and previous config saved to /var/cache/conftool/dbconfig/20240912-023105-ladsgroup.json |
[production] |
02:15 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P68999 and previous config saved to /var/cache/conftool/dbconfig/20240912-021557-ladsgroup.json |
[production] |
02:00 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P68998 and previous config saved to /var/cache/conftool/dbconfig/20240912-020050-ladsgroup.json |
[production] |
00:58 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P68996 and previous config saved to /var/cache/conftool/dbconfig/20240912-005830-ladsgroup.json |
[production] |
00:58 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
00:58 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
00:58 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68995 and previous config saved to /var/cache/conftool/dbconfig/20240912-005808-ladsgroup.json |
[production] |
00:43 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P68994 and previous config saved to /var/cache/conftool/dbconfig/20240912-004301-ladsgroup.json |
[production] |
00:27 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P68993 and previous config saved to /var/cache/conftool/dbconfig/20240912-002753-ladsgroup.json |
[production] |
00:12 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68992 and previous config saved to /var/cache/conftool/dbconfig/20240912-001246-ladsgroup.json |
[production] |
00:04 |
<eileen> |
civicrm upgraded from 929101dc to ac29ff45 |
[production] |