2024-09-12 §
07:10 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance [production]
07:09 <jayme@cumin1002> START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw [production]
06:58 <arnaudb@cumin1002> START - Cookbook sre.mysql.clone of db2129.codfw.wmnet onto db2229.codfw.wmnet [production]
06:56 <arnaudb@cumin1002> dbctl commit (dc=all): 'Cloning db2129 in db2229 for T373579', diff saved to https://phabricator.wikimedia.org/P69015 and previous config saved to /var/cache/conftool/dbconfig/20240912-065641-arnaudb.json [production]
06:55 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 [production]
06:55 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 [production]
06:55 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 [production]
06:55 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: provisionning db2229.codfw.wmnet - T373579 [production]
06:34 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kafka-main[2004,2009].codfw.wmnet with reason: Hardware refresh [production]
06:34 <jayme@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kafka-main[2004,2009].codfw.wmnet with reason: Hardware refresh [production]
06:33 <jayme> evacuating leadership for all partitions assigned to broker id 2004 on kafka-main-codfw - T363210 [production]
06:19 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421 [production]
06:19 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s3 T374421 [production]
06:16 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance [production]
06:16 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance [production]
06:16 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69014 and previous config saved to /var/cache/conftool/dbconfig/20240912-061639-ladsgroup.json [production]
06:05 <arnaudb@cumin1002> dbctl commit (dc=all): 'T374592', diff saved to https://phabricator.wikimedia.org/P69013 and previous config saved to /var/cache/conftool/dbconfig/20240912-060550-arnaudb.json [production]
06:03 <arnaudb@cumin1002> dbctl commit (dc=all): 'Promote es2038 to es7 primary and set section read-write T374592', diff saved to https://phabricator.wikimedia.org/P69012 and previous config saved to /var/cache/conftool/dbconfig/20240912-060308-arnaudb.json [production]
06:02 <arnaudb> Starting es7 codfw failover from es2039 to es2038 - T374592 [production]
06:01 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P69011 and previous config saved to /var/cache/conftool/dbconfig/20240912-060131-ladsgroup.json [production]
05:59 <arnaudb@cumin1002> dbctl commit (dc=all): 'Set es2038 with weight 0 T374592', diff saved to https://phabricator.wikimedia.org/P69010 and previous config saved to /var/cache/conftool/dbconfig/20240912-055903-arnaudb.json [production]
05:58 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es7 T374592 [production]
05:58 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es7 T374592 [production]
05:46 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P69009 and previous config saved to /var/cache/conftool/dbconfig/20240912-054624-ladsgroup.json [production]
05:44 <arnaudb@cumin1002> START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm [production]
05:31 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69008 and previous config saved to /var/cache/conftool/dbconfig/20240912-053116-ladsgroup.json [production]
04:37 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db2188 (T371742)', diff saved to https://phabricator.wikimedia.org/P69007 and previous config saved to /var/cache/conftool/dbconfig/20240912-043701-ladsgroup.json [production]
04:36 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance [production]
04:36 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance [production]
04:36 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69006 and previous config saved to /var/cache/conftool/dbconfig/20240912-043628-ladsgroup.json [production]
04:21 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69005 and previous config saved to /var/cache/conftool/dbconfig/20240912-042121-ladsgroup.json [production]
04:06 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P69004 and previous config saved to /var/cache/conftool/dbconfig/20240912-040613-ladsgroup.json [production]
03:51 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69003 and previous config saved to /var/cache/conftool/dbconfig/20240912-035105-ladsgroup.json [production]
02:46 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db2176 (T371742)', diff saved to https://phabricator.wikimedia.org/P69002 and previous config saved to /var/cache/conftool/dbconfig/20240912-024635-ladsgroup.json [production]
02:46 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance [production]
02:46 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance [production]
02:46 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P69001 and previous config saved to /var/cache/conftool/dbconfig/20240912-024612-ladsgroup.json [production]
02:31 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P69000 and previous config saved to /var/cache/conftool/dbconfig/20240912-023105-ladsgroup.json [production]
02:15 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P68999 and previous config saved to /var/cache/conftool/dbconfig/20240912-021557-ladsgroup.json [production]
02:00 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P68998 and previous config saved to /var/cache/conftool/dbconfig/20240912-020050-ladsgroup.json [production]
00:58 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db2174 (T371742)', diff saved to https://phabricator.wikimedia.org/P68996 and previous config saved to /var/cache/conftool/dbconfig/20240912-005830-ladsgroup.json [production]
00:58 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance [production]
00:58 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance [production]
00:58 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68995 and previous config saved to /var/cache/conftool/dbconfig/20240912-005808-ladsgroup.json [production]
00:43 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P68994 and previous config saved to /var/cache/conftool/dbconfig/20240912-004301-ladsgroup.json [production]
00:27 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P68993 and previous config saved to /var/cache/conftool/dbconfig/20240912-002753-ladsgroup.json [production]
00:12 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2173 (T371742)', diff saved to https://phabricator.wikimedia.org/P68992 and previous config saved to /var/cache/conftool/dbconfig/20240912-001246-ladsgroup.json [production]
00:04 <eileen> civicrm upgraded from 929101dc to ac29ff45 [production]
2024-09-11 §
23:19 <jhancock@cumin2002> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART [production]
23:18 <jhancock@cumin2002> START - Cookbook sre.hosts.provision for host ganeti1041.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART [production]