2024-08-15
10:29 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1125.eqiad.wmnet with reason: Upgrade to 10.6.19 [production]
10:28 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on db1125.eqiad.wmnet with reason: Upgrade to 10.6.19 [production]
10:28 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc1014.eqiad.wmnet with reason: Upgrade to 10.6.19 [production]
10:28 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on pc1014.eqiad.wmnet with reason: Upgrade to 10.6.19 [production]
10:27 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2014.codfw.wmnet with reason: Upgrade to 10.6.19 [production]
10:27 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on pc2014.codfw.wmnet with reason: Upgrade to 10.6.19 [production]
10:27 <marostegui> Install 10.6.19 on pc1014 db1125 pc2014 T372536 [production]
10:26 <marostegui@cumin1002> dbctl commit (dc=all): 'db1238 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P67336 and previous config saved to /var/cache/conftool/dbconfig/20240815-102645-root.json [production]
10:21 <klausman@deploy1003> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'. [production]
10:19 <klausman@deploy1003> helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'. [production]
10:18 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage [production]
10:15 <jayme@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage [production]
10:11 <marostegui@cumin1002> dbctl commit (dc=all): 'db1238 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P67335 and previous config saved to /var/cache/conftool/dbconfig/20240815-101139-root.json [production]
09:55 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye [production]
09:27 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2152.codfw.wmnet with reason: Schema change [production]
09:27 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2152.codfw.wmnet with reason: Schema change [production]
09:25 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2152 (T367856)', diff saved to https://phabricator.wikimedia.org/P67334 and previous config saved to /var/cache/conftool/dbconfig/20240815-092502-marostegui.json [production]
09:24 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 6:00:00 on db2152.codfw.wmnet with reason: Maintenance [production]
09:24 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 6:00:00 on db2152.codfw.wmnet with reason: Maintenance [production]
08:55 <jayme@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS bullseye [production]
08:04 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye [production]
08:00 <jayme@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS bullseye [production]
07:47 <jayme@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye [production]
07:31 <ryankemper@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 10:00:00 on 9 hosts with reason: T364368 non-prod hosts [production]
07:31 <ryankemper@cumin2002> START - Cookbook sre.hosts.downtime for 3 days, 10:00:00 on 9 hosts with reason: T364368 non-prod hosts [production]
07:09 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye [production]
06:37 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P67333 and previous config saved to /var/cache/conftool/dbconfig/20240815-063734-root.json [production]
06:22 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P67332 and previous config saved to /var/cache/conftool/dbconfig/20240815-062229-root.json [production]
06:07 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P67331 and previous config saved to /var/cache/conftool/dbconfig/20240815-060723-root.json [production]
05:52 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P67330 and previous config saved to /var/cache/conftool/dbconfig/20240815-055218-root.json [production]
05:37 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P67329 and previous config saved to /var/cache/conftool/dbconfig/20240815-053712-root.json [production]
05:22 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P67328 and previous config saved to /var/cache/conftool/dbconfig/20240815-052206-root.json [production]
05:07 <marostegui@cumin1002> dbctl commit (dc=all): 'db1223 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P67327 and previous config saved to /var/cache/conftool/dbconfig/20240815-050701-root.json [production]
05:06 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db1223 T372393', diff saved to https://phabricator.wikimedia.org/P67326 and previous config saved to /var/cache/conftool/dbconfig/20240815-050613-root.json [production]
05:04 <marostegui@cumin1002> dbctl commit (dc=all): 'Promote db1189 to s3 primary and set section read-write T372393', diff saved to https://phabricator.wikimedia.org/P67325 and previous config saved to /var/cache/conftool/dbconfig/20240815-050428-root.json [production]
05:04 <marostegui@cumin1002> dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - T372393', diff saved to https://phabricator.wikimedia.org/P67324 and previous config saved to /var/cache/conftool/dbconfig/20240815-050410-root.json [production]
05:03 <marostegui> Starting s3 eqiad failover from db1223 to db1189 - T372393 [production]
04:55 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Stop MariaDB on db1238 T371342 [production]
04:55 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Stop MariaDB on db1238 T371342 [production]
04:49 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s3 T372393 [production]
04:49 <marostegui@cumin1002> dbctl commit (dc=all): 'Set db1189 with weight 0 T372393', diff saved to https://phabricator.wikimedia.org/P67323 and previous config saved to /var/cache/conftool/dbconfig/20240815-044929-root.json [production]
04:49 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s3 T372393 [production]
03:26 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
03:26 <pt1979@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix mgmt DNS fro fd2004 - pt1979@cumin2002" [production]
03:26 <pt1979@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix mgmt DNS fro fd2004 - pt1979@cumin2002" [production]
03:22 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
02:24 <milimetric@deploy1003> Finished deploy [airflow-dags/analytics@02f37cf]: (no justification provided) (duration: 00m 43s) [production]
02:23 <milimetric@deploy1003> Started deploy [airflow-dags/analytics@02f37cf]: (no justification provided) [production]
2024-08-14
23:34 <ebernhardson@deploy1003> helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
23:33 <ebernhardson@deploy1003> helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply [production]