2024-05-06
06:37 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage [production]
06:35 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage [production]
06:30 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance [production]
06:30 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2102.codfw.wmnet with reason: Maintenance [production]
06:30 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS bookworm [production]
06:28 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db1193', diff saved to https://phabricator.wikimedia.org/P61871 and previous config saved to /var/cache/conftool/dbconfig/20240506-062814-root.json [production]
06:17 <sfaci@deploy1002> helmfile [staging] DONE helmfile.d/services/editor-analytics: apply [production]
06:17 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS bookworm [production]
06:14 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2165 T363977', diff saved to https://phabricator.wikimedia.org/P61870 and previous config saved to /var/cache/conftool/dbconfig/20240506-061416-root.json [production]
06:13 <marostegui@cumin1002> dbctl commit (dc=all): 'Promote db2161 to s8 primary T363977', diff saved to https://phabricator.wikimedia.org/P61869 and previous config saved to /var/cache/conftool/dbconfig/20240506-061311-marostegui.json [production]
06:12 <marostegui> Starting s8 codfw failover from db2165 to db2161 - T363977 [production]
06:07 <sfaci@deploy1002> helmfile [staging] START helmfile.d/services/editor-analytics: apply [production]
05:50 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977 [production]
05:50 <marostegui@cumin1002> dbctl commit (dc=all): 'Set db2161 with weight 0 T363977', diff saved to https://phabricator.wikimedia.org/P61868 and previous config saved to /var/cache/conftool/dbconfig/20240506-055013-root.json [production]
05:50 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T363977 [production]
05:26 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance [production]
05:26 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2165.codfw.wmnet with reason: Maintenance [production]
2024-05-05
11:09 <brennen@deploy1002> Finished deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271 (duration: 00m 32s) [production]
11:08 <brennen@deploy1002> Started deploy [phabricator/deployment@dd53761]: test deploy phab1004 for T364271 [production]
11:08 <brennen@deploy1002> Finished deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271 (duration: 00m 32s) [production]
11:07 <brennen@deploy1002> Started deploy [phabricator/deployment@dd53761]: test deploy phab2002 for T364271 [production]
11:04 <taavi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things [production]
11:03 <taavi@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on phab.wmfusercontent.org with reason: brennen is deploying things [production]
11:03 <taavi@cumin1002> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things [production]
11:03 <taavi@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on phabricator.wikimedia.org with reason: brennen is deploying things [production]
11:03 <taavi@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things [production]
11:03 <taavi@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on phab1004.eqiad.wmnet with reason: brennen is deploying things [production]
08:42 <taavi> taavi@gerrit1003 ~ $ sudo systemctl restart apache2 [production]
2024-05-04
13:41 <jayme> doubled the number of eventgate-main replicas in eqiad to 16 [production]
07:39 <taavi@cumin1002> END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0) [production]
07:33 <taavi@cumin1002> START - Cookbook sre.wikireplicas.update-views [production]
03:07 <denisse> Restarting `status curator_actions_cluster_wide.service` to log with DEBUG level on logstash2026 - T364190 [production]
03:06 <denisse> Enable log level DEBUG for curator on logstash2026 - T364190 [production]
01:33 <bblack@cumin1002> conftool action : set/weight=100; selector: name=dns7.* [production]
01:24 <bblack> lvs7001 - restart pybal [production]
01:23 <bblack> lvs7003 - restart pybal [production]
2024-05-03
21:38 <ryankemper@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920 [production]
21:38 <ryankemper@cumin2002> START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on wdqs2023.codfw.wmnet with reason: T362920 [production]
21:27 <ryankemper> T362920 [wdqs] Depooled `wdqs2023` in preparation to switch it to a graph split host [production]
19:02 <sukhe> cleaning up stale confd template files for magru related reimaging [production]
18:44 <brett@cumin2002> conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx [production]
18:43 <brett@cumin2002> conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx [production]
18:38 <brett@cumin2002> conftool action : set/pooled=no; selector: name=ncredir7001.magru.wmnet,service=nginx [production]
18:38 <brett@cumin2002> conftool action : set/pooled=no; selector: name=ncredir7002.magru.wmnet,service=nginx [production]
18:29 <brett@cumin2002> conftool action : set/pooled=yes; selector: name=ncredir7002.magru.wmnet,service=nginx [production]
18:29 <brett@cumin2002> conftool action : set/weight=1; selector: name=ncredir7002.magru.wmnet,service=nginx [production]
18:29 <brett@cumin2002> conftool action : set/pooled=yes; selector: name=ncredir7001.magru.wmnet,service=nginx [production]
18:28 <brett@cumin2002> conftool action : set/weight=1; selector: name=ncredir7001.magru.wmnet,service=nginx [production]
17:45 <dcausse> repooling wdqs1012 [production]
17:27 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet with reason: Maintenance [production]