production SAL

2023-05-08
19:12	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: rebooting to help with lag	[production]
19:12	<bking@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on 19 hosts with reason: rebooting to help with lag	[production]
19:01	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P47931 and previous config saved to /var/cache/conftool/dbconfig/20230508-190100-ladsgroup.json	[production]
18:45	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P47930 and previous config saved to /var/cache/conftool/dbconfig/20230508-184554-ladsgroup.json	[production]
18:30	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1168 (T335845)', diff saved to https://phabricator.wikimedia.org/P47929 and previous config saved to /var/cache/conftool/dbconfig/20230508-183048-ladsgroup.json	[production]
18:23	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1168 (T335845)', diff saved to https://phabricator.wikimedia.org/P47928 and previous config saved to /var/cache/conftool/dbconfig/20230508-182350-ladsgroup.json	[production]
18:23	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance	[production]
18:23	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance	[production]
18:23	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165 (T335845)', diff saved to https://phabricator.wikimedia.org/P47927 and previous config saved to /var/cache/conftool/dbconfig/20230508-182327-ladsgroup.json	[production]
18:09	<otto@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
18:08	<otto@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
18:08	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P47926 and previous config saved to /var/cache/conftool/dbconfig/20230508-180820-ladsgroup.json	[production]
18:07	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance	[production]
18:07	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance	[production]
18:04	<sukhe@deploy1002>	Unlocked for deployment [ALL REPOSITORIES]: LVS reimaging in codfw, blocking deploys T326767 (duration: 113m 03s)	[production]
18:03	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafka-logging2002.codfw.wmnet	[production]
18:02	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance	[production]
18:02	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance	[production]
18:02	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1214 (T335845)', diff saved to https://phabricator.wikimedia.org/P47925 and previous config saved to /var/cache/conftool/dbconfig/20230508-180239-ladsgroup.json	[production]
17:57	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kafka-logging2002.codfw.wmnet	[production]
17:54	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafka-logging2001.codfw.wmnet	[production]
17:53	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P47923 and previous config saved to /var/cache/conftool/dbconfig/20230508-175314-ladsgroup.json	[production]
17:51	<sukhe>	set routing-options static route 208.80.153.224/28 [high-traffic1, codfw] next-hop 10.192.0.29: T326767	[production]
17:48	<herron@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kafka-logging2001.codfw.wmnet	[production]
17:48	<sukhe>	restart pybal on lvs2011 to pick up bgp med change: T326767	[production]
17:47	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P47922 and previous config saved to /var/cache/conftool/dbconfig/20230508-174732-ladsgroup.json	[production]
17:39	<sukhe>	homer "cr-codfw" commit "Gerrit: 914871 add new LVS host lvs2011": T326767	[production]
17:38	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1165 (T335845)', diff saved to https://phabricator.wikimedia.org/P47920 and previous config saved to /var/cache/conftool/dbconfig/20230508-173808-ladsgroup.json	[production]
17:38	<volans>	installed spicerack 7.0.0 on cumin1001	[production]
17:36	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1001.eqiad.wmnet	[production]
17:36	<bking@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
17:36	<bking@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001"	[production]
17:35	<sukhe@cumin2002>	END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2011	[production]
17:35	<sukhe@cumin2002>	START - Cookbook sre.network.configure-switch-interfaces for host lvs2011	[production]
17:33	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet	[production]
17:33	<sukhe@cumin2002>	START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet	[production]
17:32	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P47919 and previous config saved to /var/cache/conftool/dbconfig/20230508-173226-ladsgroup.json	[production]
17:31	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1165 (T335845)', diff saved to https://phabricator.wikimedia.org/P47918 and previous config saved to /var/cache/conftool/dbconfig/20230508-173152-ladsgroup.json	[production]
17:31	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
17:31	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
17:31	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance	[production]
17:31	<bking@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - bking@cumin1001"	[production]
17:31	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance	[production]
17:31	<stevemunene@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1132.eqiad.wmnet with OS buster	[production]
17:29	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin2002.codfw.wmnet with reason: test spicerack v7.0.0	[production]
17:29	<volans@cumin2002>	START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2002.codfw.wmnet with reason: test spicerack v7.0.0	[production]
17:28	<volans>	installed spicerack 7.0.0 on cumin2002	[production]
17:28	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudswift1002.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
17:28	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance	[production]
17:27	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudswift1001.mgmt.eqiad.wmnet with reboot policy FORCED	[production]