production SAL

2023-03-07
14:26	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
14:25	<akosiaris@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:25	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
14:24	<akosiaris@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:24	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
14:21	<akosiaris@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:21	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
14:21	<akosiaris@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:20	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
14:20	<topranks>	issuing reboot to upgrade asw2-a-eqiad virtual-chassis to Junos 21.4	[production]
14:20	<akosiaris@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:19	<akosiaris@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
14:19	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1038']	[production]
14:17	<akosiaris@cumin1001>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubernetes1020.eqiad.wmnet with OS bullseye	[production]
14:16	<cmooney@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-eqiad with reason: eqiad row A upgrade	[production]
14:16	<cmooney@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mr1-eqiad with reason: eqiad row A upgrade	[production]
14:15	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1037']	[production]
14:13	<akosiaris>	kubectl cordon kubernetes{1005,1007,1008,1017,1018}.eqiad.wmnet T329073	[production]
14:13	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2070.codfw.wmnet with OS bullseye	[production]
14:12	<mvernon@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin1001"	[production]
14:09	<cmjohnson@cumin1001>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1038']	[production]
14:09	<cmooney@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 238 hosts with reason: eqiad row A upgrade	[production]
14:09	<cmjohnson@cumin1001>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1038']	[production]
14:09	<cmjohnson@cumin1001>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1038']	[production]
14:08	<akosiaris@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: host reimage	[production]
14:08	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: host reimage	[production]
14:07	<cmjohnson@cumin1001>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1037']	[production]
14:07	<cmooney@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on 238 hosts with reason: eqiad row A upgrade	[production]
14:05	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: name=restbase1031.eqiad.wmnet	[production]
14:05	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: name=restbase102[18].eqiad.wmnet	[production]
14:05	<hnowlan@puppetmaster1001>	conftool action : set/pooled=no; selector: name=restbase101[69].eqiad.wmnet	[production]
14:02	<mvernon@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin1001"	[production]
13:59	<jbond>	failover pki.discovery.wmnet to codfw T329073	[production]
13:58	<Emperor>	depool thanos-fe1001 T329073	[production]
13:55	<Emperor>	depool ms-fe1009 T329073	[production]
13:55	<Emperor>	depool moss-fe1001 T329073	[production]
13:54	<akosiaris@cumin1001>	START - Cookbook sre.hosts.reimage for host kubernetes1020.eqiad.wmnet with OS bullseye	[production]
13:50	<moritzm>	disabling Puppet in eqiad/esams/drmrs for forthcoming Switch maintenance T329073	[production]
13:50	<topranks>	staging Junos files to individual VC members eqiad row A to prep for upgrade	[production]
13:15	<otto@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
13:15	<otto@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
13:14	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1019.eqiad.wmnet with OS bullseye	[production]
13:04	<moritzm>	drain ganeti1011 for eventual reimage to Bullseye T311687	[production]
13:00	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1018.eqiad.wmnet with OS bullseye	[production]
12:57	<sukhe>	removing dns1001 from authdns_servers for T329073	[production]
12:55	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: host reimage	[production]
12:52	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: host reimage	[production]
12:44	<akosiaris@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: host reimage	[production]
12:41	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: host reimage	[production]
12:38	<akosiaris@cumin1001>	START - Cookbook sre.hosts.reimage for host kubernetes1019.eqiad.wmnet with OS bullseye	[production]