production SAL

2023-09-11
17:59	<eevans@cumin1001>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - eevans@cumin1001"	[production]
17:58	<eevans@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - eevans@cumin1001"	[production]
17:53	<denisse@cumin1001>	START - Cookbook sre.hosts.reimage for host netmon2002.wikimedia.org with OS bullseye	[production]
17:43	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance	[production]
17:43	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance	[production]
17:43	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2096 (T337310)', diff saved to https://phabricator.wikimedia.org/P52427 and previous config saved to /var/cache/conftool/dbconfig/20230911-174321-ladsgroup.json	[production]
17:28	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2096', diff saved to https://phabricator.wikimedia.org/P52426 and previous config saved to /var/cache/conftool/dbconfig/20230911-172815-ladsgroup.json	[production]
17:13	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2096', diff saved to https://phabricator.wikimedia.org/P52425 and previous config saved to /var/cache/conftool/dbconfig/20230911-171309-ladsgroup.json	[production]
17:09	<eevans@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1030.eqiad.wmnet with reason: host reimage	[production]
17:06	<eevans@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1030.eqiad.wmnet with reason: host reimage	[production]
16:59	<jclark@cumin1001>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kubernetes1027.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:59	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host kubernetes1029.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:58	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2096 (T337310)', diff saved to https://phabricator.wikimedia.org/P52424 and previous config saved to /var/cache/conftool/dbconfig/20230911-165802-ladsgroup.json	[production]
16:57	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host kubernetes1027.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:48	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1055.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:48	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1056.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:48	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1054.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:46	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2096 (T337310)', diff saved to https://phabricator.wikimedia.org/P52423 and previous config saved to /var/cache/conftool/dbconfig/20230911-164249-ladsgroup.json	[production]
16:46	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2096.codfw.wmnet with reason: Maintenance	[production]
16:42	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db2096.codfw.wmnet with reason: Maintenance	[production]
16:41	<eevans@cumin1001>	START - Cookbook sre.hosts.reimage for host restbase1030.eqiad.wmnet with OS bullseye	[production]
16:32	<fnegri@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2005-dev.codfw.wmnet with reason: host reimage	[production]
16:31	<denisse@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host netmon2002.wikimedia.org with OS bookworm	[production]
16:28	<fnegri@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2005-dev.codfw.wmnet with reason: host reimage	[production]
16:18	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host kubernetes1056.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:17	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host kubernetes1054.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:17	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host kubernetes1055.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:16	<eevans@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1030.eqiad.wmnet with OS bullseye	[production]
16:12	<brouberol@cumin1001>	END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1152.eqiad.wmnet	[production]
16:10	<brouberol@cumin1001>	START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1152.eqiad.wmnet	[production]
16:10	<eevans@cumin1001>	START - Cookbook sre.hosts.reimage for host restbase1030.eqiad.wmnet with OS bullseye	[production]
16:08	<fnegri@cumin1001>	START - Cookbook sre.hosts.reimage for host cloudnet2005-dev.codfw.wmnet with OS bookworm	[production]
16:07	<brouberol@cumin1001>	END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1151.eqiad.wmnet	[production]
16:06	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1047.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:06	<denisse@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netmon2002.wikimedia.org with reason: host reimage	[production]
16:05	<brouberol@cumin1001>	START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1151.eqiad.wmnet	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1050.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1052.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1051.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1049.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1046.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1053.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1048.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:04	<jclark@cumin1001>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes1045.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
16:03	<denisse@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on netmon2002.wikimedia.org with reason: host reimage	[production]
16:01	<brouberol@cumin1001>	END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1150.eqiad.wmnet	[production]
16:00	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance	[production]
16:00	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance	[production]
15:59	<brouberol@cumin1001>	START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1150.eqiad.wmnet	[production]
15:48	<jclark@cumin1001>	START - Cookbook sre.hosts.provision for host kubernetes1047.mgmt.eqiad.wmnet with reboot policy FORCED	[production]