production SAL

2025-04-17
20:06	<bking@cumin2002>	END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch2058	[production]
20:06	<bking@cumin2002>	START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch2058	[production]
20:06	<bking@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
20:06	<bking@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic2058 to cirrussearch2058 - bking@cumin2002"	[production]
20:05	<bking@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic2058 to cirrussearch2058 - bking@cumin2002"	[production]
20:02	<vriley@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1181.eqiad.wmnet with reason: host reimage	[production]
20:00	<bking@cumin2002>	START - Cookbook sre.dns.netbox	[production]
20:00	<bking@cumin2002>	START - Cookbook sre.hosts.rename from elastic2058 to cirrussearch2058	[production]
20:00	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1248 (T391056)', diff saved to https://phabricator.wikimedia.org/P75230 and previous config saved to /var/cache/conftool/dbconfig/20250417-200008-fceratto.json	[production]
19:59	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row B - bking@cumin2002 - T388610	[production]
19:59	<bking@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row B - bking@cumin2002 - T388610	[production]
19:58	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row B - bking@cumin2002 - T388610	[production]
19:58	<vriley@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1181.eqiad.wmnet with reason: host reimage	[production]
19:55	<fceratto@cumin1002>	dbctl commit (dc=all): 'Depooling db1248 (T391056)', diff saved to https://phabricator.wikimedia.org/P75229 and previous config saved to /var/cache/conftool/dbconfig/20250417-195506-fceratto.json	[production]
19:54	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1248.eqiad.wmnet with reason: Maintenance	[production]
19:54	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1247 (T391056)', diff saved to https://phabricator.wikimedia.org/P75228 and previous config saved to /var/cache/conftool/dbconfig/20250417-195442-fceratto.json	[production]
19:50	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host an-worker1178.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
19:50	<vriley@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1178.eqiad.wmnet with OS bullseye	[production]
19:44	<vriley@cumin1002>	START - Cookbook sre.hosts.reimage for host an-worker1181.eqiad.wmnet with OS bullseye	[production]
19:43	<vriley@cumin1002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
19:42	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
19:39	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P75226 and previous config saved to /var/cache/conftool/dbconfig/20250417-193935-fceratto.json	[production]
19:36	<vriley@cumin1002>	START - Cookbook sre.hosts.reimage for host an-worker1178.eqiad.wmnet with OS bullseye	[production]
19:35	<vriley@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1178.eqiad.wmnet with OS bullseye	[production]
19:24	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P75225 and previous config saved to /var/cache/conftool/dbconfig/20250417-192430-fceratto.json	[production]
19:22	<vriley@cumin1002>	START - Cookbook sre.hosts.reimage for host an-worker1178.eqiad.wmnet with OS bullseye	[production]
19:21	<vriley@cumin1002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-worker1178.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
19:14	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host an-worker1178.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
19:09	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1247 (T391056)', diff saved to https://phabricator.wikimedia.org/P75223 and previous config saved to /var/cache/conftool/dbconfig/20250417-190923-fceratto.json	[production]
19:03	<fceratto@cumin1002>	dbctl commit (dc=all): 'Depooling db1247 (T391056)', diff saved to https://phabricator.wikimedia.org/P75222 and previous config saved to /var/cache/conftool/dbconfig/20250417-190331-fceratto.json	[production]
19:03	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1247.eqiad.wmnet with reason: Maintenance	[production]
18:59	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1245.eqiad.wmnet with reason: Maintenance	[production]
18:59	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1243 (T391056)', diff saved to https://phabricator.wikimedia.org/P75221 and previous config saved to /var/cache/conftool/dbconfig/20250417-185930-fceratto.json	[production]
18:44	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P75219 and previous config saved to /var/cache/conftool/dbconfig/20250417-184423-fceratto.json	[production]
18:29	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P75218 and previous config saved to /var/cache/conftool/dbconfig/20250417-182916-fceratto.json	[production]
18:14	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1243 (T391056)', diff saved to https://phabricator.wikimedia.org/P75217 and previous config saved to /var/cache/conftool/dbconfig/20250417-181408-fceratto.json	[production]
18:13	<dduvall@deploy1003>	rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.25 refs T386220	[production]
17:56	<fceratto@cumin1002>	dbctl commit (dc=all): 'Depooling db1243 (T391056)', diff saved to https://phabricator.wikimedia.org/P75216 and previous config saved to /var/cache/conftool/dbconfig/20250417-175614-fceratto.json	[production]
17:56	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1243.eqiad.wmnet with reason: Maintenance	[production]
17:55	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1242 (T391056)', diff saved to https://phabricator.wikimedia.org/P75215 and previous config saved to /var/cache/conftool/dbconfig/20250417-175552-fceratto.json	[production]
17:40	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P75214 and previous config saved to /var/cache/conftool/dbconfig/20250417-174046-fceratto.json	[production]
17:25	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P75213 and previous config saved to /var/cache/conftool/dbconfig/20250417-172539-fceratto.json	[production]
17:20	<mutante>	idp-test2005 - 100% disk space used - alerting since over 6 days (is there a point in alerts for test hosts?) - apt-get clean .. brought it back to 94%	[production]
17:12	<bd808@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
17:11	<bd808@deploy1003>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
17:10	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1242 (T391056)', diff saved to https://phabricator.wikimedia.org/P75212 and previous config saved to /var/cache/conftool/dbconfig/20250417-171032-fceratto.json	[production]
17:09	<bd808@deploy1003>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
17:09	<bd808@deploy1003>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
17:09	<bd808@deploy1003>	helmfile [staging] DONE helmfile.d/services/developer-portal: apply	[production]
17:08	<bd808@deploy1003>	helmfile [staging] START helmfile.d/services/developer-portal: apply	[production]