production SAL

2023-11-15
22:57	<bking@cumin2002>	START - Cookbook sre.puppet.renew-cert for cloudelastic1008.wikimedia.org: Renew puppet certificate - bking@cumin2002	[production]
22:41	<ryankemper@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudelastic[1007-1010].wikimedia.org with reason: new cloudelastic hosts TT351354	[production]
22:41	<ryankemper@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudelastic[1007-1010].wikimedia.org with reason: new cloudelastic hosts TT351354	[production]
22:20	<ryankemper>	T351354 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/974693; running puppet on hosts	[production]
19:39	<topranks>	re-enabling puppet on DNS hosts to adjust TTL setting in BIRD (T350488)	[production]
19:37	<bking@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1010.wikimedia.org with OS bullseye	[production]
19:36	<bking@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1009.wikimedia.org with OS bullseye	[production]
19:34	<bking@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1008.wikimedia.org with OS bullseye	[production]
19:23	<jhuneidi@deploy2002>	Synchronized php: group1 wikis to 1.42.0-wmf.5 refs T350081 (duration: 05m 52s)	[production]
19:17	<jhuneidi@deploy2002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.5 refs T350081	[production]
19:15	<dzahn@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: aphlict	[production]
19:10	<topranks>	merging patch to remove TTL restriction on Bird Anycast BGP peerings (T350488)	[production]
19:09	<dzahn@cumin1001>	START - Cookbook sre.puppet.migrate-role for role: aphlict	[production]
19:07	<taavi@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudlb2001-dev.codfw.wmnet	[production]
19:07	<mutante>	aphlict2001 - restart aphlict service after puppet 7 upgrade	[production]
19:05	<jbond@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::codfw1dev::virt_ceph	[production]
19:01	<taavi@cumin1001>	START - Cookbook sre.puppet.migrate-host for host cloudlb2001-dev.codfw.wmnet	[production]
19:00	<taavi@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudgw2003-dev.codfw.wmnet	[production]
18:59	<jbond@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::codfw1dev::services	[production]
18:59	<dzahn@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host aphlict2001.codfw.wmnet	[production]
18:59	<jbond@cumin1001>	START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::codfw1dev::virt_ceph	[production]
18:58	<jbond@cumin1001>	END (FAIL) - Cookbook sre.puppet.migrate-role (exit_code=99) for role: wmcs::openstack::codfw1dev::virt_ceph	[production]
18:56	<cmooney@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
18:54	<jbond@cumin1001>	START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::codfw1dev::virt_ceph	[production]
18:54	<jbond@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::codfw1dev::net	[production]
18:54	<dzahn@cumin1001>	START - Cookbook sre.puppet.migrate-host for host aphlict2001.codfw.wmnet	[production]
18:54	<taavi@cumin1001>	START - Cookbook sre.puppet.migrate-host for host cloudgw2003-dev.codfw.wmnet	[production]
18:51	<jbond@cumin1001>	START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::codfw1dev::services	[production]
18:49	<taavi@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudgw2002-dev.codfw.wmnet	[production]
18:45	<jbond@cumin1001>	START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::codfw1dev::net	[production]
18:42	<topranks>	Reset BGP to lvs4010 from cr3-ulsfo to validate new config T350488	[production]
18:41	<taavi@cumin1001>	START - Cookbook sre.puppet.migrate-host for host cloudgw2002-dev.codfw.wmnet	[production]
18:36	<topranks>	remove TTL setting on server-facing BGP peerings on cr3-ulsfo T350488	[production]
18:25	<jbond@cumin1001>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::codfw1dev::db	[production]
18:16	<bking@cumin2002>	START - Cookbook sre.hosts.reimage for host cloudelastic1010.wikimedia.org with OS bullseye	[production]
18:15	<bking@cumin2002>	START - Cookbook sre.hosts.reimage for host cloudelastic1009.wikimedia.org with OS bullseye	[production]
18:14	<jbond@cumin1001>	START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::codfw1dev::db	[production]
18:12	<bking@cumin2002>	START - Cookbook sre.hosts.reimage for host cloudelastic1008.wikimedia.org with OS bullseye	[production]
18:05	<arnaudb@cumin1001>	dbctl commit (dc=all): 'Depooling db1141 (T348183)', diff saved to https://phabricator.wikimedia.org/P53488 and previous config saved to /var/cache/conftool/dbconfig/20231115-180503-arnaudb.json	[production]
18:04	<arnaudb@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance	[production]
18:04	<arnaudb@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance	[production]
18:01	<jynus>	All restart_daemons were successful	[production]
18:01	<root@cumin2002>	END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-codfw	[production]
17:57	<bking@cumin1001>	START - Cookbook sre.wdqs.data-reload	[production]
17:57	<bking@cumin1001>	END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)	[production]
17:56	<bking@cumin1001>	START - Cookbook sre.wdqs.data-reload	[production]
17:56	<root@cumin2002>	START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-codfw	[production]
17:52	<inflatador>	bking@wdqs1024 reboot host to hopefully reduce data reload failures T349011	[production]
17:51	<bking@cumin1001>	END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)	[production]
17:29	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019,lvs2013} and A:lvs (T349796)	[production]