production SAL


2020-11-16
22:19	<otto@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary'.	[production]
22:17	<otto@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production'.	[production]
22:09	<otto@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production'.	[production]
22:06	<mutante>	planet - fixed updates of uk.planet, which failed due to non-ASCII chars in a URL; since updates are systemd timers now, that affects the entire systemd state monitoring	[production]
21:40	<rzl@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet	[production]
21:40	<rzl@cumin1001>	conftool action : set/weight=1; selector: name=mw2250.codfw.wmnet,cluster=videoscaler,service=canary	[production]
21:38	<rzl@cumin1001>	conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet,cluster=jobrunner	[production]
21:30	<mutante>	peek2001 - mv /var/lib/peek/git to git.old ; run puppet ; let it fix git checkout	[production]
21:07	<rzl>	disable puppet on jobrunners T264991	[production]
20:40	<mutante>	planet1002/planet2002 - delete entire crontab of user planet, drop update cronjobs after switching to systemd timers with gerrit:636105 (T265138)	[production]
20:06	<pt1979@cumin2001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
20:06	<mutante>	releases2002 systemctl reset-failed should clear Icinga systemd alert after gerrit:641228	[production]
20:05	<dwisehaupt>	disabling process-control jobs and moving to maintenance mode for maint window	[production]
19:57	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
19:53	<ebernhardson@deploy1001>	Finished deploy [wikimedia/discovery/analytics@4a953ca]: query_clicks_hourly: handle wmf.webrequest page_id change from int to bigint (duration: 02m 27s)	[production]
19:51	<ebernhardson@deploy1001>	Started deploy [wikimedia/discovery/analytics@4a953ca]: query_clicks_hourly: handle wmf.webrequest page_id change from int to bigint	[production]
19:48	<effie>	disable puppet on parsoid servers - T264991	[production]
19:01	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)	[production]
18:59	<mutante>	mw2255 - is pooled and puppet works on next run, after it removed php 7.2 config files	[production]
18:56	<mutante>	running puppet on mw2313 and mw2255 which were listed in puppetboard as failed puppet runs	[production]
18:15	<rzl>	disable puppet on 'A:mw-api and not A:mw-api-canary' T264991	[production]
18:05	<effie>	disable puppet on all appservers	[production]
17:48	<elukey>	enable and run puppet on kafka-main2003 (it will start kafka services) - T267865	[production]
17:42	<dwisehaupt>	frmon1001 upgraded to buster	[production]
17:36	<volans>	moved interfaces in Netbox from old to new switch - T267865	[production]
17:24	<vgutierrez>	switching back from lvs2010 to lvs2007 - T267865	[production]
17:21	<vgutierrez>	repooling cp2037 and cp2038 - T267865	[production]
16:46	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)	[production]
16:40	<elukey@cumin1001>	START - Cookbook sre.hosts.decommission	[production]
16:16	<XioNoX>	update c7 serial in row C VC config - T267865	[production]
16:16	<rzl>	disable puppet on A:mw-api-canary T264991	[production]
16:14	<hnowlan@cumin1001>	START - Cookbook sre.cassandra.roll-restart	[production]
16:08	<effie>	disable puppet in appservers canaries to install ICU 63 - T264991	[production]
16:07	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet	[production]
16:07	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=no; selector: name=cp2037.codfw.wmnet	[production]
16:06	<hnowlan@cumin1001>	END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99)	[production]
16:03	<hnowlan>	joined maps2006 to maps codfw cassandra cluster	[production]
16:01	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)	[production]
15:57	<hnowlan@cumin1001>	START - Cookbook sre.cassandra.roll-restart	[production]
15:57	<hnowlan>	roll-restarting eqiad restbase for java security updates	[production]
15:56	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)	[production]
15:50	<elukey@cumin1001>	START - Cookbook sre.hosts.decommission	[production]
15:40	<cdanis@cumin1001>	END (PASS) - Cookbook sre.network.cf (exit_code=0)	[production]
15:40	<cdanis@cumin1001>	START - Cookbook sre.network.cf	[production]
14:16	<hnowlan@cumin1001>	START - Cookbook sre.cassandra.roll-restart	[production]
14:12	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool pc1007 in pc1 after restarting mysql T266483 (duration: 00m 59s)	[production]
14:06	<marostegui>	Restart pc1007's mysql T266483	[production]
14:06	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Depool pc1007 and place pc1010 instead of it T266483 (duration: 01m 00s)	[production]
13:23	<hnowlan@cumin1001>	END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99)	[production]
13:00	<kormat>	running schema change against s1 in codfw T259831	[production]