2020-05-08
21:33 <bstorm_> cleaning up wb_terms_no_longer_updated view on labsdb1009 T251598 [production]
21:06 <ottomata> running preferred replica election for kafka-jumbo to get preferred leaders back after reboot of broker earlier today - T252203 [production]
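For context, a preferred replica election on a Kafka cluster of this era (before KIP-460's `kafka-leader-election` tool) is typically triggered with the upstream admin script; the ZooKeeper host below is a placeholder, not the real kafka-jumbo coordinate:

```shell
# Trigger a preferred replica election for all partitions. After a broker
# restart, leadership stays on the failover replicas; this moves it back
# to each partition's first assigned ("preferred") replica.
# The ZooKeeper address is hypothetical.
kafka-preferred-replica-election.sh --zookeeper zk1001.example.org:2181
```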
19:16 <jhuneidi@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production'. [production]
19:12 <jhuneidi@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production'. [production]
19:07 <jhuneidi@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging'. [production]
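The three syncs above follow the usual staging-then-production rollout, one helmfile environment at a time. A minimal sketch using upstream helmfile syntax; the deployment directory path is an assumption, not necessarily WMF's exact layout:

```shell
# From the service's deployment directory, sync the release defined for
# each environment in turn. Environment names mirror the log entries;
# the directory path is hypothetical.
cd /srv/deployment-charts/helmfile.d/services/blubberoid
helmfile -e staging sync   # verify on staging first
helmfile -e codfw sync     # then each production datacenter
helmfile -e eqiad sync
```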
18:12 <andrewbogott> reprepro copy buster-wikimedia stretch-wikimedia prometheus-openstack-exporter for T252121 [production]
17:59 <marostegui> Extend /srv by 500G on labsdb1011 T249188 [production]
16:55 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:53 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
16:51 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
16:48 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
16:39 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:37 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
16:14 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:12 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
15:43 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:41 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
15:36 <ottomata> starting kafka broker on kafka-jumbo1006, same issue on other brokers when they are leaders of offending partitions - T252203 [production]
15:31 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:28 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
15:27 <ottomata> stopping kafka broker on kafka-jumbo1006 to investigate camus import failures - T252203 [production]
14:50 <otto@deploy1001> Finished deploy [analytics/refinery@4a2c530]: fix for camus wrapper, deploy to an-launcher1001 only (duration: 00m 03s) [production]
14:50 <otto@deploy1001> Started deploy [analytics/refinery@4a2c530]: fix for camus wrapper, deploy to an-launcher1001 only [production]
14:05 <akosiaris> T243106 undo experiment with DROP iptables rules this time around. Use mw1331, mw1348 [production]
13:22 <vgutierrez> rolling restart of ats-tls on eqiad, codfw, ulsfo and eqsin - T249335 [production]
13:20 <akosiaris> T243106 redo experiment with DROP iptables rules this time around. Use mw1331, mw1348 [production]
13:16 <akosiaris> T243106 undo experiment with REJECT, DROP iptables rules now that we have envoy in the middle. Use mw1331, mw1348. Experiment done successfully, no issues for the infrastructure. [production]
12:49 <akosiaris> T243106 redo experiment with REJECT, DROP iptables rules now that we have envoy in the middle. Use mw1331, mw1348 [production]
12:49 <akosiaris> T243106 redo experiment with REJECT, DROP iptables rules now that we have envoy in the middle [production]
11:49 <hnowlan> restarting cassandra on restbase2009 for java updates [production]
11:28 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
11:25 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:08 <akosiaris> repool eqiad eventgate-analytics. Test concluded [production]
11:08 <akosiaris@cumin1001> conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=eventgate-analytics [production]
09:54 <mutante> disabling puppet on puppetmasters temporarily to switch them carefully to the httpd module and away from the apache module, which we want to get rid of [production]
09:52 <akosiaris> depool eqiad eventgate-analytics for a test involving reinitializing the eqiad kubernetes cluster [production]
09:52 <akosiaris@cumin1001> conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=eventgate-analytics [production]
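The depool/repool pairs above are conftool discovery edits. A hedged sketch of the equivalent `confctl` invocations, based on the selectors recorded in the log; exact flag spelling may differ by conftool version:

```shell
# Depool the eqiad backend of the eventgate-analytics discovery record
# for the duration of the test, then repool it afterwards.
confctl --object-type discovery select 'dnsdisc=eventgate-analytics,name=eqiad' set/pooled=false
confctl --object-type discovery select 'dnsdisc=eventgate-analytics,name=eqiad' set/pooled=true
```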
09:51 <akosiaris@cumin1001> conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=eventgate-analytics [production]
09:45 <oblivian@puppetmaster1001> conftool action : set/ttl=10; selector: dnsdisc=eventgate-analytics.* [production]
08:20 <vgutierrez> rolling restart of ats-tls on esams - T249335 [production]
07:19 <vgutierrez> ats-tls restart on cp3050 and cp3052 (max_connections_active_in experiment) - T249335 [production]
07:07 <mutante> phabricator rmdir /var/run/phd/pid - empty and now unused [production]
07:01 <moritzm> installing php5 security updates [production]
05:27 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
05:24 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
05:10 <marostegui> Upgrade pc1010 [production]
00:30 <brennen@deploy1001> rebuilt and synchronized wikiversions files: Revert all wikis except test to 1.35.0-wmf.30 for T252179 [production]
00:19 <brennen> rolling 1.35.0-wmf.31 train back to group0 for T252179 [production]