2020-05-11 §
12:02 <jynus@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
11:59 <jynus@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:47 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:45 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
11:45 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:43 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:43 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:42 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:32 <Lucas_WMDE> EU SWAT done [production]
11:30 <lucaswerkmeister-wmde@deploy1001> Synchronized php-1.35.0-wmf.30/extensions/WikimediaEvents/: SWAT: [[gerrit:594693|Update Banner Interaction Schema (T250791, wmf.30)]] (duration: 01m 08s) [production]
11:23 <lucaswerkmeister-wmde@deploy1001> Synchronized php-1.35.0-wmf.31/extensions/WikimediaEvents/: SWAT: [[gerrit:594694|Update Banner Interaction Schema (T250791, wmf.31)]] (duration: 01m 07s) [production]
11:14 <kartik@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:595478|Revert limit adjustment for Chinese translation with ContentTranslation (T252371)]] (duration: 01m 09s) [production]
10:58 <jdrewniak@deploy1001> Synchronized portals: Wikimedia Portals Update: [[gerrit:595498| Bumping portals to master (595498)]] (duration: 01m 06s) [production]
10:56 <jdrewniak@deploy1001> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:595498| Bumping portals to master (595498)]] (duration: 01m 07s) [production]
10:15 <vgutierrez> upload trafficserver 8.0.7-1wm3 to apt.wm.o (buster) - T242767 T249335 [production]
09:44 <mutante> contint2001 - find /var/lib/jenkins -user statsite -exec chown jenkins:jenkins {} \; [production]
09:31 <hashar> contint2001 started zuul-merger again (had permission issues in /var/lib/zuul ) [production]
09:07 <mutante> contint1001 - rsync -avpz --delete /srv/jenkins/ rsync://contint2001.wikimedia.org/ci--srv-/jenkins/ (T224591) [production]
09:05 <mutante> contint2001 - mkdir /srv/jenkins [production]
08:55 <hashar> contint2001 stopping zuul-merger, permission problem [production]
08:46 <godog> bounce ferm on kubernetes1007 to resolve icinga UNKNOWN [production]
08:40 <mutante> rsyncing /var/lib/jenkins from contint1001 to contint2001 with --delete [production]
08:32 <mutante> rsynced data from contint1001 to contint2001 - paths per T224591#6039192 for the migration later today [production]
08:30 <ema> cp3050: upgrade atskafka to 0.6 T237993 [production]
08:30 <_joe_> removing the iptables DROP rule on mc1020 T251378 [production]
07:54 <moritzm> installing squid security updates [production]
07:21 <moritzm> updated buster netboot images to 10.4 (updated to latest point release) [production]
07:08 <_joe_> dropping requests to mc1020 via a firewall rule T251378 [production]
06:04 <elukey> restart wikimedia-discovery-golden on stat1007 - apparently killed by no memory left to allocate on the system [production]
2020-05-10 §
12:18 <marostegui> Start event scheduler on db1115 after a massive delete - T252324 [production]
11:05 <marostegui> Stop event scheduler on db1115 to perform a massive delete - T252324 [production]
10:27 <dcausse> restarting blazegraph on wdqs1004: T242453 [production]
09:56 <marostegui> Change scaling_governor from powersave to performance on db1115 - T252324 [production]
09:25 <marostegui> Stop MySQL and restart db1115 - T252324 [production]
08:50 <marostegui> Restart mysql on db1115 to change buffer pool size from 20GB to 40GB T252324 [production]
08:44 <elukey> Power cycle analytics1052 after eno1 issue [production]
08:01 <marostegui> Disable unused events like %_schema T252324 T231185 [production]
07:11 <marostegui> Restart mysql on db1115 T231185 [production]
07:11 <marostegui> Truncate tendril.processlist_query_log T231185 [production]
2020-05-08 §
21:45 <bstorm_> cleaned up wb_terms_no_longer_updated view for testwikidatawiki and testcommonswiki on labsdb1010 T251598 [production]
21:45 <bstorm_> cleaned up wb_terms_no_longer_updated view on labsdb1012 T251598 [production]
21:33 <bstorm_> cleaning up wb_terms_no_longer_updated view on labsdb1009 T251598 [production]
21:06 <ottomata> running preferred replica election for kafka-jumbo to get preferred leaders back after reboot of broker earlier today - T252203 [production]
19:16 <jhuneidi@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
19:12 <jhuneidi@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
19:07 <jhuneidi@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . [production]
18:12 <andrewbogott> reprepro copy buster-wikimedia stretch-wikimedia prometheus-openstack-exporter for T252121 [production]
17:59 <marostegui> Extend /srv by 500G on labsdb1011 T249188 [production]
16:55 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
16:53 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]