__all__ SAL

2018-03-06
15:03	<chasemp>	rebooted tools-worker 1001-1008	[tools]
15:02	<hashar@tin>	Synchronized wmf-config/InitialiseSettings.php: Article counts: Change 'comma' method to 'any' - T188472 (duration: 01m 00s)	[production]
14:58	<arturo>	drain and reboot tools-worker-1010	[tools]
14:50	<vgutierrez>	update pybal to 1.15.0 on lvs1010	[production]
14:46	<hashar>	tin: /srv/mediawiki-staging/php-1.31.0-wmf.23 rebased on tip of https://gerrit.wikimedia.org/r/#/c/416686/ (which reverts a merge of the master branch)	[production]
14:42	<gehel>	rebooting maps1* (eqiad) for kernel security update completed	[production]
14:37	<chasemp>	@tools-bastion-03:~$ webservice restart --backend=kubernetes	[tools.replag]
14:36	<ottomata>	beginning migration of webrequest text varnishkafka logs from Kafka analytics to Kafka jumbo-eqiad T185136	[production]
14:27	<chasemp>	multiple tools running on k8s workers report issues reading replica.my.cnf file atm	[tools]
14:27	<chasemp>	reboot tools-worker-100[12]	[tools]
14:23	<chasemp>	downtime icinga alert for k8s workers ready	[tools]
14:21	<moritzm>	rebooting labweb* for kernel security update	[production]
14:13	<moritzm>	rebooting sca* for kernel security update	[production]
14:07	<gehel>	rebooting maps1* (eqiad) for kernel security update	[production]
14:07	<moritzm>	rebooting pybal-test for kernel security update	[production]
14:00	<_joe_>	SWAT is suspended for investigation on tin's git status	[production]
14:00	<moritzm>	rebooting oxygen for kernel security update	[production]
13:21	<arturo>	T188994 on some servers there was a race for the dpkg lock between apt-upgrade and puppet. Also, I forgot to use DEBIAN_FRONTEND=noninteractive, so debconf prompts appeared and stalled dpkg operations. Already solved, but some puppet alerts were produced	[tools]
13:16	<moritzm>	powercycling ms-be1038, stuck after reboot	[production]
13:10	<marostegui>	Deploy schema change on db1094 - T187089 T185128 T153182	[production]
13:09	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1094 for alter table (duration: 00m 58s)	[production]
12:58	<arturo>	T188994 upgrading packages in jessie nodes from the oldstable source	[tools]
12:55	<moritzm>	rebooting URL downloaders for kernel security update	[production]
12:51	<mobrovac@tin>	Finished deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052 (duration: 00m 34s)	[production]
12:50	<mobrovac@tin>	Started deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052	[production]
12:43	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Repool db1086 after alter table (duration: 00m 58s)	[production]
12:33	<moritzm>	rebooting mwlog* for kernel security update	[production]
12:04	<moritzm>	rebooting graphite hosts in eqiad for kernel security update	[production]
11:42	<arturo>	clush -w @all "sudo DEBIAN_FRONTEND=noninteractive apt-get autoclean" <-- free space in filesystem	[tools]
11:41	<arturo>	aborrero@tools-clushmaster-01:~$ clush -w @all "sudo DEBIAN_FRONTEND=noninteractive apt-get autoremove -y" <-- we did this on canary servers last week and it went fine, so now running it fleet-wide	[tools]
11:36	<arturo>	(ubuntu) removed linux-image-3.13.0-142-generic and linux-image-3.13.0-137-generic (T188911)	[tools]
11:33	<arturo>	removing unused kernel packages in ubuntu nodes	[tools]
11:29	<moritzm>	rebooting k8s masters for kernel security update	[production]
11:08	<arturo>	aborrero@tools-clushmaster-01:~$ clush -w @all "sudo rm /etc/apt/preferences.d/* ; sudo puppet agent -t -v" <--- rebuild directory, it contains stale files across all the cluster	[tools]
11:05	<elukey>	reboot analytics10[28,35,52] for kernel updates (one at a time, hadoop hdfs journal nodes)	[production]
10:46	<moritzm>	powercycling ms-be1021, stuck after reboot	[production]
10:45	<akosiaris@tin>	Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 01m 22s)	[production]
10:43	<moritzm>	rearming keyholder on naos after reboot	[production]
10:39	<akosiaris>	emergency add a captcha in metawiki contact pages like https://meta.wikimedia.org/wiki/Special:Contact/Stewards to stop bot abuse. phab Task to be filed later on	[production]
10:39	<godog>	reboot ms-be1013 to try to fix disk ordering	[production]
10:35	<moritzm>	rebooting naos for kernel security update	[production]
10:32	<moritzm>	rearming keyholder on tin after reboot	[production]
10:30	<gehel>	kafka poller active on all production wdqs nodes - T188252	[production]
10:28	<moritzm>	rebooting tin for kernel security update	[production]
10:20	<gehel>	reboot completed for maps2* and maps-test*	[production]
10:19	<elukey>	restart webrequest-load-wf-upload-2018-3-6-7 (failed due to reboots)	[analytics]
10:08	<elukey>	re-starting mysql consumers on eventlog1001	[analytics]
09:51	<moritzm>	rebooting graphite hosts in codfw for kernel security update	[production]
09:42	<marostegui>	Stop MySQL on db1107 for mariadb and kernel upgrade	[production]
09:41	<vgutierrez>	pybal_1.15.0_all.deb to apt.wikimedia.org jessie-wikimedia	[production]