2018-01-05 §
13:46 <andrewbogott> moving tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs to labvirt1017 (CPU balancing) [tools]
05:33 <andrewbogott> migrating tools-worker-1012 to labvirt1017 (CPU load balancing) [tools]
2018-01-04 §
17:24 <andrewbogott> rebooting tools-paws-worker-1019 to verify repair of T184018 [tools]
2018-01-03 §
15:38 <bd808> Forced Puppet run on tools-services-01 [tools]
11:29 <arturo> deploy https://gerrit.wikimedia.org/r/#/c/401716/ and https://gerrit.wikimedia.org/r/394101 using clush [tools]
2017-12-31 §
02:00 <bd808> Killed some pwb.py and qacct processes running on tools-bastion-03 [tools]
2017-12-21 §
17:57 <bd808> PAWS: deleted hub-deployment pod stuck in CrashLoopBackOff [tools]
17:30 <bd808> PAWS: deleting hub-deployment pod. Lots of "Connection pool is full" warnings in pod logs [tools]
2017-12-19 §
21:27 <chasemp> reboot tools-paws-master-01 [tools]
18:38 <andrewbogott> rebooting tools-paws-master-01 [tools]
05:07 <andrewbogott> "service gridengine-master restart" on tools-grid-master [tools]
2017-12-18 §
12:04 <arturo> it seems jupyterhub tries to use a database that doesn't exist: [E 2017-12-18 11:59:49.896 JupyterHub app:904] Failed to connect to db: sqlite:///jupyterhub.sqlite [tools]
11:58 <arturo> The restart didn't work. I could see a lot of log lines in the hub-deployment pod with something like: 2017-12-17 04:08:17,574 WARNING Connection pool is full, discarding connection: [tools]
11:51 <arturo> the restart was with: kubectl get pod -o yaml hub-deployment-1381799904-b5g5j -n prod | kubectl replace --force -f - [tools]
11:50 <arturo> restart pod hub-deployment in paws to try to fix the 502 [tools]
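The restart logged above pipes the pod's live manifest back through `kubectl replace --force`, which deletes the object and recreates it from the same spec. A minimal sketch of that pipeline, with `kubectl` stubbed by a shell function so the flow can be traced without a cluster (the stub and its output are illustrative; the pod name is taken from the log):

```shell
# Stub kubectl so the pipeline runs anywhere: it reports the call it
# received, and for `get` also emits a minimal manifest to feed the pipe.
kubectl() {
    echo "ran: kubectl $*"
    if [ "$1" = "get" ]; then
        printf 'apiVersion: v1\nkind: Pod\n'
    fi
}

# `replace --force` deletes the pod and recreates it from the manifest
# piped in from `get -o yaml`, i.e. a restart with an unchanged spec.
kubectl get pod -o yaml hub-deployment-1381799904-b5g5j -n prod |
    kubectl replace --force -f -
```

On a real cluster the stub is simply omitted; the pipeline is otherwise the command recorded in the log.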
2017-12-15 §
13:55 <arturo> same in tools-checker-02.tools.eqiad.wmflabs [tools]
13:54 <arturo> same in tools-exec-1415.tools.eqiad.wmflabs [tools]
13:52 <arturo> running 'sudo puppet agent -t -v' in tools-webgrid-lighttpd-1416.tools.eqiad.wmflabs since it didn't update in the last run with clush [tools]
2017-12-14 §
16:58 <arturo> running clush -w @all 'sudo puppet agent --test' from tools-clushmaster-01.eqiad.wmflabs due to https://gerrit.wikimedia.org/r/#/c/394572/ being merged [tools]
2017-12-13 §
17:37 <andrewbogott> upgrading puppet packages on all VMs [tools]
00:59 <madhuvishy> Cordon and Drain tools-worker-1016 [tools]
00:47 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1018-1023, 1025-1027 [tools]
00:34 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1011, 1013-1015, 1017 [tools]
00:28 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1006-1010 [tools]
00:11 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1002-1005 [tools]
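Each batch above applies the same per-node pattern: cordon and drain, reboot, uncordon. A side-effect-free sketch of that loop, assuming consecutively numbered workers (the commands are echoed rather than executed, and the exact drain flags are an assumption — the log does not record them):

```shell
# Print the cordon/drain/reboot/uncordon sequence for a numeric range of
# tools-worker nodes. Commands are echoed, not run, so this is safe to try.
rolling_reboot() {
    for n in $(seq "$1" "$2"); do
        node="tools-worker-$n"
        echo "kubectl cordon $node"
        echo "kubectl drain --ignore-daemonsets --delete-local-data $node"
        echo "ssh $node sudo reboot"
        echo "kubectl uncordon $node"
    done
}

rolling_reboot 1002 1005   # the 2017-12-13 00:11 batch
```

Cordoning first keeps the scheduler from placing new pods on a node that is about to go down; uncordoning only after the reboot returns it to the pool.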
2017-12-12 §
23:29 <madhuvishy> rebooting tools-worker-1012 [tools]
18:50 <andrewbogott> rebooting tools-worker-1001 [tools]
2017-12-11 §
19:32 <bd808> git gc on tools-static-11; --aggressive was killed by system (T182604) [tools]
18:07 <andrewbogott> upgrading tools puppetmaster to v4 [tools]
17:07 <bd808> git gc --aggressive on tools-static-11 (T182604) [tools]
2017-12-01 §
15:33 <chasemp> put the weird mess of untracked files on the tools puppetmaster into git stash to see what breaks, as they should not be there [tools]
15:30 <chasemp> prometheus nfs collector on tools-bastion-03 [tools]
2017-11-30 §
23:23 <bd808> Hard reboot of tools-bastion-03 via Horizon [tools]
23:06 <chasemp> rebooting login.tools.wmflabs.org due to overload [tools]
2017-11-20 §
20:34 <chasemp> backup crons tools-cron-01:/var/spool/cron# cp -Rp crontabs/ /root/20112017/ [tools]
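The crontab backup above leans on `cp -Rp`: `-R` recurses into the spool directory and `-p` preserves mode, ownership and timestamps, which matters because cron refuses crontabs with loose permissions. A sketch on a scratch directory (the paths and the crontab entry are made up):

```shell
# Make a scratch "crontabs" spool with one hypothetical crontab in it.
work=$(mktemp -d)
mkdir "$work/crontabs"
printf '0 2 * * * /usr/bin/some-job\n' > "$work/crontabs/tools.example"
chmod 600 "$work/crontabs/tools.example"   # cron wants tight modes

# Back it up as in the log: -R recurses, -p keeps mode/owner/mtime.
cp -Rp "$work/crontabs" "$work/20112017"

ls -l "$work/20112017"
```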
00:52 <andrewbogott> cherry-picking https://gerrit.wikimedia.org/r/#/c/392172/ onto the tools puppetmaster [tools]
2017-11-17 §
21:33 <valhallasw`cloud> also chmod g-w'ed those files, and sent emails to all the affected users [tools]
21:17 <valhallasw`cloud> chmod o-w'ed a bunch of files reported by Dispenser; writing emails to the owners about this [tools]
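The `chmod o-w` pass above strips the other-write bit so arbitrary users can no longer modify the reported files. A sketch that finds and fixes world-writable files in a scratch directory (`-perm -002` matches files whose other-write bit is set; the filenames are made up):

```shell
# Create one world-writable and one sane file in a scratch directory.
d=$(mktemp -d)
touch "$d/open.txt" "$d/safe.txt"
chmod 666 "$d/open.txt"   # world-writable: anyone can modify it
chmod 644 "$d/safe.txt"

# Find files with the other-write bit (0002) set and remove it.
find "$d" -type f -perm -002 -exec chmod o-w {} +

ls -l "$d"
```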
2017-11-16 §
17:40 <chasemp> tools-clushmaster-01:~$ clush -w @all 'sudo puppet agent --enable && sudo puppet agent --test && sudo unattended-upgrades -d' [tools]
16:50 <bd808> Force upgraded nginx on tools-elastic-* [tools]
16:37 <chasemp> reboot tools-checker-01 [tools]
15:17 <chasemp> disable puppet [tools]
2017-11-15 §
22:48 <madhuvishy> Rebooted tools-paws-worker-1017 [tools]
15:53 <chasemp> reboot bastion-03 [tools]
15:48 <chasemp> kill tools.powow on bastion-03 for hammering IO and making bastion unusable [tools]
2017-11-07 §
01:21 <bd808> Removed all non-directory files from /home (via labstore1004 direct access) [tools]
2017-11-06 §
18:30 <bd808> Load on tools-bastion-03 down to 0.72 from 17.47 after killing a bunch of local processes that should have been running on the job grid instead [tools]
2017-11-05 §
23:48 <bd808> Cleaned up 2 huge /tmp files left by tools.croptool (~6.5G) [tools]
23:44 <bd808> Cleaned up 109 files owned by tools.rezabot on tools-webgrid-lighttpd-1428 with `sudo find /tmp -user tools.rezabot -exec rm {} \+` [tools]
23:37 <bd808> Cleaned up 955 files owned by tools.wsexport on tools-webgrid-lighttpd-1428 with `sudo find /tmp -user tools.wsexport -exec rm {} \+` [tools]
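The cleanups above use `find /tmp -user <tool> -exec rm {} \+`, where the `+` terminator batches many matched paths into few `rm` invocations instead of forking one per file as `\;` would. A sketch on a scratch directory, with a `-name` filter standing in for `-user` since a test environment has only one user (the filenames are made up):

```shell
# Scatter some files: two belonging to the "tool" and one bystander.
d=$(mktemp -d)
touch "$d/wsexport.0001" "$d/wsexport.0002" "$d/keep.me"

# -exec ... {} + appends as many paths as fit into each rm call.
find "$d" -type f -name 'wsexport.*' -exec rm {} +

ls "$d"
```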