2018-01-05 §
23:49 <madhuvishy> Run clush -w @k8s-worker -x tools-worker-1001.tools.eqiad.wmflabs 'sudo service docker restart; sudo service flannel restart; sudo service kubelet restart; sudo service kube-proxy restart' on tools-clushmaster-01 [tools]
16:22 <andrewbogott> moving tools-worker-1027 to labvirt1015 (CPU balancing) [tools]
16:01 <andrewbogott> moving tools-worker-1017 to labvirt1017 (CPU balancing) [tools]
15:32 <andrewbogott> moving tools-exec-1420.tools.eqiad.wmflabs to labvirt1015 (CPU balancing) [tools]
15:18 <andrewbogott> moving tools-exec-1411.tools.eqiad.wmflabs to labvirt1017 (CPU balancing) [tools]
15:02 <andrewbogott> moving tools-exec-1440.tools.eqiad.wmflabs to labvirt1017 (CPU balancing) [tools]
14:47 <andrewbogott> moving tools-webgrid-lighttpd-1421.tools.eqiad.wmflabs to labvirt1017 (CPU balancing) [tools]
14:25 <andrewbogott> moving tools-webgrid-lighttpd-1420.tools.eqiad.wmflabs to labvirt1015 (CPU balancing) [tools]
14:05 <andrewbogott> moving tools-webgrid-lighttpd-1417.tools.eqiad.wmflabs to labvirt1015 (CPU balancing) [tools]
13:46 <andrewbogott> moving tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs to labvirt1017 (CPU balancing) [tools]
05:33 <andrewbogott> migrating tools-worker-1012 to labvirt1017 (CPU load balancing) [tools]
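For reference: instance moves like the ones above are typically driven with OpenStack's nova client; a minimal sketch, assuming novaclient CLI tooling rather than whatever internal script was actually used:
    $ nova live-migration tools-worker-1012 labvirt1017          # push the VM to the target hypervisor
    $ nova show tools-worker-1012 | grep hypervisor_hostname     # confirm it landed on labvirt1017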
2018-01-04 §
17:24 <andrewbogott> rebooting tools-paws-worker-1019 to verify repair of T184018 [tools]
2018-01-03 §
15:38 <bd808> Forced Puppet run on tools-services-01 [tools]
11:29 <arturo> deploy https://gerrit.wikimedia.org/r/#/c/401716/ and https://gerrit.wikimedia.org/r/394101 using clush [tools]
2017-12-31 §
02:00 <bd808> Killed some pwb.py and qacct processes running on tools-bastion-03 [tools]
2017-12-21 §
17:57 <bd808> PAWS: deleted hub-deployment pod stuck in CrashLoopBackOff [tools]
17:30 <bd808> PAWS: deleting hub-deployment pod. Lots of "Connection pool is full" warnings in pod logs [tools]
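For reference: deleting a pod stuck in CrashLoopBackOff so its Deployment recreates it is plain kubectl; a minimal sketch, reusing the pod name and prod namespace from the 2017-12-18 entries below as placeholders:
    $ kubectl -n prod get pods | grep hub-deployment               # find the crashing pod
    $ kubectl -n prod logs hub-deployment-1381799904-b5g5j         # check the "Connection pool is full" warnings
    $ kubectl -n prod delete pod hub-deployment-1381799904-b5g5j   # the Deployment spawns a fresh replacement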
2017-12-19 §
21:27 <chasemp> reboot tools-paws-master-01 [tools]
18:38 <andrewbogott> rebooting tools-paws-master-01 [tools]
05:07 <andrewbogott> "service gridengine-master restart" on tools-grid-master [tools]
2017-12-18 §
12:04 <arturo> it seems jupyterhub tries to use a database which doesn't exist: [E 2017-12-18 11:59:49.896 JupyterHub app:904] Failed to connect to db: sqlite:///jupyterhub.sqlite [tools]
11:58 <arturo> The restart didn't work. I could see a lot of log lines in the hub-deployment pod with something like: 2017-12-17 04:08:17,574 WARNING Connection pool is full, discarding connection: 10.96.0.1 [tools]
11:51 <arturo> the restart was with: kubectl get pod -o yaml hub-deployment-1381799904-b5g5j -n prod | kubectl replace --force -f - [tools]
11:50 <arturo> restart pod hub-deployment in paws to try to fix the 502 [tools]
2017-12-15 §
13:55 <arturo> same in tools-checker-02.tools.eqiad.wmflabs [tools]
13:54 <arturo> same in tools-exec-1415.tools.eqiad.wmflabs [tools]
13:52 <arturo> running 'sudo puppet agent -t -v' in tools-webgrid-lighttpd-1416.tools.eqiad.wmflabs since it didn't update in the last run with clush [tools]
2017-12-14 §
16:58 <arturo> running clush -w @all 'sudo puppet agent --test' from tools-clushmaster-01.eqiad.wmflabs due to https://gerrit.wikimedia.org/r/#/c/394572/ being merged [tools]
2017-12-13 §
17:37 <andrewbogott> upgrading puppet packages on all VMs [tools]
00:59 <madhuvishy> Cordon and Drain tools-worker-1016 [tools]
00:47 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1018-1023, 1025-1027 [tools]
00:34 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1011, 1013-1015, 1017 [tools]
00:28 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1006-1010 [tools]
00:11 <madhuvishy> Drain + Cordon, Reboot, Uncordon tools-workers-1002-1005 [tools]
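For reference: the Drain + Cordon, Reboot, Uncordon cycle above maps onto the standard kubectl node-maintenance sequence; a minimal sketch for a single worker (flags assumed, not taken from the log):
    $ kubectl cordon tools-worker-1002                              # stop new pods being scheduled on the node
    $ kubectl drain tools-worker-1002 --ignore-daemonsets --force   # evict running pods
    $ ssh tools-worker-1002 'sudo reboot'                           # reboot the worker
    $ kubectl uncordon tools-worker-1002                            # re-enable scheduling once it is back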
2017-12-12 §
23:29 <madhuvishy> rebooting tools-worker-1012 [tools]
18:50 <andrewbogott> rebooting tools-worker-1001 [tools]
2017-12-11 §
19:32 <bd808> git gc on tools-static-11; --aggressive was killed by system (T182604) [tools]
18:07 <andrewbogott> upgrading tools puppetmaster to v4 [tools]
17:07 <bd808> git gc --aggressive on tools-static-11 (T182604) [tools]
2017-12-01 §
15:33 <chasemp> put the weird mess of untracked files on the tools puppetmaster into git stash to see what breaks, as they should not be there [tools]
15:30 <chasemp> prometheus nfs collector on tools-bastion-03 [tools]
2017-11-30 §
23:23 <bd808> Hard reboot of tools-bastion-03 via Horizon [tools]
23:06 <chasemp> rebooting login.tools.wmflabs.org due to overload [tools]
2017-11-20 §
20:34 <chasemp> backed up crontabs: tools-cron-01:/var/spool/cron# cp -Rp crontabs/ /root/20112017/ [tools]
00:52 <andrewbogott> cherry-picking https://gerrit.wikimedia.org/r/#/c/392172/ onto the tools puppetmaster [tools]
2017-11-17 §
21:33 <valhallasw`cloud> also chmod g-w'ed those files, and sent emails to all the affected users [tools]
21:17 <valhallasw`cloud> chmod o-w'ed a bunch of files reported by Dispenser; writing emails to the owners about this [tools]
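For reference: finding and fixing world-writable files can be done with find + chmod; a minimal sketch with an illustrative path (the real list came from Dispenser's report, not this command):
    $ find /data/project/exampletool -type f -perm -0002          # list world-writable files (path is illustrative)
    $ chmod o-w,g-w /data/project/exampletool/some-file           # revoke the world and group write bits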
2017-11-16 §
17:40 <chasemp> tools-clushmaster-01:~$ clush -w @all 'sudo puppet agent --enable && sudo puppet agent --test && sudo unattended-upgrades -d' [tools]
16:50 <bd808> Force upgraded nginx on tools-elastic-* [tools]
16:37 <chasemp> reboot tools-checker-01 [tools]