2021-03-04 §
11:25 <arturo> rebooted tools-sgewebgrid-generic-0901, repooled it [tools]
09:57 <arturo> depooled tools-sgewebgrid-generic-0901 to reboot the VM. It was stuck in MIGRATING state while draining cloudvirt1022 [tools]
2021-03-03 §
15:17 <arturo> shutting down tools-sgebastion-07 in an attempt to fix nova state and finish hypervisor migration [tools]
15:11 <arturo> tools-sgebastion-07 triggered a neutron exception (unauthorized) while being live-migrated from cloudvirt1021 to 1029. Resetting nova state with `nova reset-state bd685d48-1011-404e-a755-372f6022f345 --active` and trying again [tools]
14:48 <arturo> killed pywikibot instance running in tools-sgebastion-07 by user msyn [tools]
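The 2021-03-03 entries above describe recovering a VM stuck mid-migration: reset the nova task state, then retry the live migration. A minimal dry-run sketch of that sequence (UUID and hostname taken from the log; `run` is a hypothetical guard added here so nothing is executed for real):

```shell
# Dry-run wrapper: print the command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Clear the stuck task state, then retry the migration to the target host.
run nova reset-state bd685d48-1011-404e-a755-372f6022f345 --active
run nova live-migration bd685d48-1011-404e-a755-372f6022f345 cloudvirt1029
```

When the instance still will not migrate (as happened here), the fallback in the log was a clean shutdown so the scheduler could move it cold.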
2021-03-02 §
15:23 <bstorm> depooling tools-sgewebgrid-lighttpd-0914.tools.eqiad.wmflabs for reboot. It isn't communicating right [tools]
15:22 <bstorm> cleared queue error states...will need to keep a better eye on what's causing those [tools]
2021-02-27 §
02:23 <bstorm> deployed typo fix to maintain-kubeusers in an innocent effort to make the weekend better T275910 [tools]
02:00 <bstorm> running a script to repair the dumps mount in all podpresets T275371 [tools]
2021-02-26 §
22:04 <bstorm> cleaned up grid jobs 1230666,1908277,1908299,2441500,2441513 [tools]
21:27 <bstorm> hard rebooting tools-sgeexec-0947 [tools]
21:21 <bstorm> hard rebooting tools-sgeexec-0952.tools.eqiad.wmflabs [tools]
20:01 <bd808> Deleted csr in strange state for tool-ores-inspect [tools]
2021-02-24 §
18:30 <bd808> `sudo wmcs-openstack role remove --user zfilipin --project tools user` T267313 [tools]
01:04 <bstorm> hard rebooting tools-k8s-worker-76 because it's in a sorry state [tools]
2021-02-23 §
23:11 <bstorm> draining a bunch of k8s workers to clean up after dumps changes T272397 [tools]
23:06 <bstorm> draining tools-k8s-worker-55 to clean up after dumps changes T272397 [tools]
2021-02-22 §
20:40 <bstorm> repooled tools-sgeexec-0918.tools.eqiad.wmflabs [tools]
19:09 <bstorm> hard rebooted tools-sgeexec-0918 from openstack T275411 [tools]
19:07 <bstorm> shutting down tools-sgeexec-0918 with the VM's command line (not libvirt directly yet) T275411 [tools]
19:05 <bstorm> shutting down tools-sgeexec-0918 (with openstack to see what happens) T275411 [tools]
19:03 <bstorm> depooled tools-sgeexec-0918 T275411 [tools]
18:56 <bstorm> deleted job 1962508 from the grid to clear it up T275301 [tools]
16:58 <bstorm> cleared error state on several grid queues [tools]
2021-02-19 §
12:31 <arturo> deploying new version of toolforge ingress admission controller [tools]
2021-02-17 §
21:26 <bstorm> deleted tools-puppetdb-01 since it is unused at this time (and undersized anyway) [tools]
2021-02-04 §
16:27 <bstorm> rebooting tools-package-builder-02 [tools]
2021-01-26 §
16:27 <bd808> Hard reboot of tools-sgeexec-0906 via Horizon for T272978 [tools]
2021-01-22 §
09:59 <dcaro> added the record redis.svc.tools.eqiad1.wikimedia.cloud pointing to tools-redis1003 (T272679) [tools]
2021-01-21 §
23:58 <bstorm> deployed new maintain-kubeusers to tools T271847 [tools]
2021-01-19 §
22:57 <bstorm> truncated 75GB error log /data/project/robokobot/virgule.err T272247 [tools]
22:48 <bstorm> truncated 100GB error log /data/project/magnus-toolserver/error.log T272247 [tools]
22:43 <bstorm> truncated 107GB log '/data/project/meetbot/logs/messages.log' T272247 [tools]
22:34 <bstorm> truncating 194 GB error log '/data/project/mix-n-match/mnm-microsync.err' T272247 [tools]
16:37 <bd808> Added Jhernandez to root sudoers group [tools]
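The 2021-01-19 entries truncate huge logs rather than delete them: truncation keeps the inode that running writers still hold open, so the disk space is freed immediately and the writing process keeps appending at offset 0, whereas `rm` would leave the space held until the writer closed its file descriptor. A small self-contained demonstration (temp file stands in for the real NFS paths):

```shell
# Simulate a bloated log file, then truncate it in place.
log=$(mktemp)
printf 'x%.0s' $(seq 1 4096) > "$log"   # write 4096 bytes
truncate -s 0 "$log"                    # free the space, keep the inode
wc -c < "$log"                          # → 0
rm -f "$log"
```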
2021-01-14 §
20:56 <bstorm> setting bastions to have mostly-uncapped egress network and 40MBps nfs_read for better shared use [tools]
20:43 <bstorm> running tc-setup across the k8s workers [tools]
20:40 <bstorm> running tc-setup across the grid fleet [tools]
17:58 <bstorm> hard rebooting tools-sgecron-01 following network issues during upgrade to stein T261134 [tools]
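The `tc-setup` runs above apply traffic-control egress shaping per host. A hedged sketch of the kind of rule involved, using a token-bucket filter; the device name and rates here are illustrative, not the real Toolforge values, and the command is only printed:

```shell
# Dry-run wrapper: show the tc invocation rather than changing qdiscs.
run() { echo "would run: $*"; }

# Cap overall egress with a token-bucket filter on the root qdisc.
run tc qdisc add dev eth0 root tbf rate 320mbit burst 64kb latency 400ms
```

The log's distinction between "mostly-uncapped egress" on bastions and a 40MBps NFS read cap suggests separate classes per traffic type rather than a single root cap like this one.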
2021-01-13 §
10:02 <arturo> delete floating IP allocation 185.15.56.245 (T271867) [tools]
2021-01-12 §
18:16 <bstorm> deleted wedged CSR tool-adhs-wde to get maintain-kubeusers working again T271842 [tools]
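Both this entry and the 2021-02-26 one clear a wedged Kubernetes CertificateSigningRequest so maintain-kubeusers can recreate it. A dry-run sketch of the cleanup (CSR name from the log):

```shell
# Dry-run wrapper: print kubectl commands instead of running them.
run() { echo "would run: $*"; }

run kubectl get csr tool-adhs-wde -o wide   # inspect the stuck request
run kubectl delete csr tool-adhs-wde        # remove it; the operator re-issues
```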
2021-01-05 §
18:49 <bstorm> changing the limits on k8s etcd nodes again, so disabling puppet on them T267966 [tools]
2021-01-04 §
18:21 <bstorm> ran 'sudo systemctl stop getty@ttyS1.service && sudo systemctl disable getty@ttyS1.service' on tools-k8s-etcd-5. I have no idea why that keeps coming back. [tools]
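A unit that "keeps coming back" after `disable` is often being pulled back in by another unit or a systemd generator; `mask` is the stronger option, linking the unit to /dev/null so nothing can start it. A dry-run sketch of that alternative (not what the log actually did):

```shell
# Dry-run wrapper: print the systemctl commands instead of running them.
run() { echo "would run: $*"; }

run systemctl stop getty@ttyS1.service
run systemctl mask getty@ttyS1.service   # stronger than disable: blocks all activation
```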
2020-12-22 §
18:22 <bstorm> rebooting the grid master because it is misbehaving following the NFS outage [tools]
10:53 <arturo> rebase & resolve ugly git merge conflict in labs/private.git [tools]
2020-12-18 §
18:37 <bstorm> set profile::wmcs::kubeadm::etcd_latency_ms: 15 T267966 [tools]
2020-12-17 §
21:42 <bstorm> doing the same procedure to increase the timeouts more T267966 [tools]
19:56 <bstorm> puppet enabled one at a time, letting things catch up. Timeouts are now adjusted to something closer to fsync values T267966 [tools]
19:44 <bstorm> set etcd timeouts seed value to 20 instead of the default 10 (profile::wmcs::kubeadm::etcd_latency_ms) T267966 [tools]
18:58 <bstorm> disabling puppet on k8s-etcd servers to alter the timeouts T267966 [tools]
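The `profile::wmcs::kubeadm::etcd_latency_ms` hiera key above presumably feeds etcd's two latency-sensitive timers. In raw etcd terms those are the `--heartbeat-interval` and `--election-timeout` flags (milliseconds), which upstream recommends raising together on slow disks, keeping the election timeout around 10x the heartbeat. A dry-run sketch with illustrative values:

```shell
# Dry-run wrapper: print the etcd invocation rather than starting a server.
run() { echo "would run: $*"; }

# Defaults are 100/1000 ms; raised here for slow fsync, keeping the 10x ratio.
run etcd --heartbeat-interval 300 --election-timeout 3000
```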