admin SAL

2021-03-03 §
17:16	<andrewbogott>	restarting rabbitmq-server on cloudcontrol1003,1004,1005; trying to explain amqp errors in scheduler logs	[admin]
16:03	<dcaro>	draining cloudvirt1022 for T275753	[admin]
16:00	<arturo>	move cloudvirt1013 into the 'toobusy' host aggregate, it has 221% cpu subscription and 82% MEM subscription	[admin]
15:34	<arturo>	rebooting cloudvirt1021 for T275753	[admin]
14:31	<arturo>	draining cloudvirt1021 for T275753	[admin]
13:59	<arturo>	rebooting cloudvirt1018 for T275753	[admin]
13:28	<arturo>	draining cloudvirt1018 for T275753	[admin]
12:49	<arturo>	rebooting cloudvirt1017 for T275753	[admin]
12:22	<arturo>	draining cloudvirt1017 for T275753	[admin]
12:20	<arturo>	rebooting cloudvirt1016 for T275753	[admin]
12:01	<arturo>	draining cloudvirt1016 for T275753	[admin]
11:59	<arturo>	cloudvirt1014 now in the ceph host aggregate	[admin]
11:58	<arturo>	rebooting cloudvirt1014 for T275753	[admin]
11:50	<arturo>	moved cloudvirt1023 out of the maintenance host aggregate, leaving it in the ceph aggregate (it was in both)	[admin]
11:47	<arturo>	moved cloudvirt1014 to the 'maintenance' host aggregate, drain it for T275753	[admin]
10:01	<arturo>	icinga-downtime cloudnet1003 for 14 days bc potential alerting storm due to firmware issues (T271058)	[admin]
10:00	<arturo>	rebooting again cloudnet1003 (no network failover) (T271058)	[admin]
09:58	<arturo>	update firmware-bnx2x from 20190114-2 to 20200918-1~bpo10+1 on cloudnet1003 (T271058)	[admin]
09:30	<arturo>	installing linux kernel 5.10.13-1~bpo10+1 in cloudnet1003 and rebooting it (network failover) (T271058)	[admin]
2021-03-02 §
17:16	<andrewbogott>	rebooting cloudvirt1039 to see if I can trigger T276208	[admin]
16:10	<arturo>	[codfw1dev] restart nova-compute on cloudvirt2002-dev	[admin]
11:59	<arturo>	moved cloudvirt1012 to 'maintenance' host aggregate. Drain it with `wmcs-drain-hypervisor` to reboot it for T275753	[admin]
11:59	<arturo>	cloudvirt1023 is affected by T276208 and cannot be rebooted. Put it back into the ceph host aggregate	[admin]
10:43	<arturo>	moved cloudvirt1013 cloudvirt1032 cloudvirt1037 back into the 'ceph' host aggregate	[admin]
10:13	<arturo>	moved cloudvirt1023 to 'maintenance' host aggregate. Drain it with `wmcs-drain-hypervisor` to reboot it for T275753	[admin]
2021-03-01 §
20:12	<andrewbogott>	removing novaadmin from all projects save 'admin' for T274385	[admin]
19:51	<andrewbogott>	removing novaobserver from all projects save 'observer' for T274385	[admin]
19:50	<andrewbogott>	adding inherited domain-wide roles to novaadmin and novaobserver as per T274385	[admin]
2021-02-28 §
04:54	<andrewbogott>	restarted redis-server on tools-redis-1003 and tools-redis-1004 in an attempt to reduce replag, no real change detected	[admin]
2021-02-27 §
00:33	<andrewbogott>	sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && reboot'	[admin]
00:28	<andrewbogott>	sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && echo reboot'	[admin]
00:09	<andrewbogott>	sudo cumin "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i stretch && uname -r | grep -v 4.19.0-0.bpo.14-amd64 && reboot'	[admin]
2021-02-26 §
14:58	<dcaro>	[eqiad] rebooting cloudcephosd1015 (last osd \o/) for kernel upgrade (T275753)	[admin]
14:51	<dcaro>	[eqiad] rebooting cloudcephosd1014 for kernel upgrade (T275753)	[admin]
14:44	<dcaro>	[eqiad] rebooting cloudcephosd1013 for kernel upgrade (T275753)	[admin]
14:38	<dcaro>	[eqiad] rebooting cloudcephosd1012 for kernel upgrade (T275753)	[admin]
14:31	<dcaro>	[eqiad] rebooting cloudcephosd1011 for kernel upgrade (T275753)	[admin]
14:25	<dcaro>	[eqiad] rebooting cloudcephosd1010 for kernel upgrade (T275753)	[admin]
14:17	<dcaro>	[eqiad] rebooting cloudcephosd1009 for kernel upgrade (T275753)	[admin]
13:54	<dcaro>	[eqiad] downtimed alert1001 Ceph OSDs down alert until 18:00 GMT+1 as that is not under the host being rebooted (T275753)	[admin]
13:51	<dcaro>	[eqiad] rebooting cloudcephosd1008 for kernel upgrade (T275753)	[admin]
13:45	<dcaro>	[eqiad] rebooting cloudcephosd1007 for kernel upgrade (T275753)	[admin]
13:38	<dcaro>	[eqiad] rebooting cloudcephosd1006 for kernel upgrade (T275753)	[admin]
12:07	<dcaro>	[eqiad] rebooting cloudcephosd1005 for kernel upgrade (T275753)	[admin]
12:00	<arturo>	rebooting cloudcontrol1003 for kernel upgrade (T275753)	[admin]
11:42	<arturo>	rebooting cloudcontrol1004 for kernel upgrade (T275753)	[admin]
11:41	<dcaro>	[eqiad] rebooting cloudcephosd1004 for kernel upgrade (T275753)	[admin]
11:32	<dcaro>	[eqiad] rebooting cloudcephosd1003 for kernel upgrade (T275753)	[admin]
11:30	<arturo>	rebooting cloudcontrol1005 for kernel upgrade (T2	[admin]