2021-03-03 §
17:16 <andrewbogott> restarting rabbitmq-server on cloudcontrol1003,1004,1005; trying to explain amqp errors in scheduler logs [admin]
16:03 <dcaro> draining cloudvirt1022 for T275753 [admin]
16:00 <arturo> move cloudvirt1013 into the 'toobusy' host aggregate, it has 221% cpu subscription and 82% MEM subscription [admin]
15:34 <arturo> rebooting cloudvirt1021 for T275753 [admin]
14:31 <arturo> draining cloudvirt1021 for T275753 [admin]
13:59 <arturo> rebooting cloudvirt1018 for T275753 [admin]
13:28 <arturo> draining cloudvirt1018 for T275753 [admin]
12:49 <arturo> rebooting cloudvirt1017 for T275753 [admin]
12:22 <arturo> draining cloudvirt1017 for T275753 [admin]
12:20 <arturo> rebooting cloudvirt1016 for T275753 [admin]
12:01 <arturo> draining cloudvirt1016 for T275753 [admin]
11:59 <arturo> cloudvirt1014 now in the ceph host aggregate [admin]
11:58 <arturo> rebooting cloudvirt1014 for T275753 [admin]
11:50 <arturo> moved cloudvirt1023 out of the 'maintenance' host aggregate, leaving it in the 'ceph' aggregate (it was in both) [admin]
11:47 <arturo> moved cloudvirt1014 to the 'maintenance' host aggregate, drain it for T275753 [admin]
10:01 <arturo> icinga-downtime cloudnet1003 for 14 days bc potential alerting storm due to firmware issues (T271058) [admin]
10:00 <arturo> rebooting cloudnet1003 again (no network failover) (T271058) [admin]
09:58 <arturo> update firmware-bnx2x from 20190114-2 to 20200918-1~bpo10+1 on cloudnet1003 (T271058) [admin]
09:30 <arturo> installing linux kernel 5.10.13-1~bpo10+1 in cloudnet1003 and rebooting it (network failover) (T271058) [admin]
2021-03-02 §
17:16 <andrewbogott> rebooting cloudvirt1039 to see if I can trigger T276208 [admin]
16:10 <arturo> [codfw1dev] restart nova-compute on cloudvirt2002-dev [admin]
11:59 <arturo> moved cloudvirt1012 to 'maintenance' host aggregate. Drain it with `wmcs-drain-hypervisor` to reboot it for T275753 [admin]
11:59 <arturo> cloudvirt1023 is affected by T276208 and cannot be rebooted. Put it back into the 'ceph' host aggregate [admin]
10:43 <arturo> moved cloudvirt1013 cloudvirt1032 cloudvirt1037 back into the 'ceph' host aggregate [admin]
10:13 <arturo> moved cloudvirt1023 to 'maintenance' host aggregate. Drain it with `wmcs-drain-hypervisor` to reboot it for T275753 [admin]
2021-03-01 §
20:12 <andrewbogott> removing novaadmin from all projects save 'admin' for T274385 [admin]
19:51 <andrewbogott> removing novaobserver from all projects save 'observer' for T274385 [admin]
19:50 <andrewbogott> adding inherited domain-wide roles to novaadmin and novaobserver as per T274385 [admin]
2021-02-28 §
04:54 <andrewbogott> restarted redis-server on tools-redis-1003 and tools-redis-1004 in an attempt to reduce replag, no real change detected [admin]
2021-02-27 §
00:33 <andrewbogott> sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && reboot' [admin]
00:28 <andrewbogott> sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && echo reboot' [admin]
00:09 <andrewbogott> sudo cumin "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i stretch && uname -r | grep -v 4.19.0-0.bpo.14-amd64 && reboot' [admin]
2021-02-26 §
14:58 <dcaro> [eqiad] rebooting cloudcephosd1015 (last osd \o/) for kernel upgrade (T275753) [admin]
14:51 <dcaro> [eqiad] rebooting cloudcephosd1014 for kernel upgrade (T275753) [admin]
14:44 <dcaro> [eqiad] rebooting cloudcephosd1013 for kernel upgrade (T275753) [admin]
14:38 <dcaro> [eqiad] rebooting cloudcephosd1012 for kernel upgrade (T275753) [admin]
14:31 <dcaro> [eqiad] rebooting cloudcephosd1011 for kernel upgrade (T275753) [admin]
14:25 <dcaro> [eqiad] rebooting cloudcephosd1010 for kernel upgrade (T275753) [admin]
14:17 <dcaro> [eqiad] rebooting cloudcephosd1009 for kernel upgrade (T275753) [admin]
13:54 <dcaro> [eqiad] downtimed alert1001 Ceph OSDs down alert until 18:00 GMT+1, as that alert is not attached to the host being rebooted (T275753) [admin]
13:51 <dcaro> [eqiad] rebooting cloudcephosd1008 for kernel upgrade (T275753) [admin]
13:45 <dcaro> [eqiad] rebooting cloudcephosd1007 for kernel upgrade (T275753) [admin]
13:38 <dcaro> [eqiad] rebooting cloudcephosd1006 for kernel upgrade (T275753) [admin]
12:07 <dcaro> [eqiad] rebooting cloudcephosd1005 for kernel upgrade (T275753) [admin]
12:00 <arturo> rebooting cloudcontrol1003 for kernel upgrade (T275753) [admin]
11:42 <arturo> rebooting cloudcontrol1004 for kernel upgrade (T275753) [admin]
11:41 <dcaro> [eqiad] rebooting cloudcephosd1004 for kernel upgrade (T275753) [admin]
11:32 <dcaro> [eqiad] rebooting cloudcephosd1003 for kernel upgrade (T275753) [admin]
11:30 <arturo> rebooting cloudcontrol1005 for kernel upgrade (T2 [admin]