2020-12-04 §
22:33 <andrewbogott> moving cloudvirt1023 back into the ceph aggregate; it doesn't need upgrades after all T269467 [admin]
22:23 <andrewbogott> moving cloudvirt1023 out of the ceph aggregate and into maintenance for T269467 [admin]
21:06 <andrewbogott> putting cloudvirt1025 and 1026 back into service because I'm pretty sure they're fixed. T269313 [admin]
12:12 <arturo> manually running `wmcs-purge-backups` again on cloudvirt1024 (T269419) [admin]
11:25 <arturo> icinga downtime cloudvirt1024 for 6 days, to avoid paging noises (T269419) [admin]
11:25 <arturo> last log line referencing cloudvirt1024 is a mistake (T269313) [admin]
11:24 <arturo> icinga downtime cloudvirt1024 for 6 days, to avoid paging noises (T269313) [admin]
10:28 <arturo> manually running `wmcs-purge-backups` on cloudvirt1024 (T269419) [admin]
10:23 <arturo> setting expiration to 2020-12-03 for the oldest backy snapshot of every VM in cloudvirt1024 (T269419) [admin]
09:54 <arturo> icinga downtime cloudvirt1025 for 6 days (T269313) [admin]
2020-12-03 §
23:21 <andrewbogott> removing all osds on cloudcephosd1004 for rebuild, T268746 [admin]
21:45 <andrewbogott> removing all osds on cloudcephosd1005 for rebuild, T268746 [admin]
19:51 <andrewbogott> removing all osds on cloudcephosd1006 for rebuild, T268746 [admin]
17:01 <arturo> icinga downtime cloudvirt1025 for 48h to debug network issue T269313 [admin]
16:56 <arturo> rebooting cloudvirt1025 to debug network issue T269313 [admin]
16:38 <dcaro> Reimaging cloudvirt1026 (T216195) [admin]
13:24 <andrewbogott> removing all osds on cloudcephosd1008 for rebuild, T268746 [admin]
02:55 <andrewbogott> removing all osds on cloudcephosd1009 for rebuild, T268746 [admin]
2020-12-02 §
20:03 <andrewbogott> removing all osds on cloudcephosd1010 for rebuild, T268746 [admin]
17:25 <arturo> [15:51] failing over neutron virtual router in eqiad1 (T268335) [admin]
15:36 <arturo> conntrackd is now up and running in cloudnet1003/1004 nodes (T268335) [admin]
15:33 <arturo> [codfw1dev] conntrackd is now up and running in cloudnet200x-dev nodes (T268335) [admin]
15:08 <andrewbogott> removing all osds on cloudcephosd1012 for rebuild, T268746 [admin]
12:41 <arturo> disable puppet in all cloudnet servers to merge conntrackd change T268335 [admin]
11:12 <dcaro> Reset the properties for the flavor g2.cores8.ram16.disk1120 to correct quotes (T269172) [admin]
09:56 <arturo> moved cloudvirts 1030, 1029, 1028, 1027, 1026, 1025 away from the 'standard' host aggregate to 'maintenance' (T269172) [admin]
2020-12-01 §
20:06 <andrewbogott> removing all osds on cloudcephosd1014 for rebuild, T268746 [admin]
12:04 <arturo> restarting neutron l3 agents to pick up config change [admin]
11:48 <arturo> merging change to dmz_dir, detail list of private address https://gerrit.wikimedia.org/r/c/operations/puppet/+/641977 [admin]
2020-11-30 §
18:12 <andrewbogott> removing all osds from cloudcephosd1015 in order to investigate T268746 [admin]
2020-11-29 §
17:18 <andrewbogott> cleaning up some logfiles in tools-sgecron-01 — drive is full [admin]
2020-11-26 §
22:58 <andrewbogott> deleting /var/log/haproxy logs older than 7 days in cloudcontrol100x. We need log rotation here it seems. [admin]
15:53 <dcaro> Created private flavor g2.cores8.ram16.disk1120 for wikidumpparse (T268190) [admin]
2020-11-25 §
19:35 <bstorm> repairing ceph pg `instructing pg 6.91 on osd.117 to repair` [admin]
09:31 <_dcaro> The OSD seems to be up and running actually, though there's that misleading log, will leave it to see if the cluster becomes fully healthy (T268722) [admin]
08:54 <_dcaro> Unsetting noup/nodown to allow re-shuffling of the pgs that osd.44 had, will try to rebuild it (T268722) [admin]
08:45 <_dcaro> Tried resetting the class for osd.44 to ssd, no luck, the cluster is in noout/norebalance to avoid data shuffling (opened T268722) [admin]
08:45 <_dcaro> Tried resetting the class for osd.44 to ssd, no luck, the cluster is in noout/norebalance to avoid data shuffling (opened root@cloudcephosd1005:/var/lib/ceph/osd/ceph-44# ceph osd crush set-device-class ssd osd.44) [admin]
08:19 <_dcaro> Restarting service osd.44 resulted in osd.44 being unable to start due to some config inconsistency (can not reset class to hdd) [admin]
08:16 <_dcaro> After enabling auto pg scaling on ceph eqiad cluster, osd.44 (cloudcephosd1005) got stuck, trying to restart the osd service [admin]
08:16 <_dcaro> After enabling auto pg scaling on ceph eqiad cluster, osd.44 (cloudcephosd1005) got stuck, trying to restart [admin]
2020-11-22 §
17:40 <andrewbogott> apt-get upgrade on cloudservices1003/1004 [admin]
17:32 <andrewbogott> upgrading Designate on cloudservices1003/1004 to Stein [admin]
2020-11-20 §
12:44 <arturo> [codfw1dev] install conntrackd in cloudnet2003-dev/cloudnet2002-dev to research l3 agent HA reliability [admin]
09:26 <arturo> icinga downtime labstore1006 RAID checks for 10 days (T268281) [admin]
2020-11-17 §
19:21 <andrewbogott> draining cloudvirt1012 to experiment with libvirt/cpu things [admin]
2020-11-15 §
11:21 <arturo> icinga downtime cloudbackup2002 for 48h (T267865) [admin]
2020-11-10 §
16:37 <arturo> icinga downtime toolschecker for 2h because of toolsdb maintenance (T266587) [admin]
11:24 <arturo> [codfw1dev] enable puppet in puppetmaster01.cloudinfra-codfw1dev (disabled for unspecified reasons) [admin]
2020-11-09 §
12:42 <arturo> restarted neutron l3 agent in cloudnet1003 bc it still had the old default route (T265288) [admin]