2020-11-25 §
19:35 <bstorm> repairing ceph pg `instructing pg 6.91 on osd.117 to repair` [admin]
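The quoted message above is what `ceph pg repair` prints when invoked. A minimal sketch of the sequence, assuming the placement group was flagged inconsistent by scrubbing:

```shell
# Find placement groups flagged inconsistent by deep scrub
ceph health detail | grep -i inconsistent

# Ask Ceph to repair the damaged PG; this prints
# "instructing pg 6.91 on osd.117 to repair"
ceph pg repair 6.91

# Watch cluster events while the repair runs
ceph -w
```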
09:31 <_dcaro> The OSD actually seems to be up and running, despite that misleading log; will leave it to see if the cluster comes fully healthy (T268722) [admin]
08:54 <_dcaro> Unsetting noup/nodown to allow re-shuffling of the pgs that osd.44 had, will try to rebuild it (T268722) [admin]
08:45 <_dcaro> Tried resetting the class for osd.44 to ssd, no luck, the cluster is in noout/norebalance to avoid data shuffling (opened T268722) [admin]
08:45 <_dcaro> Tried resetting the class for osd.44 to ssd, no luck, the cluster is in noout/norebalance to avoid data shuffling (`root@cloudcephosd1005:/var/lib/ceph/osd/ceph-44# ceph osd crush set-device-class ssd osd.44`) [admin]
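The noout/norebalance and noup/nodown flag handling mentioned in the entries above is done with `ceph osd set`/`unset`. A sketch, assuming cluster admin access:

```shell
# Pin cluster state while investigating the stuck OSD:
# don't mark OSDs out, don't move data around
ceph osd set noout
ceph osd set norebalance

# ...investigate / restart the OSD service...

# Re-enable state changes so the PGs the OSD held can re-shuffle
ceph osd unset noup
ceph osd unset nodown
ceph osd unset noout
ceph osd unset norebalance
```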
08:19 <_dcaro> Restarting service osd.44 resulted in osd.44 being unable to start due to some config inconsistency (cannot reset class to hdd) [admin]
08:16 <_dcaro> After enabling auto pg scaling on ceph eqiad cluster, osd.44 (cloudcephosd1005) got stuck, trying to restart the osd service [admin]
2020-11-22 §
17:40 <andrewbogott> apt-get upgrade on cloudservices1003/1004 [admin]
17:32 <andrewbogott> upgrading Designate on cloudservices1003/1004 to Stein [admin]
2020-11-20 §
12:44 <arturo> [codfw1dev] install conntrackd in cloudnet2003-dev/cloudnet2002-dev to research l3 agent HA reliability [admin]
09:26 <arturo> icinga downtime labstore1006 RAID checks for 10 days (T268281) [admin]
2020-11-17 §
19:21 <andrewbogott> draining cloudvirt1012 to experiment with libvirt/cpu things [admin]
2020-11-15 §
11:21 <arturo> icinga downtime cloudbackup2002 for 48h (T267865) [admin]
2020-11-10 §
16:37 <arturo> icinga downtime toolschecker for 2h because of toolsdb maintenance (T266587) [admin]
11:24 <arturo> [codfw1dev] enable puppet in puppetmaster01.cloudinfra-codfw1dev (disabled for unspecified reasons) [admin]
2020-11-09 §
12:42 <arturo> restarted neutron l3 agent in cloudnet1003 bc it still had the old default route (T265288) [admin]
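Restarting the L3 agent as in the entry above is a plain systemd operation on the network node. A sketch, assuming the standard Debian service name for this Neutron release:

```shell
# On the cloudnet host: restart the Neutron L3 agent so it
# rebuilds router state (e.g. picks up the changed default route)
systemctl restart neutron-l3-agent

# Confirm the agent reports alive again (era-appropriate CLI)
neutron agent-list | grep -i l3
```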
12:41 <arturo> `root@cloudcontrol1005:~# neutron subnet-delete dcbb0f98-5e9d-4a93-8dfc-4e3ec3c44dcc` (T265288) [admin]
12:40 <arturo> `root@cloudcontrol1005:~# neutron router-gateway-set --fixed-ip subnet_id=7c6bcc12-212f-44c2-9954-5c55002ee371,ip_address=185.15.56.244 cloudinstances2b-gw wan-transport-eqiad` (T265288) [admin]
12:19 <arturo> subnet 185.15.56.240/29 has id 7c6bcc12-212f-44c2-9954-5c55002ee371 in neutron (T265288) [admin]
12:19 <arturo> `root@cloudcontrol1005:~# neutron subnet-create --gateway 185.15.56.241 --name cloud-instances-transport1-b-eqiad1 --ip-version 4 --disable-dhcp wan-transport-eqiad 185.15.56.240/29` (T265288) [admin]
12:15 <arturo> icinga-downtime toolschecker for 2h (T265288) [admin]
2020-11-02 §
13:36 <arturo> (typo: dcaro) [admin]
13:35 <arturo> added dcar as projectadmin & user (T266068) [admin]
2020-10-29 §
16:57 <bstorm> silenced deployment-prep project alerts for 60 days since the downtime expired [admin]
08:12 <arturo> force-powercycling cloudcephosd1006 [admin]
2020-10-25 §
16:20 <andrewbogott> adding cloudvirt1038 to the 'ceph' aggregate and removing from the 'spare' aggregate. We need this space while waiting on network upgrades for empty cloudvirts (T216195) [admin]
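Moving a hypervisor between host aggregates, as described above, is two OpenStack client calls. A sketch, assuming aggregates named 'ceph' and 'spare' as in the entry:

```shell
# Make the hypervisor schedulable in the ceph aggregate...
openstack aggregate add host ceph cloudvirt1038

# ...and stop offering it as a spare
openstack aggregate remove host spare cloudvirt1038

# Verify the membership change
openstack aggregate show ceph
```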
2020-10-23 §
11:30 <arturo> [codfw1dev] openstack --os-project-id cloudinfra-codfw1dev recordset create --type PTR --record nat.cloudgw.codfw1dev.wikimediacloud.org. --description "created by hand" 0-29.57.15.185.in-addr.arpa. 1.0-29.57.15.185.in-addr.arpa. (T261724) [admin]
10:09 <arturo> [codfw1dev] doing DNS changes for the cloudgw PoC, including designate and https://gerrit.wikimedia.org/r/c/operations/dns/+/635965 (T261724) [admin]
2020-10-22 §
10:46 <arturo> [codfw1dev] rebooting cloudinfra-internal-puppetmaster-01.cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud to try fixing some DNS weirdness [admin]
09:43 <arturo> enabling puppet in cloudcontrol1003 (message said "please re-enable after 2020-10-22 06:00UTC") [admin]
2020-10-21 §
14:36 <andrewbogott> running apt-get update && apt-get install -y facter on all cloud-vps instances [admin]
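Running one command across all Cloud VPS instances is typically done with a fleet-execution tool. A sketch using Cumin (an assumption; the entry does not say which tool was used), with a hypothetical host selector:

```shell
# 'A:all' is a hypothetical Cumin alias for the whole fleet.
# Run the package refresh non-interactively on each instance.
sudo cumin 'A:all' 'apt-get update && apt-get install -y facter'
```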
10:31 <arturo> [codfw1dev] reimaging labtestvirt2003 (cloudgw) to test puppet code (T261724) [admin]
08:56 <arturo> [codfw1dev] reimaging labtestvirt2003 (cloudgw) to test puppet code (T261724) [admin]
2020-10-20 §
15:47 <arturo> changing DNS recursor ACLs (https://gerrit.wikimedia.org/r/c/operations/puppet/+/635314) this can be reverted any time if it causes problems (T261724) [admin]
14:49 <arturo> [codfw1dev] reimaging labtestvirt2003 (cloudgw) to test puppet code (T261724) [admin]
2020-10-19 §
01:41 <andrewbogott> deleting all Precise base images [admin]
01:36 <andrewbogott> deleting all unused Jessie base images [admin]
2020-10-18 §
23:26 <andrewbogott> deleting all Trusty base images [admin]
21:50 <andrewbogott> migrating all currently used ceph images to rbd [admin]
2020-10-16 §
09:29 <arturo> [codfw1dev] still some DNS weirdness, investigating [admin]
09:25 <arturo> [codfw1dev] hard-rebooting bastion-codfw1dev-02, seems in bad shape, doesn't even wake up in the virsh console [admin]
09:18 <arturo> [codfw1dev] live-hacked cloudservices2002-dev /etc/powerdns/recursor.conf file to include cloud-codfw1dev-floating CIDR (185.15.57.0/29) while https://gerrit.wikimedia.org/r/c/operations/puppet/+/634050 is in review, so VMs with a floating IP can query the DNS recursor (T261724) [admin]
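The live hack above amounts to appending the floating-IP range to the recursor's client ACL. A sketch of the relevant `/etc/powerdns/recursor.conf` fragment (the pre-existing `allow-from` entries are placeholders; only the appended CIDR comes from the entry):

```ini
# /etc/powerdns/recursor.conf
# Networks allowed to query the recursor; 185.15.57.0/29
# (cloud-codfw1dev-floating) is appended so VMs with a
# floating IP can query the DNS recursor (T261724)
allow-from=172.16.0.0/21, 185.15.57.0/29
```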
09:01 <arturo> [codfw1dev] basic network connectivity seems stable after cleaning up everything related to address scopes (T261724) [admin]
2020-10-15 §
15:17 <arturo> [codfw1dev] try cleaning up anything related to address scopes in the neutron database (T261724) [admin]
13:56 <arturo> [codfw1dev] drop neutron l3 agent hacks in cloudnet2002/2003-dev (T261724) [admin]
2020-10-13 §
17:54 <andrewbogott> rebuilding cloudvirt1021 for backy support [admin]
15:22 <andrewbogott> draining cloudvirt1021 so I can rebuild it with backy support [admin]
14:19 <andrewbogott> rebuilding cloudvirt1022 with backy support [admin]
14:03 <andrewbogott> draining cloudvirt1022 so I can rebuild it with backy support [admin]