2020-02-29 §
16:32 <bstorm_> downtimed the smart alert on cloudvirt1009 until Monday since apparently predictive failures flap T244986 [admin]
2020-02-26 §
22:03 <jeh> powering down cloudvirt1014 for hardware maintenance [admin]
2020-02-25 §
16:08 <andrewbogott> changing neutron's rabbitmq password because oslo is having trouble parsing some of the characters in the password [admin]
15:26 <andrewbogott> updated the cell_mapping record in the nova_api database to add the second rabbitmq server to the transport_url field [admin]
15:26 <andrewbogott> updated the cell_mapping record in the nova_api database to set the db uri to 'mysql+pymysql' -- this in response to a deprecation notice [admin]
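Note on the 2020-02-25 entries: an oslo.messaging transport URL can list several brokers, comma-separated, each with its own credentials. Reserved characters in a password (`@`, `/`, `:` and similar) have to be percent-encoded so they are not read as URL delimiters. The log only says the password was changed, so the encoding shown here is one plausible workaround and not what was done. Below is a minimal sketch of building such a URL. The helper name, user, password, and hostnames are all hypothetical and do not come from the log.

```python
from urllib.parse import quote


def build_transport_url(user, password, hosts, port=5672, vhost=""):
    """Compose a multi-host RabbitMQ transport URL in oslo.messaging style.

    Hypothetical helper for illustration; not part of nova or oslo.
    """
    # Percent-encode credentials so characters such as '@', '/' or ':'
    # cannot be mistaken for URL delimiters when the URL is parsed.
    cred = f"{quote(user, safe='')}:{quote(password, safe='')}"
    # Each broker gets its own "user:pass@host:port" entry, joined by commas.
    netloc = ",".join(f"{cred}@{host}:{port}" for host in hosts)
    return f"rabbit://{netloc}/{vhost}"


# Made-up values: two brokers and a password containing reserved characters.
url = build_transport_url(
    "nova",
    "p@ss/w:rd",
    ["rabbit01.example.org", "rabbit02.example.org"],
)
print(url)
```

A string like this is what would go in the `transport_url` field of the `cell_mapping` record. The database connection URI in the same record uses the `mysql+pymysql://` scheme, as noted in the 15:26 entry.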
2020-02-24 §
12:16 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-speaker-peer-add bgpspeaker cr2-codfw` (T245606) [admin]
12:16 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-speaker-peer-add bgpspeaker cr1-codfw` (T245606) [admin]
12:09 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-peer-create --peer-ip --remote-as 65002 cr2-codfw` (T245606) [admin]
12:09 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-peer-create --peer-ip --remote-as 65002 cr1-codfw` (T245606) [admin]
12:06 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-peer-delete 17b8c2a3-f0ce-4d50-a265-18ccac703c61` (T245606) [admin]
10:59 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-speaker-peer-add bgpspeaker bgppeer` (T245606) [admin]
10:56 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# neutron bgp-peer-create --peer-ip --remote-as 65002 bgppeer` (T245606) [admin]
2020-02-21 §
12:48 <arturo> [codfw1dev] running `root@cloudcontrol2001-dev:~# neutron bgp-speaker-network-add bgpspeaker wan-transport-codfw` (T245606) [admin]
12:46 <arturo> [codfw1dev] created bgpspeaker for AS64711 (T245606) [admin]
12:42 <arturo> [codfw1dev] run `sudo neutron-db-manage upgrade head` to upgrade the db schema for neutron bgp tables [admin]
11:51 <arturo> [codfw1dev] create a neutron subnet pool for each subnet object we have and manually update the DB to associate them with each other (T245606) [admin]
11:49 <arturo> [codfw1dev] rename neutron address scope `no-nat` to `bgp` (T245606) [admin]
11:37 <arturo> [codfw1dev] cleanup unused neutron subnet pools from previous address scope tests (T244851) [admin]
2020-02-20 §
19:22 <andrewbogott> updating designate pool config for https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/572213/ [admin]
15:33 <andrewbogott> migrating all VMs on cloudvirt1014 to cloudvirt1022 [admin]
13:35 <arturo> [codfw1dev] disable puppet in cloudcontrol servers to hack neutron.conf for tests related to T245606 [admin]
13:33 <arturo> [codfw1dev] disable puppet in cloudnet servers to hack neutron.conf for tests related to T245606 [admin]
2020-02-18 §
22:19 <andrewbogott> transferred the tools.wmcloud.org. zone to the tools project [admin]
22:16 <andrewbogott> moved wmcloud.org dns domain to the cloud-infra project [admin]
21:02 <andrewbogott> adding .eqiad1.wikimedia.cloud records to all existing eqiad1 VMs, updating all eqiad1 internal pointer records to reference the new eqiad1.wikimedia.cloud fqdns. [admin]
09:44 <arturo> deleted DNS zone wmcloud.org and try re-creating it [admin]
2020-02-14 §
10:35 <arturo> running `root@cloudcontrol2001-dev:~# designate server-create --name ns1.openstack.codfw1dev.wikimediacloud.org.` (T243766) [admin]
10:32 <arturo> running `root@cloudcontrol1004:~# designate server-create --name ns1.openstack.eqiad1.wikimediacloud.org.` (T243766) [admin]
10:32 <arturo> running `root@cloudcontrol1004:~# designate server-create --name ns0.openstack.eqiad1.wikimediacloud.org.` (T243766) [admin]
2020-02-12 §
13:38 <arturo> [codfw1dev] add reference to subnetpool to the instance subnet `MariaDB [neutron]> update subnets set subnetpool_id='d129650d-d4be-4fe1-b13e-6edb5565cb4a' where id = '7adfcebe-b3d0-4315-92fe-e8365cc80668';` (T244851) [admin]
2020-02-11 §
13:46 <arturo> [codfw1dev] creating some neutron objects to investigate T244851 (subnets, subnet pools, address scopes, ...) [admin]
12:40 <arturo> [codfw1dev] delete unknown address scope 'wmcs-v4-scope': `root@cloudcontrol2001-dev:~# openstack address scope delete 078cfd71-117b-4aac-9197-6ebbbb7dd3de` (T244851) [admin]
12:40 <arturo> [codfw1dev] delete unknown subnet pool 'cloudinstancesb-v4-pool0': `root@cloudcontrol2001-dev:~# openstack subnet pool delete d23a9b88-5c3d-4a53-ab88-053233a75365` (T244851) [admin]
2020-02-07 §
18:11 <jeh> shutdown cloudvirt1016 for hardware maintenance T241882 [admin]
2020-02-06 §
14:44 <jeh> update apt packages on cloudvirt1015 T220853 [admin]
14:28 <jeh> run hardware tests on cloudvirt1015 T220853 [admin]
2020-01-28 §
17:24 <arturo> [codfw1dev] root@cloudcontrol2001-dev:~# designate server-create --name ns0.openstack.codfw1dev.wikimediacloud.org. (T243766) [admin]
10:18 <arturo> [codfw1dev] created DNS record `bastion-codfw1dev-01.codfw1dev.wmcloud.org A` (T242976, T229441) [admin]
10:13 <arturo> [codfw1dev] the zone `codfw1dev.wmcloud.org` belongs now to the `cloudinfra-codfw1dev` project (T242976) [admin]
10:11 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# openstack zone create --description "main DNS domain for public addresses" --email "root@wmflabs.org" --type PRIMARY --ttl 3600 codfw1dev.wmcloud.org.` (T242976 and T243766) [admin]
09:53 <arturo> restart apache2 in labweb1001/1002 because of horizon errors [admin]
09:47 <arturo> created DNS zone wmcloud.org in eqiad1 and transferred it to the cloudinfra project (T242976); right now its only use is to delegate the codfw1dev.wmcloud.org subdomain to designate in the other deployment [admin]
2020-01-27 §
12:45 <arturo> [codfw1dev] manually move the new domain to the `cloudinfra-codfw1dev` project clouddb2001-dev: `[designate]> update zones set tenant_id='cloudinfra-codfw1dev' where id = '4c75410017904858a5839de93c9e8b3d';` T243556 [admin]
12:44 <arturo> [codfw1dev] `root@cloudcontrol2001-dev:~# openstack zone create --description "main DNS domain for VMs" --email "root@wmflabs.org" --type PRIMARY --ttl 3600 codfw1dev.wikimedia.cloud.` T243556 [admin]
2020-01-24 §
15:10 <jeh> remove icinga downtime for cloudvirt1013 T241313 [admin]
12:52 <arturo> repooling cloudvirt1013 after HW got fixed (T241313) [admin]
2020-01-21 §
17:43 <bstorm_> remounting /mnt/nfs/dumps-labstore1007.wikimedia.org/ on all dumps-mounting projects [admin]
10:24 <arturo> running `sudo systemctl restart apache2.service` in both labweb servers to try mitigating T240852 [admin]
2020-01-15 §
16:59 <bd808> Changed the config for cloud-announce mailing list so that list admins do not get bounce unsubscribe notices [admin]
2020-01-14 §
14:03 <arturo> icinga downtime all cloudvirts for another 2h for fixing some icinga checks [admin]