2019-10-02 §
09:46 <arturo> codfw1dev: cleanup leftover neutron agents [admin]
2019-09-30 §
10:21 <arturo> we installed ferm in every VM by mistake. Deleting it and forcing a puppet agent run to try to go back to a clean state. [admin]
09:38 <arturo> downtime toolschecker for 24h [admin]
09:33 <arturo> force update ferm cloud-wide (in all VMs) for T153468 [admin]
2019-08-18 §
10:39 <arturo> rebooting cloudvirt1023 for new interface names configuration [admin]
10:34 <arturo> downtimed cloudvirt1023 for 2 days [admin]
2019-08-05 §
17:17 <bd808> Set downtime on gridengine and kubernetes webservice checks in icinga until 2019-09-02 (flaky tests) [admin]
2019-07-29 §
20:14 <bd808> Restarted maintain-kubeusers on tools-k8s-master-01 (T194859) [admin]
2019-07-25 §
12:32 <arturo> eqiad1/glance: debian-9.9-stretch image deprecates debian-9.8-stretch (T228983) [admin]
09:59 <arturo> (codfw1dev) drop missing glance images (T228972) [admin]
09:32 <arturo> (codfw1dev) deleting a bunch of VMs that were running on now-missing hypervisors [admin]
09:31 <arturo> (codfw1dev) deleting a bunch of VMs in ERROR and SHUTDOWN state [admin]
09:27 <arturo> last log entry refers to the codfw1dev deployment [admin]
09:27 <arturo> cleanup `nova service-list` from old hypervisors (labtest*) [admin]
09:23 <arturo> refreshed nova DB grants in clouddb2001-dev for the codfw1dev deployment [admin]
08:47 <arturo> cleanup the cloud-announce pending emails (spam) [admin]
2019-07-23 §
19:43 <andrewbogott> restarting rabbitmq-server on cloudcontrol1003 and 1004 [admin]
2019-07-22 §
23:44 <bd808> Restarted maintain-kubeusers on tools-k8s-master-01 (T228529) [admin]
2019-07-11 §
22:07 <bd808> Ran `sudo systemctl stop designate_floating_ip_ptr_records_updater.service` on cloudcontrol1003 [admin]
22:01 <bd808> `sudo apt-get install python2.7-dbg` on cloudcontrol1003 to debug hung python process [admin]
21:48 <bd808> Ran `sudo systemctl stop designate_floating_ip_ptr_records_updater.service` on cloudcontrol1004 [admin]
2019-06-25 §
16:05 <bstorm_> updated python3.4 to update4 wherever it was installed on Jessie VMs to prevent issues with broken update3. [admin]
14:56 <bstorm_> Updated python 3.4 on the labs-puppetmaster server [admin]
2019-06-03 §
15:55 <arturo> T221769 rebooting cloudservices1003 after bootstrapping is apparently completed [admin]
2019-05-28 §
21:42 <bstorm_> unmounting labstore1003-scratch on all cloud clients [admin]
18:14 <bstorm_> T209527 switched mounts from labstore1003 to cloudstore1008 for scratch [admin]
2019-05-20 §
17:25 <arturo> T223923 dropped compat-network config from /etc/network/interfaces in eqiad1/codfw1dev neutron nodes [admin]
17:22 <arturo> T223923 dropped br-compat bridges and vlan interfaces (1102 and 2102) in eqiad1/codfw1dev neutron nodes [admin]
17:07 <arturo> T223923 dropped compat-network configuration from the neutron database in eqiad1 [admin]
16:55 <arturo> T223923 dropped compat-network configuration from the neutron database in codfw1dev [admin]
2019-05-15 §
17:00 <andrewbogott> touching /root/firstboot_done on all VMs that cumin can reach. This will prevent firstboot.sh from running a second time if/when any of these are rebooted. T223370 [admin]
2019-04-26 §
15:51 <arturo> andrew updated dns servers for the cloud-instances2-b-eqiad subnet in neutron: and [admin]
2019-04-25 §
11:14 <arturo> T221760 increased size of conntrack table [admin]
2019-04-24 §
12:54 <arturo> T220051 puppet broken in every VM in Cloud VPS, fixing right now [admin]
2019-04-22 §
11:14 <arturo> create by hand /var/cache/labsaliaser/labs-ip-aliases.json in cloudservices2002-dev (T218575) [admin]
2019-04-16 §
22:55 <bd808> cloudcontrol2003-dev: added `exit 0` to /etc/cron.hourly/keystone to stop cron spam on partially configured cluster [admin]
12:08 <arturo> rebooting cloudvirt200[123]-dev because of deep changes in config [admin]
11:27 <arturo> T219626 add DB grants for neutron and glance to clouddb2001-dev (codfw1dev) [admin]
10:37 <arturo> T219626 replace with in the clouddb2001-dev database (codfw1dev deployment) [admin]
10:29 <arturo> T219626 replace labtestcontrol2003 with cloudcontrol2001-dev in the clouddb2001-dev database (codfw1dev deployment) [admin]
2019-04-15 §
13:08 <arturo> T219626 add DB grants for keystone/nova/nova_api to clouddb2001-dev (codfw1dev) [admin]
2019-04-13 §
18:25 <bd808> Restarted nova-compute service on cloudvirt1015 (T220853) [admin]
2019-04-02 §
19:52 <andrewbogott> installed new base Stretch image. It has updated packages and runs apt-get dist-upgrade on first boot. [admin]
2019-03-29 §
14:34 <andrewbogott> moving tools-static.wmflabs.org to point to tools-static-13 in eqiad1-r [admin]
00:00 <bstorm_> T193264 Added osm.db.svc.eqiad.wmflabs to cloud DNS [admin]
2019-03-25 §
00:40 <bd808> Restarted maintain-dbusers on labstore1004. Process hung up on failed LDAP connection. [admin]
2019-03-21 §
19:32 <andrewbogott> restarting keystone on cloudcontrol1003 [admin]
13:49 <gtirloni> converted openstack cronjobs to systemd timers (T210818) [admin]
00:49 <gtirloni> nfs-exportd interval changed from 60s to 300s (T217086) [admin]
2019-03-15 §
16:00 <gtirloni> increased nscd cache size (T217280) [admin]