2021-04-28 §
21:11 <andrewbogott> cleaning up more references to deleted hypervisors with delete from services where topic='compute' and version != 53; [admin]
20:48 <andrewbogott> cleaning up references to deleted hypervisors with mysql:root@localhost [nova_eqiad1]> delete from compute_nodes where hypervisor_version != '5002000'; [admin]
19:40 <andrewbogott> putting cloudvirt1040 into the maintenance aggregate pending more info about T281399 [admin]
18:11 <andrewbogott> adding cloudvirt1040, 1041 and 1042 to the 'ceph' host aggregate -- T275081 [admin]
11:06 <dcaro> All ceph server side upgraded to Octopus! \o/ (T280641) [admin]
10:57 <dcaro> Got a PG getting stuck on 'remapping' after the OSD came up, had to unset the norebalance and then set it again to get it unstuck (T280641) [admin]
10:34 <dcaro> Slow/blocked ops from cloudcephmon03, "osd_failure(failed timeout osd.32..." (cloudcephosd1005), unset the cluster noout/norebalance and they went away in a few secs, setting it again and continuing... (T280641) [admin]
09:03 <dcaro> Waiting for slow heartbeats from osd.58(cloudcephosd1002) to recover... (T280641) [admin]
08:59 <dcaro> During the upgrade, started getting warning 'slow osd heartbeats in the back', meaning that pings between osds are really slow (up to 190s) all from osd.58, currently on cloudcephosd1002 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting warning 'slow osd heartbeats in the back', meaning that pings between osds are really slow (up to 190s) all from osd.58 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting warning 'slow osd heartbeats in the back', meaning that pings between osds are really slow (up to 190s) (T280641) [admin]
08:21 <dcaro> Upgrading all the ceph osds on eqiad (T280641) [admin]
08:21 <dcaro> The clock skew seems intermittent, there's another task to follow it T275860 (T280641) [admin]
08:18 <dcaro> All eqiad ceph mons and mgrs upgraded (T280641) [admin]
08:18 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, cloudcephmon1001, they are back (T280641) [admin]
08:15 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, it went away, I'm guessing systemd-timesyncd fixed it (T280641) [admin]
08:14 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, looking (T280641) [admin]
07:58 <dcaro> Upgrading ceph services on eqiad, starting with mons/managers (T280641) [admin]
2021-04-27 §
14:10 <dcaro> codfw.openstack upgraded ceph libraries to 15.2.11 (T280641) [admin]
13:07 <dcaro> codfw.openstack cloudvirt2002-dev done, taking cloudvirt2003-dev out to upgrade ceph libraries (T280641) [admin]
13:00 <dcaro> codfw.openstack cloudvirt2001-dev back online, taking cloudvirt2002-dev out to upgrade ceph libraries (T280641) [admin]
10:51 <dcaro> ceph.eqiad: cinder pool got its pg_num increased to 1024, re-shuffle started (T273783) [admin]
10:48 <dcaro> ceph.eqiad: Tweaked the target_size_ratio of all the pools, enabling autoscaler (it will increase cinder pool only) (T273783) [admin]
09:14 <dcaro> manually force stopping the server puppetmaster-01 to unblock migration (in codfw1) [admin]
09:14 <dcaro> manually force stopping the server puppetmaster-01 to unblock migration [admin]
08:59 <dcaro> manually force stopping the server exploding-head on codfw, to try cold migration [admin]
08:47 <dcaro> restarting nova-compute on cloudvirt2001-dev after upgrading ceph libraries to 15.2.11 [admin]
2021-04-26 §
20:56 <andrewbogott> deleting spurious 'codfw1dev' and 'codw1dev-4' regions in the dallas deployment; regions without endpoints break a bunch of things [admin]
09:45 <dcaro> draining cloudvirt2001-dev with the new cookbooks (T280641) [admin]
2021-04-23 §
13:49 <dcaro> testing the drain_cloudvirt cookbook on codfw1 openstack cluster, draining cloudvirt2001 (T280641) [admin]
11:12 <dcaro> testing the drain_cloudvirt cookbook on codfw1 openstack cluster (T280641) [admin]
09:32 <dcaro> finished upgrade of ceph cluster on codfw1 using exclusively cookbooks (T280641) [admin]
09:17 <dcaro> testing the upgrade_osds cookbook on codfw1 ceph cluster (T280641) [admin]
08:17 <dcaro> testing the upgrade_mons cookbook on codfw1 ceph cluster (T280641) [admin]
2021-04-21 §
17:59 <dcaro> all monitors upgraded on codfw1 with one cookbook `cookbook --verbose -c ~/.config/spicerack/cookbook.yaml wmcs.ceph.upgrade_mons --monitor-node-fqdn cloudcephmon2002-dev.codfw.wmnet` (T280641) [admin]
17:47 <dcaro> upgrading monitors and mgr nodes on codfw ceph cluster (T280641) [admin]
13:26 <dcaro> testing ceph upgrade cookbook on cloudcephmon2002-dev (T280641) [admin]
2021-04-20 §
20:21 <andrewbogott> reboot cloudservices1003 [admin]
20:13 <andrewbogott> reboot cloudservices1004 [admin]
2021-04-19 §
08:40 <dcaro> enabling puppet on labstore1004 after mysql restart (T279657) [admin]
08:09 <dcaro> downtiming labstore1004 and stopping puppet for mysql restart (T279657) [admin]
2021-04-14 §
10:48 <dcaro> Upgrade of codfw ceph to octopus 15.2.20 done, will run some performance tests now (T274566) [admin]
10:41 <dcaro> Upgrade of codfw ceph to octopus 15.2.20, mgrs upgraded, osds next (T274566) [admin]
10:37 <dcaro> Upgrade of codfw ceph to octopus 15.2.20, mons upgraded, mgrs next (T274566) [admin]
10:15 <dcaro> starting the upgrade of codfw ceph to octopus 15.2.20 (T274566) [admin]
10:07 <dcaro> Merged the ceph 15 (Octopus) repo deployment to codfw, only the repo, not the packages (T274566) [admin]
2021-04-13 §
16:42 <dcaro> Ceph balancer got the cluster to eval 0.014916, that is 88-77% usage for compute pool, and 28-19% usage for the cinder one \o/ (T274573) [admin]
15:08 <dcaro> Activating continuous upmap balancer, keeping a close eye (T274573) [admin]
15:03 <dcaro> Executing a second pass, there are still movements to improve the eval of 0.030075 (T274573) [admin]
15:02 <dcaro> First pass finished, improved eval to 0.030075 (T274573) [admin]