2021-04-28
21:11 <andrewbogott> cleaning up more references to deleted hypervisors with delete from services where topic='compute' and version != 53; [admin]
20:48 <andrewbogott> cleaning up references to deleted hypervisors with mysql:root@localhost [nova_eqiad1]> delete from compute_nodes where hypervisor_version != '5002000'; [admin]
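(A hedged aside: destructive cleanups like the two above can be previewed read-only with the same WHERE clauses; the connection method shown here is an assumption.)
  # count what the deletes above would remove, without changing anything
  sudo mysql nova_eqiad1 -e 'SELECT COUNT(*) FROM services WHERE topic="compute" AND version != 53;'
  sudo mysql nova_eqiad1 -e 'SELECT COUNT(*) FROM compute_nodes WHERE hypervisor_version != "5002000";'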
19:40 <andrewbogott> putting cloudvirt1040 into the maintenance aggregate pending more info about T281399 [admin]
18:11 <andrewbogott> adding cloudvirt1040, 1041 and 1042 to the 'ceph' host aggregate -- T275081 [admin]
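(Sketch of the aggregate changes in the two entries above using the standard OpenStack CLI; the aggregate names 'ceph' and 'maintenance' come from the entries, the exact invocation is an assumption.)
  # add the new hypervisors to the 'ceph' host aggregate (T275081)
  openstack aggregate add host ceph cloudvirt1040
  openstack aggregate add host ceph cloudvirt1041
  openstack aggregate add host ceph cloudvirt1042
  # later, park cloudvirt1040 in the maintenance aggregate pending T281399
  openstack aggregate remove host ceph cloudvirt1040
  openstack aggregate add host maintenance cloudvirt1040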
11:06 <dcaro> All ceph server-side daemons upgraded to Octopus! \o/ (T280641) [admin]
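(Sketch: after a full upgrade like this, the running versions can be confirmed from any mon/admin node.)
  # lists the release every mon, mgr and osd daemon is actually running
  sudo ceph versions
  sudo ceph health detail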
10:57 <dcaro> Got a PG stuck in 'remapping' after the OSD came up, had to unset norebalance and then set it again to get it unstuck (T280641) [admin]
10:34 <dcaro> Slow/blocked ops from cloudcephmon03, "osd_failure(failed timeout osd.32..." (cloudcephosd1005); unset the cluster noout/norebalance and the warning went away in a few secs, setting them again and continuing... (T280641) [admin]
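(The flags mentioned above map to the standard cluster-wide flags kept set during a rolling upgrade; a sketch of the unset/re-set dance.)
  # briefly clear the flags so the stuck ops/PGs can resolve
  sudo ceph osd unset norebalance
  sudo ceph osd unset noout
  # once things clear (a few seconds here), put them back and continue the upgrade
  sudo ceph osd set noout
  sudo ceph osd set norebalance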
09:03 <dcaro> Waiting for slow heartbeats from osd.58 (cloudcephosd1002) to recover... (T280641) [admin]
08:59 <dcaro> During the upgrade, started getting the warning 'slow osd heartbeats on back', meaning that pings between osds over the back network are really slow (up to 190s), all from osd.58, currently on cloudcephosd1002 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting the warning 'slow osd heartbeats on back', meaning that pings between osds over the back network are really slow (up to 190s), all from osd.58 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting the warning 'slow osd heartbeats on back', meaning that pings between osds over the back network are really slow (up to 190s) (T280641) [admin]
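(The warning in the three entries above is Ceph's OSD_SLOW_PING_TIME_BACK health check; a sketch of how to inspect it, assuming access to the mons and to cloudcephosd1002.)
  # full health detail names the OSD pairs with slow back-network pings
  sudo ceph health detail
  # on cloudcephosd1002: per-peer ping times as seen by osd.58 (admin socket, Nautilus+)
  sudo ceph daemon osd.58 dump_osd_network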
08:21 <dcaro> Upgrading all the ceph osds on eqiad (T280641) [admin]
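(Minimal sketch of the per-host OSD step, assuming a package-based install; the actual procedure used here may differ, so treat this as illustrative only.)
  # on each cloudcephosd host, one host at a time
  sudo apt-get update && sudo apt-get install -y ceph-osd ceph-common
  sudo systemctl restart ceph-osd.target
  # wait for HEALTH_OK (modulo the noout/norebalance flags) before the next host
  sudo ceph -s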
08:21 <dcaro> The clock skew seems intermittent; there's another task to follow it, T275860 (T280641) [admin]
08:18 <dcaro> All eqiad ceph mons and mgrs upgraded (T280641) [admin]
08:18 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002 and cloudcephmon1001; they are back in sync (T280641) [admin]
08:15 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002; it went away, I'm guessing systemd-timesyncd fixed it (T280641) [admin]
08:14 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, looking (T280641) [admin]
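(Clock skew between mons is the MON_CLOCK_SKEW health check; a sketch of the usual way to look at it.)
  # skew of each mon relative to the lead mon
  sudo ceph time-sync-status
  # on the affected mon host, check that systemd-timesyncd is actually syncing
  timedatectl status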
07:58 <dcaro> Upgrading ceph services on eqiad, starting with mons/managers (T280641) [admin]
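(Sketch of the mon/mgr phase, again assuming a package-based install and restarting one mon at a time to keep quorum.)
  # on each cloudcephmon host, one at a time
  sudo apt-get update && sudo apt-get install -y ceph-mon ceph-mgr ceph-common
  sudo systemctl restart ceph-mon.target ceph-mgr.target
  # confirm the mon rejoined quorum before moving to the next one
  sudo ceph quorum_status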
2021-04-27
14:10 <dcaro> codfw.openstack upgraded ceph libraries to 15.2.11 (T280641) [admin]
13:07 <dcaro> codfw.openstack cloudvirt2002-dev done, taking cloudvirt2003-dev out to upgrade ceph libraries (T280641) [admin]
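(Sketch: 'taking a cloudvirt out' here presumably means disabling it for scheduling while its ceph client libraries are upgraded; the exact drain procedure is an assumption.)
  # stop new VMs from landing on the host during the upgrade
  openstack compute service set --disable --disable-reason 'ceph 15.2.11 upgrade T280641' cloudvirt2003-dev nova-compute
  # ...upgrade the libraries, then re-enable it
  openstack compute service set --enable cloudvirt2003-dev nova-compute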
13:00 <dcaro> codfw.openstack cloudvirt2001-dev back online, taking cloudvirt2002-dev out to upgrade ceph libraries (T280641) [admin]
10:51 <dcaro> ceph.eqiad: cinder pool got its pg_num increased to 1024, re-shuffle started (T273783) [admin]
10:48 <dcaro> ceph.eqiad: Tweaked the target_size_ratio of all the pools, enabling the autoscaler (it will increase the cinder pool only) (T273783) [admin]
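(Sketch of the autoscaler tweak above; the ratio value is a placeholder and the literal pool name is assumed to be 'cinder'.)
  # hint to the autoscaler how much of the cluster this pool is expected to use
  sudo ceph osd pool set cinder target_size_ratio 0.3
  sudo ceph osd pool set cinder pg_autoscale_mode on
  # review the autoscaler's decisions (here: cinder pg_num -> 1024)
  sudo ceph osd pool autoscale-status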
09:14 <dcaro> manually force-stopping the server puppetmaster-01 to unblock migration (in codfw1) [admin]
09:14 <dcaro> manually force-stopping the server puppetmaster-01 to unblock migration [admin]
08:59 <dcaro> manually force-stopping the server exploding-head on codfw, to try cold migration [admin]
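(Sketch of the force-stop-then-migrate sequence from the entries above; the operator may equally have stopped the domain via libvirt on the hypervisor, so this is just one plausible CLI path.)
  # hard-stop the stuck instance so cold migration can proceed
  openstack server stop exploding-head
  openstack server show exploding-head -c status
  # once SHUTOFF, cold-migrate it off the hypervisor
  openstack server migrate exploding-head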
08:47 <dcaro> restarting nova-compute on cloudvirt2001-dev after upgrading ceph libraries to 15.2.11 [admin]
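(Sketch of the library bump plus restart on a cloudvirt; the package list is the usual set of Ceph client libraries and is an assumption for this host.)
  # pull in the 15.2.11 client libraries used by libvirt/qemu and nova
  sudo apt-get update && sudo apt-get install -y librados2 librbd1 python3-rbd ceph-common
  # restart nova-compute so it picks up the new librbd/librados
  sudo systemctl restart nova-compute
  dpkg -l librbd1 librados2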