2021-04-28
10:57 <dcaro> A PG got stuck in 'remapping' after the OSD came up; had to unset norebalance and then set it again to get it unstuck (T280641) [admin]
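
A minimal sketch of the unstick procedure described above, run from a mon/admin node (PG output abbreviated):

    # list PGs that are not active+clean, to spot the one stuck remapping
    ceph pg dump_stuck unclean
    # drop the flag so the remap can proceed, then restore it
    ceph osd unset norebalance
    ceph status                  # wait until the PG settles
    ceph osd set norebalance
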
10:34 <dcaro> Slow/blocked ops from cloudcephmon03, "osd_failure(failed timeout osd.32..." (cloudcephosd1005); unset the cluster noout/norebalance and they went away in a few secs, setting them again and continuing... (T280641) [admin]
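
Roughly, the same flag toggle applied to both maintenance flags, after confirming the warning:

    # the slow/blocked ops and the osd_failure reports show up here
    ceph health detail
    # clear both maintenance flags and let the failure report resolve...
    ceph osd unset noout
    ceph osd unset norebalance
    # ...then re-set them to continue the upgrade
    ceph osd set noout
    ceph osd set norebalance
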
09:03 <dcaro> Waiting for slow heartbeats from osd.58 (cloudcephosd1002) to recover... (T280641) [admin]
08:59 <dcaro> During the upgrade, started getting the warning 'slow OSD heartbeats on back', meaning that pings between OSDs are really slow (up to 190s), all from osd.58, currently on cloudcephosd1002 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting the warning 'slow OSD heartbeats on back', meaning that pings between OSDs are really slow (up to 190s), all from osd.58 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting the warning 'slow OSD heartbeats on back', meaning that pings between OSDs are really slow (up to 190s) (T280641) [admin]
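
This is Ceph's OSD_SLOW_PING_TIME_BACK health warning (heartbeats on the back/cluster network). One way to inspect the ping times, assuming a recent Octopus release (osd id taken from the entries above):

    # the health check lists the worst osd-to-osd ping times
    ceph health detail
    # on the host carrying the OSD (cloudcephosd1002): per-peer ping times over threshold
    ceph daemon osd.58 dump_osd_network
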
08:21 <dcaro> Upgrading all the ceph OSDs on eqiad (T280641) [admin]
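
A sketch of the per-host OSD upgrade cycle this implies; the exact package set and flow here are assumptions, not the commands actually run:

    # keep data in place while daemons restart
    ceph osd set noout
    ceph osd set norebalance
    # on each OSD host, one at a time:
    apt-get install -y ceph-osd ceph-common    # pulls in 15.2.11
    systemctl restart ceph-osd.target
    ceph status                                # wait for all PGs active+clean
    # once every host is done:
    ceph osd unset norebalance
    ceph osd unset noout
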
08:21 <dcaro> The clock skew seems intermittent; there's another task to follow it, T275860 (T280641) [admin]
08:18 <dcaro> All eqiad ceph mons and mgrs upgraded (T280641) [admin]
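
A quick way to confirm which daemons are running the new version:

    # groups mon/mgr/osd/client daemons by running version
    ceph versions
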
08:18 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002 and cloudcephmon1001; they are back in sync (T280641) [admin]
08:15 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002; it went away, I'm guessing systemd-timesyncd fixed it (T280641) [admin]
08:14 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, looking into it (T280641) [admin]
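
For the skew itself, the mons expose their time-sync view, and systemd-timesyncd can be checked directly on the affected host; a minimal sketch:

    # mon-side view of clock skew between monitors
    ceph time-sync-status
    # on cloudcephmon1002: is timesyncd running and synchronized?
    timedatectl status
    systemctl status systemd-timesyncd
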
07:58 <dcaro> Upgrading ceph services on eqiad, starting with the mons/managers (T280641) [admin]
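
The mon/mgr leg of the upgrade follows the usual "one daemon at a time, wait for quorum" pattern; a sketch assuming Debian packages:

    # on each mon/mgr host in turn:
    apt-get install -y ceph-mon ceph-mgr ceph-common   # 15.2.11
    systemctl restart ceph-mon@$(hostname -s)
    ceph mon stat                                      # wait for the mon to rejoin quorum
    systemctl restart ceph-mgr@$(hostname -s)
    ceph -s                                            # check the mgr is active again
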
2021-04-27
14:10 <dcaro> codfw.openstack upgraded ceph libraries to 15.2.11 (T280641) [admin]
13:07 <dcaro> codfw.openstack cloudvirt2002-dev done, taking cloudvirt2003-dev out to upgrade ceph libraries (T280641) [admin]
13:00 <dcaro> codfw.openstack cloudvirt2001-dev back online, taking cloudvirt2002-dev out to upgrade ceph libraries (T280641) [admin]
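
To check what a given cloudvirt is actually running before/after the swap (Debian package names):

    # candidate vs installed versions of the ceph client libraries
    apt-cache policy librbd1 librados2 python3-rados
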
10:51 <dcaro> ceph.eqiad: cinder pool got its pg_num increased to 1024, re-shuffle started (T273783) [admin]
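
The pg_num bump is a single pool setting; a sketch, using the literal pool name 'cinder' from the entry above (the real pool name may differ):

    # raise the placement-group count; ceph moves data onto the new PGs (the "re-shuffle")
    ceph osd pool set cinder pg_num 1024
    ceph osd pool get cinder pg_num
    # watch the backfill triggered by the change
    ceph -s
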
10:48 <dcaro> ceph.eqiad: Tweaked the target_size_ratio of all the pools, enabling the autoscaler (it will increase the cinder pool only) (T273783) [admin]
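
target_size_ratio feeds the PG autoscaler's sizing decision; roughly as below, where the pool names come from the log but the ratio values are illustrative, not the ones actually used:

    # tell the autoscaler what fraction of the cluster each pool is expected to use
    ceph osd pool set compute target_size_ratio 0.8
    ceph osd pool set cinder target_size_ratio 0.2
    # let it act on its recommendation for a pool
    ceph osd pool set cinder pg_autoscale_mode on
    # review what it wants to do
    ceph osd pool autoscale-status
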
09:14 <dcaro> manually force stopping the server puppetmaster-01 to unblock the migration (in codfw1) [admin]
09:14 <dcaro> manually force stopping the server puppetmaster-01 to unblock the migration [admin]
08:59 <dcaro> manually force stopping the server exploding-head on codfw, to try a cold migration [admin]
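
When an instance refuses to stop through the API, the force-stop typically ends up being done against the hypervisor; a hedged sketch of one possible flow (the instance name is from the entry above, the libvirt domain placeholder and the exact sequence are assumptions):

    # normal API path first
    openstack server stop exploding-head
    # if the stop hangs, kill the domain on its cloudvirt (libvirt domain id != server name)
    virsh destroy <libvirt-domain>
    # retry the cold migration, then confirm it once the server reaches VERIFY_RESIZE
    openstack server migrate exploding-head
    openstack server resize --confirm exploding-head   # subcommand form varies by client version
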
08:47 <dcaro> restarting nova-compute on cloudvirt2001-dev after upgrading ceph libraries to 15.2.11 [admin]
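
The per-hypervisor cycle behind these library upgrades, as a sketch (the exact package set is an assumption):

    # with the cloudvirt drained / taken out of rotation:
    apt-get install -y librbd1 librados2 python3-rbd python3-rados   # 15.2.11
    # nova-compute only picks up the new librbd on restart; running qemu
    # processes keep the old library mapped until they are restarted or migrated
    systemctl restart nova-compute
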
2021-04-13
16:42 <dcaro> Ceph balancer got the cluster to eval 0.014916, that is 88-77% usage for the compute pool and 28-19% for the cinder one \o/ (T274573) [admin]
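
The numbers quoted come straight from the balancer score and the per-OSD usage report:

    # balancer score: lower is better, 0 is perfectly balanced
    ceph balancer eval
    # per-OSD %USE; the per-pool spreads are read off the OSDs backing each pool
    ceph osd df
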
15:08 <dcaro> Activating continuous upmap balancer, keeping a close eye on it (T274573) [admin]
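
Turning the balancer from one-shot plans to continuous background mode is just:

    ceph balancer on          # optimize automatically in the background
    ceph balancer status      # shows mode, active flag and last optimization
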
15:03 <dcaro> Executing a second pass; there are still movements that improve on the eval of 0.030075 (T274573) [admin]
15:02 <dcaro> First pass finished, improved eval to 0.030075 (T274573) [admin]
14:49 <dcaro> Running the first_pass balancing plan on ceph eqiad, current eval 0.030622 (T274573) [admin]
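
The one-shot plan workflow behind these entries, using the plan name first_pass from the log:

    ceph balancer eval                    # current score (0.030622 above)
    ceph balancer optimize first_pass     # compute a plan of upmap changes
    ceph balancer show first_pass         # inspect the proposed changes
    ceph balancer eval first_pass         # score the cluster as if the plan ran
    ceph balancer execute first_pass      # apply it
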
14:43 <dcaro> enabling ceph upmap pg balancer on eqiad (T274573) [admin]
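
Upmap mode requires that every client speaks Luminous or newer; a minimal sketch of enabling it (whether the min-compat setting had to be changed here is not stated in the log):

    # refuse older clients, which is what makes pg-upmap entries safe to use
    ceph osd set-require-min-compat-client luminous
    ceph balancer mode upmap
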
14:36 <andrewbogott> upgrading codfw1dev to version Victoria, T261137 [admin]
13:11 <andrewbogott> upgrading eqiad1 designate to version Victoria, T261137 [admin]
10:43 <dcaro> enabled ceph upmap balancer on codfw (T274573) [admin]