2022-08-16 §
22:39 <andrewbogott> replacing the now-rebuilt cloudvirt1025 in 'ceph' aggregate and removing it from the 'maintenance' aggregate [admin]
17:41 <andrewbogott> removing cloudvirt1025 from the 'ceph' aggregate and adding it to the 'maintenance' aggregate [admin]
17:40 <andrewbogott> reimaging cloudvirt1025 after I accidentally deleted the hw raid [admin]
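The aggregate moves above map onto the standard OpenStack client; a plausible sequence, with the aggregate and host names taken from the log but the exact invocation not verified against the operator's shell:

```
# Take the hypervisor out of the scheduling pool during the rebuild
openstack aggregate remove host ceph cloudvirt1025
openstack aggregate add host maintenance cloudvirt1025

# ...and reverse the move once the reimage is done
openstack aggregate remove host maintenance cloudvirt1025
openstack aggregate add host ceph cloudvirt1025
```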
17:38 <andrewbogott> root@cloudcontrol1005:~# cinder-manage volume update_host --currenthost cloudcontrol1003@rbd#RBD --newhost cloudcontrol1005@rbd#RBD [admin]
17:37 <andrewbogott> root@cloudcontrol1005:~# cinder-manage volume update_host --currenthost cloudcontrol1004@rbd#RBD --newhost cloudcontrol1006@rbd#RBD [admin]
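For context, `cinder-manage volume update_host` rewrites the stored `host` field on existing volumes so they follow a backend that has moved to a new control node. A minimal sketch of checking the registered hosts around such a change, assuming the stock `cinder-manage` subcommands:

```
# List the volume hosts Cinder currently has registered
cinder-manage host list

# Rehome volumes from the retiring backend to its replacement
cinder-manage volume update_host \
    --currenthost cloudcontrol1003@rbd#RBD \
    --newhost cloudcontrol1005@rbd#RBD
```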
16:26 <wm-bot2> Ceph cluster at eqiad1 set out of maintenance. - cookbook ran by dcaro@vulcanus [admin]
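Taking a Ceph cluster out of maintenance is mostly a matter of clearing cluster-wide OSD flags; roughly equivalent to the following, assuming the cookbook uses the usual noout/norebalance pair:

```
# Clear the flags that were holding back rebalancing during maintenance
sudo ceph osd unset noout
sudo ceph osd unset norebalance

# Confirm the cluster settles back to HEALTH_OK
sudo ceph -s
```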
15:43 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd1001,cloudcephosd1002,cloudcephosd1003,cloudcephosd1004,cloudcephosd1005,cloudcephosd1006,cloudcephosd1007,cloudcephosd1008,cloudcephosd1009,cloudcephosd1010,cloudcephosd1011,cloudcephosd1012,cloudcephosd1013,cloudcephosd1014,cloudcephosd1015,cloudcephosd1016,cloudcephosd1017,cloudcephosd1018,cloudcephosd1019,cloudcephosd1020,cloudcephosd1021,cloudcephosd1022,cloudcephosd1023,c [admin]
15:42 <wm-bot2> Finished restarting all the OSD daemons from the nodes ['cloudcephosd2001-dev', 'cloudcephosd2002-dev', 'cloudcephosd2003-dev'] - cookbook ran by dcaro@vulcanus [admin]
15:38 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd2001-dev,cloudcephosd2002-dev,cloudcephosd2003-dev - cookbook ran by dcaro@vulcanus [admin]
13:08 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd2001-dev,cloudcephosd2002-dev,cloudcephosd2003-dev - cookbook ran by dcaro@vulcanus [admin]
13:07 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd2001-dev,cloudcephosd2002-dev,cloudcephosd2003-dev - cookbook ran by dcaro@vulcanus [admin]
13:02 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd2001-dev,cloudcephosd2002-dev,cloudcephosd2003-dev - cookbook ran by dcaro@vulcanus [admin]
13:01 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd2001-dev,cloudcephosd2002-dev,cloudcephosd2003-dev - cookbook ran by dcaro@vulcanus [admin]
12:59 <wm-bot2> Restarting the osd daemons from nodes cloudcephosd2001-dev,cloudcephosd2002-dev,cloudcephosd2003-dev - cookbook ran by dcaro@vulcanus [admin]
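The cookbook runs above amount to a rolling restart of the OSD units with a health check between hosts; a hand-rolled sketch of the same idea:

```
# On one OSD host at a time, restart every local OSD daemon
sudo systemctl restart ceph-osd.target

# Wait for all PGs to return to active+clean before the next host
sudo ceph -s
```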
2022-08-14 §
18:36 <taavi> deleted the http keystone endpoints from the keystone service catalog [admin]
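Pruning the plain-http entries from the Keystone catalog can be done with the OpenStack client; a sketch, where the endpoint ID is a placeholder:

```
# Find catalog entries still pointing at http:// URLs
openstack endpoint list | grep 'http://'

# Delete each stale entry by its ID
openstack endpoint delete <endpoint-id>
```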
2022-08-11 §
13:57 <andrewbogott> decommissioning cloudcontrol1003 + cloudcontrol1004. I backed up $home in case anyone needs their files. [admin]
08:42 <wm-bot2> The cluster is now rebalanced after adding the new OSDs ['cloudcephosd1025.eqiad.wmnet'] (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
08:42 <wm-bot2> Added 1 new OSDs ['cloudcephosd1025.eqiad.wmnet'] (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
08:42 <wm-bot2> Added OSD cloudcephosd1025.eqiad.wmnet... (1/1) (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
08:40 <wm-bot2> Finished rebooting node cloudcephosd1025.eqiad.wmnet (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
08:36 <wm-bot2> Rebooting node cloudcephosd1025.eqiad.wmnet (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
08:36 <wm-bot2> Adding OSD cloudcephosd1025.eqiad.wmnet... (1/1) (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
08:36 <wm-bot2> Adding new OSDs ['cloudcephosd1025.eqiad.wmnet'] to the cluster (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
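After a cookbook run like the one above, the new OSD should appear in the CRUSH tree and the cluster should backfill onto it; a quick way to verify (sketch):

```
# Check that the new host and its OSDs appear in the CRUSH tree
sudo ceph osd tree | grep -A 8 cloudcephosd1025

# Watch recovery/backfill until the cluster is rebalanced
sudo ceph -s
```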
2022-08-10 §
13:10 <wm-bot2> Finished rebooting node cloudcephosd1025.eqiad.wmnet (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
13:06 <wm-bot2> Rebooting node cloudcephosd1025.eqiad.wmnet (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
13:06 <wm-bot2> Adding OSD cloudcephosd1025.eqiad.wmnet... (1/1) (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
13:06 <wm-bot2> Adding new OSDs ['cloudcephosd1025.eqiad.wmnet'] to the cluster (T314870) - cookbook ran by fran@MacBook-Pro.station [admin]
2022-08-04 §
17:16 <taavi> deleted all scheduler_fanout_ rabbit queues in an attempt to fix scheduling [admin]
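Clearing the stale fanout queues can be scripted with `rabbitmqctl`; a sketch, assuming a rabbitmqctl recent enough to ship `delete_queue`:

```
# List the scheduler fanout queues left behind by restarted services
sudo rabbitmqctl list_queues name | grep '^scheduler_fanout_'

# Delete each one
sudo rabbitmqctl list_queues name \
    | grep '^scheduler_fanout_' \
    | xargs -rn1 sudo rabbitmqctl delete_queue
```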
16:32 <taavi> restart neutron-l3-agent to pick up rabbit config changes [admin]
15:12 <andrewbogott> stopping rabbitmq on cloudcontrol1xxx [admin]
09:57 <taavi> stop wikitech_run_jobs.timer on labweb1001/1002, hosts pending decom [admin]
2022-08-03 §
20:55 <andrewbogott> root@tools-checker-04:~# systemctl restart uwsgi-toolschecker_cron.service [admin]
20:41 <andrewbogott> restarting neutron-l3-agent.service on cloudnet1003 and 1004. The agent was routing properly but had lost touch with rabbitmq [admin]
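An agent that has lost its rabbit connection typically shows up as dead in the agent list even while traffic still flows; a sketch of the check plus the restart used above:

```
# Agents that have stopped heartbeating over rabbit show as not alive
openstack network agent list --agent-type l3

# Restart the agent on the affected cloudnet hosts
sudo systemctl restart neutron-l3-agent.service
```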
2022-08-02 §
14:07 <andrewbogott> shutting down codfw1dev ceph cluster according to https://docs.mirantis.com/mcp/q4-18/mcp-operations-guide/scheduled-maintenance-power-outage/power-off-ceph-cluster.html [admin]
13:54 <andrewbogott> shutting down basically all of codfw1dev to support pdu maintenance -- all the ceph OSDs will lose power so best to have everything stopped. [admin]
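The linked Mirantis procedure boils down to freezing cluster state before cutting power: set the flags below, stop the OSDs, then the mon/mgr daemons, and undo everything in reverse on power-up.

```
# Freeze the cluster so nothing rebalances while hosts lose power
sudo ceph osd set noout
sudo ceph osd set norecover
sudo ceph osd set norebalance
sudo ceph osd set nobackfill
sudo ceph osd set nodown
sudo ceph osd set pause

# ...then stop ceph-osd.target on each OSD host, mons/mgrs last
```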
2022-07-27 §
19:32 <andrewbogott> switching the openstack.eqiad1.wikimedia.cloud endpoint from cloudcontrol1004 to 1006, https://gerrit.wikimedia.org/r/c/operations/dns/+/817878/2/templates/wikimediacloud.org#54 [admin]
16:33 <andrewbogott> here is a test message in the admin channel [admin]
2022-07-25 §
13:43 <andrewbogott> pooling cloudweb100[34] and depooling labweb100[12] for testing in prep for decomming labweb100[12] [admin]
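Pooling and depooling here goes through conftool; a plausible invocation, with the object names assumed from the log rather than taken from conftool-data:

```
# Pool the new cloudweb hosts
sudo confctl select 'name=cloudweb1003.wikimedia.org' set/pooled=yes
sudo confctl select 'name=cloudweb1004.wikimedia.org' set/pooled=yes

# Depool the labweb hosts ahead of decommissioning
sudo confctl select 'name=labweb1001.wikimedia.org' set/pooled=no
sudo confctl select 'name=labweb1002.wikimedia.org' set/pooled=no
```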
2022-07-22 §
16:41 <taavi> depool cloudweb1003/1004 since horizon seems to be having issues [admin]
16:22 <taavi> pooling cloudweb1003/1004 now that grant issues are sorted [admin]
2022-07-21 §
18:26 <andrewbogott> depooling cloudweb1003 and 1004 for wikitech, horizon, striker -- pending db grant changes [admin]
18:06 <andrewbogott> pooling cloudweb1003 and 1004 for wikitech, horizon, striker [admin]
2022-07-20 §
18:02 <dcaro> things seem stable, trying to bring up the last rabbit node, cloudcontrol1007 (T313400) [admin]
17:45 <bd808> `sudo service striker restart` on labweb1002 [admin]
17:43 <bd808> `sudo service striker restart` on labweb1001 [admin]
17:10 <dcaro> things seem stable, trying to bring up a fourth rabbit node, cloudcontrol1006 (T313400) [admin]
16:26 <dcaro> things seem stable, trying to bring up a third rabbit node, cloudcontrol1005 (T313400) [admin]
15:51 <dcaro> things seem stable now with one rabbit node, trying to bring up a second (T313400) [admin]
14:16 <dcaro> stopping rabbitmq on cloudcontrol1004, leaving only 1003 alive (T313400) [admin]
13:17 <dcaro> restarting the whole rabbit cluster (T313400) [admin]
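The sequence above (shrink the cluster to one node, then grow it back one member at a time) is a common way to reset a wedged rabbit cluster; in shell terms, roughly:

```
# Stop rabbit everywhere except the node being kept (cloudcontrol1003)
sudo systemctl stop rabbitmq-server

# Then, per node and in order, start it and verify before the next one
sudo systemctl start rabbitmq-server
sudo rabbitmqctl cluster_status
```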