admin SAL

2020-08-21 §
21:34	<andrewbogott>	restarting nova-compute on cloudvirt1033; it seems stuck	[admin]
2020-08-19 §
14:21	<andrewbogott>	rebooting cloudweb2001-dev, labweb1001, labweb1002 to address mediawiki-induced memleak	[admin]
2020-08-06 §
21:02	<andrewbogott>	removing cloudvirt1004/1006 from nova's list of hypervisors; rebuilding them to use as backup test hosts	[admin]
20:06	<bstorm>	manually stopped the RAID check on cloudcontrol1003 T259760	[admin]
2020-08-04 §
18:54	<bstorm>	restarting mariadb on cloudcontrol1004 to set up parallel replication	[admin]
2020-08-03 §
17:02	<bstorm>	increased db connection limit to 800 across galera cluster because we were clearly hovering at limit	[admin]
2020-07-31 §
19:28	<bd808>	wmcs-novastats-dnsleaks --delete (lots of leaked fullstack-monitoring records to clean up)	[admin]
2020-07-27 §
22:17	<andrewbogott>	ceph osd pool set compute pg_num 2048	[admin]
22:14	<andrewbogott>	ceph osd pool set compute pg_autoscale_mode off	[admin]
2020-07-24 §
19:15	<andrewbogott>	ceph mgr module enable pg_autoscaler	[admin]
19:15	<andrewbogott>	ceph osd pool set compute pg_autoscale_mode on	[admin]
2020-07-22 §
08:55	<jbond42>	[codfw1dev] upgrading hiera to version 5	[admin]
08:48	<arturo>	[codfw1dev] add jbond as user in the bastion-codfw1dev and cloudinfra-codfw1dev projects	[admin]
08:45	<arturo>	[codfw1dev] enabled account creation in labtestwiki briefly for jbond42 to create an account	[admin]
2020-07-16 §
10:48	<arturo>	merging change to neutron dmz_cidr https://gerrit.wikimedia.org/r/c/operations/puppet/+/613123 (T257534)	[admin]
2020-07-15 §
23:15	<bd808>	Removed Merlijn van Deen from toollabs-trusted Gerrit group (T255697)	[admin]
11:48	<arturo>	[codfw1dev] created DNS records (A and PTR) for bastion.bastioninfra-codfw1dev.codfw1dev.wmcloud.org <-> 185.15.57.2	[admin]
11:41	<arturo>	[codfw1dev] add myself as projectadmin to the `bastioninfra-codfw1dev` project	[admin]
11:39	<arturo>	[codfw1dev] created DNS zone `bastioninfra-codfw1dev.codfw1dev.wmcloud.org.` in the cloudinfra-codfw1dev project and then transferred ownership to the bastioninfra-codfw1dev project	[admin]
2020-07-14 §
15:19	<arturo>	briefly set root@cloudnet1003:~ # sysctl net.ipv4.conf.all.accept_local=1 (in neutron qrouter netns) (T257534)	[admin]
10:43	<arturo>	icinga downtime cloudnet* hosts for 30 mins to introduce new check https://gerrit.wikimedia.org/r/c/operations/puppet/+/612390 (T257552)	[admin]
04:01	<andrewbogott>	added a wildcard *.wmflabs.org domain pointing at the domain proxy in project-proxy	[admin]
04:00	<andrewbogott>	shortened the ttl on .wmflabs.org. to 300	[admin]
2020-07-13 §
16:17	<arturo>	icinga downtime cloudcontrol[1003-1005].wikimedia.org for 1h for galera database movements	[admin]
2020-07-12 §
17:39	<andrewbogott>	switched eqiad1 keystone from m5 to cloudcontrol galera	[admin]
2020-07-10 §
20:26	<andrewbogott>	disabling nova api to move database to galera	[admin]
2020-07-09 §
11:23	<arturo>	[codfw1dev] rebooting cloudnet2003-dev again for testing sysctl/puppet behavior (T257552)	[admin]
11:11	<arturo>	[codfw1dev] rebooting cloudnet2003-dev for testing sysctl/puppet behavior (T257552)	[admin]
09:16	<arturo>	manually increasing sysctl value of net.nf_conntrack_max in cloudnet servers (T257552)	[admin]
2020-07-06 §
15:16	<arturo>	installing 'aptitude' in all cloudvirts	[admin]
2020-07-03 §
12:51	<arturo>	[codfw1dev] galera cluster should be up and running, openstack happy (T256283)	[admin]
11:44	<arturo>	[codfw1dev] restoring glance database backup from bacula into cloudcontrol2001-dev (T256283)	[admin]
11:39	<arturo>	[codfw1dev] stopped mysql database in the galera cluster T256283	[admin]
11:36	<arturo>	[codfw1dev] dropped glance database in the galera cluster T256283	[admin]
2020-07-02 §
15:41	<arturo>	`sudo wmcs-openstack --os-compute-api-version 2.55 flavor create --private --vcpus 8 --disk 300 --ram 16384 --property aggregate_instance_extra_specs:ceph=true --description "for packaging envoy" bigdisk-ceph` (T256983)	[admin]
2020-06-29 §
14:24	<arturo>	starting rabbitmq-server in all 3 cloudcontrol servers	[admin]
14:23	<arturo>	stopping rabbitmq-server in all 3 cloudcontrol servers	[admin]
2020-06-18 §
20:38	<andrewbogott>	rebooting cloudservices2003-dev due to a mysterious 'host down' alert on a secondary ip	[admin]
2020-06-16 §
15:38	<arturo>	created by hand neutron port 9c0a9a13-e409-49de-9ba3-bc8ec4801dbf `paws-haproxy-vip` (T195217)	[admin]
2020-06-12 §
13:23	<arturo>	DNS zone `paws.wmcloud.org` transferred to the PAWS project (T195217)	[admin]
13:20	<arturo>	created DNS zone `paws.wmcloud.org` (T195217)	[admin]
2020-06-11 §
19:19	<bstorm_>	proceeding with failback to labstore1004 now that DRBD devices are consistent T224582	[admin]
17:22	<bstorm_>	delaying failback labstore1004 for drive syncs T224582	[admin]
17:17	<bstorm_>	failing NFS back to labstore1004 to complete the upgrade process T224582	[admin]
16:15	<bstorm_>	failing over NFS for labstore1004 to labstore1005 T224582	[admin]
2020-06-10 §
16:09	<andrewbogott>	deleting all old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate database T254496	[admin]
2020-06-09 §
15:25	<arturo>	icinga downtime everything cloud* lab* for 2h more (T253780)	[admin]
14:09	<andrewbogott>	stopping puppet, all designate services and all pdns services on cloudservices1004 for T253780	[admin]
14:01	<arturo>	icinga downtime everything cloud* lab* for 2h (T253780)	[admin]
2020-06-05 §
15:08	<andrewbogott>	trying to re-enable puppet without losing cumin contact, as per https://phabricator.wikimedia.org/T254589	[admin]