2024-06-24 §
20:09 <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1) [metricsinfra]
19:56 <andrew@cloudcumin1001> START - Cookbook wmcs.openstack.migrate_project_to_ovs [metricsinfra]
2024-03-13 §
12:14 <taavi> MariaDB [prometheusconfig]> delete from alerts where name = 'GridQueueProblem'; # T314664 [metricsinfra]
2023-11-30 §
18:53 <taavi> no longer send quarry alerts to cloud services team [metricsinfra]
2023-11-18 §
14:09 <taavi> reboot metricsinfra-alertmanager-1 to see if it stops flapping a puppet alert [metricsinfra]
2023-09-29 §
08:24 <wm-bot2> dcaro@urcuchillay END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0) [metricsinfra]
08:17 <wm-bot2> dcaro@urcuchillay START - Cookbook wmcs.openstack.cloudvirt.vm_console [metricsinfra]
08:17 <wm-bot2> dcaro@urcuchillay END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0) [metricsinfra]
08:16 <wm-bot2> dcaro@urcuchillay START - Cookbook wmcs.openstack.cloudvirt.vm_console [metricsinfra]
2023-05-10 §
17:17 <wm-bot2> Increased quotas by 8 cores, 16384 ram (T336423) - cookbook ran by taavi@runko [metricsinfra]
2023-05-04 §
15:11 <dcaro> rebooting metricsinfra-prometheus-2 as it was unresponsive [metricsinfra]
2023-04-24 §
14:16 <dcaro> rebooting metricsinfra-prometheus-2, it's in a non-responsive state (no ssh, console hangs) [metricsinfra]
2023-04-21 §
21:58 <andrewbogott> added raymond-ndibe as project member [metricsinfra]
2023-03-07 §
16:31 <wm-bot2> removed instance metricsinfra-controller-1 - cookbook ran by dcaro@vulcanus [metricsinfra]
2023-02-13 §
23:37 <bd808> metricsinfra-db-1.trove.eqiad1.wikimedia.cloud restarted via Horizon [metricsinfra]
23:35 <bd808> metricsinfra-db-1.trove.eqiad1.wikimedia.cloud not responsive to ssh [metricsinfra]
23:32 <bd808> grafana.wmcloud.org offline with db connection error. Investigating. [metricsinfra]
2022-12-20 §
15:59 <dcaro> rebooting prometheus-2 due to being non-responsive [metricsinfra]
2022-06-16 §
14:18 <taavi> add 'gitlab-runners' project to list of scraped projects [metricsinfra]
2022-03-01 §
11:38 <dcaro> Reloading alertmanager to refresh new config (T302702) [metricsinfra]
11:37 <dcaro> Adding runbook url annotation to GridQueueProblem alert on DB at metricsinfra-controller-1 (T302702) [metricsinfra]
2022-01-22 §
11:32 <taavi> added project-proxy VMs to prometheus targets [metricsinfra]
2021-12-14 §
09:27 <majavah> drop "analytics" project from current beta coverage, the setup is currently not compatible with pontoon [metricsinfra]
2021-09-11 §
08:41 <majavah> silence deployment-prep alerts yet again [metricsinfra]
2021-07-12 §
15:45 <bstorm> silenced deployment prep alerts for another 60 days [metricsinfra]
2021-06-15 §
16:12 <balloons> add 8 CPU/16G RAM to quota T284973 [metricsinfra]
2021-06-14 §
18:40 <balloons> Add majavah as projectadmin T284938 [metricsinfra]
2021-03-11 §
18:33 <bstorm> silenced alerts from deployment-prep for another 60 days [metricsinfra]
2021-01-04 §
15:50 <bstorm> silencing all alerts from deployment-prep for 60 more days [metricsinfra]
2020-09-29 §
16:53 <bstorm> silence all the deployment-prep alerts for another 30 days [metricsinfra]