2024-01-09 §
23:37 <andrewbogott> restarting harbor-db in an attempt to reform harbor -- T354714 [tools]
23:30 <andrewbogott> rebooting tools-harbor-1 in a feeble attempt to get it to work (docker-compose can't restart it) [tools]
23:12 <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=99) for component builds-builder [tools]
23:12 <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder [tools]
23:11 <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=99) for component builds.builder [tools]
23:11 <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds.builder [tools]
17:31 <wm-bot2> dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder [tools]
17:30 <wm-bot2> dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder [tools]
10:13 <taavi> reboot tools-sgeexec-10-17 due to high load [tools]
2024-01-08 §
12:26 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.remove_grid_node for tools-sgeweblight-10-27, tools-sgeweblight-10-28 [tools]
10:51 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component jobs-api [tools]
10:51 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component jobs-api [tools]
10:17 <taavi> reboot tools-sgeexec-10-21 [tools]
2024-01-05 §
14:55 <wm-bot2> dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder [tools]
14:55 <wm-bot2> dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder [tools]
11:56 <wm-bot2> dcaro@urcuchillay END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-builder [tools]
11:55 <wm-bot2> dcaro@urcuchillay START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-builder [tools]
10:29 <fnegri@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.grid.cleanup_queue_errors (exit_code=0) [tools]
10:29 <fnegri@cloudcumin1001> START - Cookbook wmcs.toolforge.grid.cleanup_queue_errors [tools]
2024-01-04 §
10:11 <dcaro> deploy toolforge-envvars-cli 0.0.3 [tools]
2024-01-03 §
21:22 <andrewbogott> truncating 200 logfiles to 5M on tools nfs [tools]
21:17 <andrewbogott> deleting many stray core dumps throughout nfs storage [tools]
2024-01-02 §
11:06 <dcaro> restart toolsdb database to flush connections (T354176) [tools]
10:42 <dcaro> flushed the redis db on tools-harbor-1 (T354176) [tools]
10:37 <dcaro> hard reboot tools-harbor-1 [tools]
10:13 <dhinus> hard reboot tools-harbor-1 [tools]
2024-01-01 §
15:54 <andrewbogott> rebooting tools-harbor-1, T354151 [tools]
2023-12-30 §
12:43 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.grid.cleanup_queue_errors (exit_code=0) [tools]
12:43 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.grid.cleanup_queue_errors [tools]
2023-12-29 §
21:39 <andrewbogott> rebooting tools-sgeweblight-10-28.tools.eqiad1.wikimedia.cloud because previous reset didn't get the queue out of error state [tools]
19:31 <andrewbogott> restarting sge_execd on tools-sgeweblight-10-28.tools.eqiad1.wikimedia.cloud in response to error state alert [tools]
2023-12-28 §
21:03 <andrewbogott> "docker-compose restart" on tools-harbor-1 [tools]
19:18 <andrewbogott> rebooting tools-harbor-1.tools.eqiad1.wikimedia.cloud, unresponsive [tools]
2023-12-23 §
18:24 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api [tools]
18:24 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api [tools]
2023-12-21 §
15:48 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.remove_grid_node for tools-sgeexec-10-16 [tools]
2023-12-20 §
11:22 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.remove_grid_node for tools-sgeexec-10-14, tools-sgeexec-10-15, tools-sgeweblight-10-18, tools-sgeweblight-10-24 [tools]
10:01 <taavi> rebooting tools-sgeweblight-10-18, -24, -25, to get rid of a large number of jobs in deleting status [tools]
2023-12-19 §
15:39 <dhinus> restarting toolsdb to apply a config change T353093 [tools]
13:18 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component builds-api [tools]
13:17 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component builds-api [tools]
2023-12-18 §
16:15 <taavi> reboot tools-sgeexec-10-15, -23 due to stuck NFS processes [tools]
14:43 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers [tools]
14:42 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers [tools]
14:40 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.component.deploy (exit_code=0) for component maintain-kubeusers [tools]
14:39 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.component.deploy for component maintain-kubeusers [tools]
2023-12-16 §
22:01 <taavi@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.grid.cleanup_queue_errors (exit_code=0) [tools]
22:01 <taavi@cloudcumin1001> START - Cookbook wmcs.toolforge.grid.cleanup_queue_errors [tools]
20:54 <bd808> Rebuilding all containers to pick up lighttpd config fix and normal package updates (T293552) [tools]
08:14 <dhinus> restarting toolsdb with jemalloc [tools]