2022-06-03 §
12:46 <taavi> start webservicemonitor on tools-sgecron-01 T309821 [tools]
10:36 <taavi> draining each sgeweblight node one by one, and removing the jobs stuck in 'deleting' too [tools]
05:05 <taavi> removing duplicate (there should be only one per tool) web service jobs from the grid T309821 [tools]
04:52 <taavi> revert bd808's changes to profile::toolforge::active_proxy_host [tools]
03:21 <bd808> Cleared queue error states after deploying new toolforge-webservice package (T309821) [tools]
03:10 <bd808> publish tools-webservice 0.85 with hack for T309821 [tools]
2022-06-02 §
22:26 <bd808> Rebooting tools-sgeweblight-10-1.tools.eqiad1.wikimedia.cloud. Node is full of jobs that are not tracked by grid master and failing to spawn new jobs sent by the scheduler [tools]
21:56 <bd808> Removed legacy "active_proxy_host" hiera setting [tools]
21:55 <bd808> Updated hiera to use fqdn of 'tools-proxy-06.tools.eqiad1.wikimedia.cloud' for profile::toolforge::active_proxy_host key [tools]
21:41 <bd808> Updated hiera to use fqdn of 'tools-proxy-06.tools.eqiad1.wikimedia.cloud' for active_redis key [tools]
21:22 <wm-bot2> created node tools-sgeweblight-10-8.tools.eqiad1.wikimedia.cloud and added it to the grid - cookbook ran by taavi@runko [tools]
12:42 <wm-bot2> rebooting stretch exec grid workers - cookbook ran by taavi@runko [tools]
12:13 <wm-bot2> created node tools-sgeweblight-10-7.tools.eqiad1.wikimedia.cloud and added it to the grid - cookbook ran by taavi@runko [tools]
12:03 <dcaro> refresh prometheus certs (T308402) [tools]
11:47 <dcaro> refresh registry-admission-controller certs (T308402) [tools]
11:42 <dcaro> refresh ingress-admission-controller certs (T308402) [tools]
11:36 <dcaro> refresh volume-admission-controller certs (T308402) [tools]
11:24 <wm-bot2> created node tools-sgeweblight-10-6.tools.eqiad1.wikimedia.cloud and added it to the grid - cookbook ran by taavi@runko [tools]
11:17 <taavi> publish jobutils 1.44 that updates the grid default from stretch to buster T277653 [tools]
10:16 <taavi> publish tools-webservice 0.84 that updates the grid default from stretch to buster T277653 [tools]
09:54 <wm-bot2> created node tools-sgeexec-10-14.tools.eqiad1.wikimedia.cloud and added it to the grid - cookbook ran by taavi@runko [tools]
2022-06-01 §
11:18 <taavi> depool and remove tools-sgeexec-09[07-14] [tools]
2022-05-31 §
16:51 <taavi> delete tools-sgeexec-0904 for T309525 experimentation [tools]
2022-05-30 §
08:24 <taavi> depool tools-sgeexec-[0901-0909] (7 nodes total) T277653 [tools]
2022-05-26 §
15:39 <wm-bot2> deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (e6fa299) (T309146) - cookbook ran by taavi@runko [tools]
2022-05-22 §
17:04 <taavi> failover tools-redis to the updated cluster T278541 [tools]
16:42 <wm-bot2> removing grid node tools-sgeexec-0940.tools.eqiad1.wikimedia.cloud (T308982) - cookbook ran by taavi@runko [tools]
2022-05-16 §
14:02 <wm-bot2> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (7037eca) - cookbook ran by taavi@runko [tools]
2022-05-14 §
10:47 <taavi> hard reboot unresponsive tools-sgeexec-0940 [tools]
2022-05-12 §
12:36 <taavi> re-enable CronJobControllerV2 T308205 [tools]
09:28 <taavi> deploy jobs-api update T308204 [tools]
09:15 <wm-bot2> build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (e6fa299) (T308204) - cookbook ran by taavi@runko [tools]
2022-05-10 §
15:18 <taavi> depool tools-k8s-worker-42 for experiments [tools]
13:54 <taavi> enable distro-wikimedia unattended upgrades T290494 [tools]
2022-05-06 §
19:46 <bd808> Rebuilt toolforge-perl532-sssd-base & toolforge-perl532-sssd-web to add liblocale-codes-perl (T307812) [tools]
2022-05-05 §
17:28 <taavi> deploy tools-webservice 0.83 T307693 [tools]
2022-05-03 §
08:20 <taavi> redis: start replication from the old cluster to the new one (T278541) [tools]
2022-05-02 §
08:54 <taavi> restart acme-chief.service T307333 [tools]
2022-04-25 §
14:56 <bd808> Rebuilding all docker images to pick up toolforge-webservice v0.82 (T214343) [tools]
14:46 <bd808> Building toolforge-webservice v0.82 [tools]
2022-04-23 §
16:51 <bd808> Built new perl532-sssd/{base,web} images and pushed to registry (T214343) [tools]
2022-04-20 §
16:58 <taavi> reboot toolserver-proxy-01 to free up disk space from stale file handles(?) [tools]
07:51 <wm-bot> build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (8f37a04) - cookbook ran by taavi@runko [tools]
2022-04-16 §
18:53 <wm-bot> deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/kubernetes-metrics (2c485e9) - cookbook ran by taavi@runko [tools]
2022-04-12 §
21:32 <bd808> Added komla to Gerrit group 'toollabs-trusted' (T305986) [tools]
21:27 <bd808> Added komla to 'roots' sudoers policy (T305986) [tools]
21:24 <bd808> Add komla as projectadmin (T305986) [tools]
2022-04-10 §
18:43 <taavi> deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since September, taking up 1.3G of disk space) [tools]
2022-04-09 §
15:30 <taavi> manually prune user.log on tools-prometheus-03 to free up some space on / [tools]
2022-04-08 §
10:44 <arturo> disabled debug mode on the k8s jobs-emailer component [tools]