2020-12-05 §
00:42 <bd808> `kubectl delete po renderer-794886b9cd-9nc6c -n prod` after seeing lots of listen queue full errors in the pod logs. [paws]
2020-11-30 §
18:22 <bstorm> 1.17 upgrade for kubernetes complete T268669 [paws]
17:25 <bstorm> upgrading the worker nodes (this will likely kill services briefly when some pods are rescheduled) T268669 [paws]
17:14 <bstorm> updated the calico-kube-controllers deployment to use our internal registry to deal with docker-hub rate-limiting T268669 T269016 [paws]
17:08 <chicocvenancio> delete orphaned jupyter server pod `kubectl -n prod delete pod jupyter--45volutionoftheuniverse`. Respective server not running in jupyter admin UI. [paws]
16:31 <bstorm> upgrading pods on paws-k8s-control-3 T268669 [paws]
16:17 <bstorm> starting upgrade on paws-k8s-control-2 T268669 (first kubectl drain paws-k8s-control-2 --ignore-daemonsets) [paws]
15:53 <bstorm> proceeding with upgrade to 1.17 on paws-k8s-control-1 T268669 [paws]
15:49 <bstorm> draining paws-k8s-control-1 for upgrade T268669 [paws]
12:49 <arturo> disable puppet in all k8s nodes to prepare for the upgrade (T268669) [paws]
12:49 <arturo> set hiera `profile::wmcs::kubeadm::component: 'thirdparty/kubeadm-k8s-1-17'` at project level (T268669) [paws]
2020-11-16 §
22:13 <bstorm> deploying new paws changes for multiinstance readiness [paws]
2020-11-10 §
20:16 <chicocvenancio> restart hub to apply move to sqlite. T267667 [paws]
16:41 <arturo> set paws in sqlite mode because T266587 (kubectl --namespace prod edit configmap hub-config) [paws]
2020-10-15 §
19:12 <andrewbogott> uncordoned paws-k8s-worker-1 and -2 [paws]
18:48 <andrewbogott> draining paws-k8s-worker-2 for move to ceph [paws]
18:36 <andrewbogott> draining paws-k8s-worker-1 for move to ceph [paws]
2020-09-29 §
10:59 <arturo> last 2 commands should help puppet agent in the paws project; previously it had issues fetching acme-chief certs because of an API update [paws]
10:58 <arturo> aborrero@paws-acme-chief-01:~$ sudo systemctl restart uwsgi-acme-chief.service [paws]
10:56 <arturo> aborrero@paws-acme-chief-01:~$ sudo systemctl restart acme-chief.service [paws]
2020-08-14 §
17:09 <bstorm> backing up the old proxy config to NFS and deleting paws-proxy-02 T211096 [paws]
2020-08-07 §
22:30 <bstorm> removing downtime for paws and front page monitor T211096 [paws]
18:01 <bstorm> shutting down paws-proxy-02 T211096 [paws]
17:05 <bstorm> running the final rsync to the new cluster's nfs T211096 [paws]
16:08 <bstorm> changing paws.wmflabs.org to point at the new cluster ip 185.15.56.57 T211096 [paws]
16:02 <bstorm> LAST MESSAGE WRONG: switching NEW cluster to toolsdb T211096 [paws]
16:02 <bstorm> switching old cluster to toolsdb T211096 [paws]
15:58 <bstorm> switching old cluster to sqlite T211096 [paws]
15:53 <bstorm> downtiming alerts in case they need changes (seems likely) T211096 [paws]
2020-07-30 §
20:40 <bstorm> upgrading the singleuser image to test shuffling around some of the pip installs [paws]
16:38 <bstorm> removing the *.paws.wmflabs.org SNI name because it won't be used and it might trigger a re-issue of certs T255249 [paws]
15:39 <bstorm> upgrading acme-chief to 0.27-1 [paws]
2020-07-29 §
18:03 <bstorm> powering on paws-k8s-haproxy-1 because that worked fine [paws]
18:00 <bstorm> powering off paws-k8s-haproxy-1 to test failover [paws]
2020-07-24 §
17:25 <bstorm> to force repulling of every image everywhere, uninstalling paws in the new cluster and reinstalling it T258812 [paws]
09:39 <arturo> dropped the DNS wildcard record `*.paws.wmcloud.org IN A 185.15.56.57` and created concrete CNAME records for the FQDNs we actually use (T211096) [paws]
2020-07-23 §
22:51 <bstorm> deploying via the default 'latest' tag in the new cluster T211096 [paws]
22:48 <bstorm> tagged the newbuild tags with "latest" to set sane defaults for all images in the helm chart T211096 [paws]
21:14 <bstorm> pushing quay.io/wikimedia-paws-prod/nbserve:newbuild to main repo T211096 [paws]
21:11 <bstorm> pushing quay.io/wikimedia-paws-prod/deploy-hook:newbuild to main repo T211096 [paws]
21:09 <bstorm> pushing quay.io/wikimedia-paws-prod/singleuser:newbuild to the main repo T211096 [paws]
21:08 <bstorm> pushing quay.io/wikimedia-paws-prod/paws-hub:newbuild to the main repo T211096 [paws]
21:06 <bstorm> pushing dbproxy docker image for new cluster into main quay.io repo T211096 [paws]
2020-07-22 §
23:32 <bstorm> setting the default NFS version to 4.2 while excepting the two stretch servers T257945 [paws]
2020-07-21 §
15:13 <chicocvenancio> merge pr #50 to fix T258142 [paws]
2020-07-06 §
21:41 <bstorm> deployed ingress to redirect paws.wmcloud.org to the wikitech doc page T195217 [paws]
2020-06-30 §
23:00 <bstorm> added paws-public.wmflabs.org to the alt-names for acme-chief, which broke it until we hand off the zone to the paws project <sorry!> T195217 T255997 [paws]
2020-06-26 §
21:57 <bstorm> applied the metrics manifests to kubernetes to enable metrics-server, cadvisor, etc. T256361 [paws]
2020-06-25 §
22:52 <bstorm> created paws-k8s-worker-5/6/7 as x-large nodes to bring the cluster up to roughly the same capacity as the existing one using soft anti-affinity T211096 T253267 [paws]
22:43 <bstorm> bumped quota up to 24 instances, 128 GB RAM and 56 cores T211096 [paws]