2019-03-13 §
23:30 <bd808> Rebuilding stretch Kubernetes images [tools]
22:55 <bd808> Rebuilding jessie Kubernetes images [tools]
17:11 <bstorm_> specifically rebooted SGE cron server tools-sgecron-01 [tools]
17:10 <bstorm_> rebooted cron server [tools]
16:10 <bd808> Updated DNS for dev.tools.wmflabs.org to point to Stretch secondary bastion. This was missed on 2019-03-07 [tools]
12:33 <arturo> reboot tools-sgebastion-08 (T215154) [tools]
12:17 <arturo> reboot tools-sgebastion-07 (T215154) [tools]
11:53 <arturo> enable puppet in tools-sgebastion-07 (T215154) [tools]
11:20 <arturo> disable puppet in tools-sgebastion-07 for testing T215154 [tools]
05:07 <bstorm_> re-enabled puppet for tools-sgebastion-07 [tools]
04:59 <bstorm_> disabled puppet for a little bit on tools-bastion-07 [tools]
00:22 <bd808> Raise web-memlimit for isbn tool to 6G for tomcat8 (T217406) [tools]
2019-03-11 §
15:53 <bd808> Manually started `service gridengine-master` on tools-sgegrid-master after reboot (T218038) [tools]
15:47 <bd808> Hard reboot of tools-sgegrid-master via Horizon UI (T218038) [tools]
15:42 <bd808> Rebooting tools-sgegrid-master (T218038) [tools]
14:49 <gtirloni> deleted tools-webgrid-lighttpd-1419 [tools]
00:53 <bd808> Re-enabled 13 queue instances that had been disabled by LDAP failures during job initialization (T217280) [tools]
2019-03-10 §
22:36 <gtirloni> increased nscd group TTL from 60 to 300 sec [tools]
2019-03-08 §
19:48 <andrewbogott> repooling tools-exec-1430 and tools-sgeexec-0905 to compare ldap usage [tools]
19:21 <andrewbogott> depooling tools-exec-1430 and tools-sgeexec-0905 to compare ldap usage [tools]
17:49 <bd808> Re-enabled 4 queue instances that had been disabled by LDAP failures during job initialization (T217280) [tools]
00:30 <bd808> DNS record created for trusty-dev.tools.wmflabs.org (Trusty secondary bastion) [tools]
2019-03-07 §
23:31 <bd808> Updated DNS to point login.tools.wmflabs.org at 185.15.56.48 (Stretch bastion) [tools]
04:15 <bd808> Killed 3 orphan processes on Trusty grid [tools]
04:01 <bd808> Cleared error state on a large number of Stretch grid queues which had been disabled by LDAP and/or NFS hiccups (T217280) [tools]
00:49 <zhuyifei1999_> clushed misctools 1.37 upgrade on @bastion,@cron,@bastion-stretch T217406 [tools]
00:38 <zhuyifei1999_> published misctools 1.37 T217406 [tools]
00:34 <zhuyifei1999_> begin building misctools 1.37 using debuild T217406 [tools]
2019-03-06 §
13:57 <gtirloni> fixed SSH warnings in tools-clushmaster-02 [tools]
2019-03-04 §
19:07 <bstorm_> umounted /mnt/nfs/dumps-labstore1006.wikimedia.org for T217473 [tools]
14:05 <gtirloni> rebooted tools-docker-registry-{03,04}, tools-puppetmaster-02 and tools-puppetdb-01 (load avg >45, not accessible) [tools]
2019-03-03 §
20:54 <andrewbogott> cleaning out /tmp on tools-exec-1412 [tools]
2019-02-28 §
19:36 <zhuyifei1999_> built with debuild instead T217297 [tools]
19:08 <zhuyifei1999_> test failures during build, see ticket [tools]
18:55 <zhuyifei1999_> start building jobutils 1.36 T217297 [tools]
2019-02-27 §
20:41 <andrewbogott> restarting nginx on tools-checker-01 [tools]
19:34 <andrewbogott> uncordoning tools-worker-1028, 1002 and 1005, now in eqiad1-r [tools]
16:20 <zhuyifei1999_> regenerating k8s creds for tools.whichsub & tools.permission-denied-test T176027 [tools]
15:40 <andrewbogott> moving tools-worker-1002, 1005, 1028 to eqiad1-r [tools]
01:35 <bd808> Shutdown tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs via horizon (T217152) [tools]
01:29 <bd808> Depooled tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs (T217152) [tools]
01:26 <bd808> Disabled job queues and rescheduled continuous jobs away from tools-exec-14{33,34,35,36,37,38,39,40,41,42}.tools.eqiad.wmflabs (T217152) [tools]
2019-02-26 §
20:51 <gtirloni> reboot tools-package-builder-02 (unresponsive) [tools]
19:01 <gtirloni> pushed updated docker images [tools]
17:30 <andrewbogott> draining and cordoning tools-worker-1027 for a region migration test [tools]
2019-02-25 §
23:20 <bstorm_> Depooled tools-sgeexec-0914 and tools-sgeexec-0915 for T217066 [tools]
21:41 <andrewbogott> depooling tools-sgeexec-0911, tools-sgeexec-0912, tools-sgeexec-0913 to test T217066 [tools]
13:11 <chicocvenancio> PAWS: Stopped AABot notebook pod T217010 [tools]
12:54 <chicocvenancio> PAWS: Restarted Criscod notebook pod T217010 [tools]
12:21 <chicocvenancio> PAWS: killed proxy and hub pods to try to get it to see routes to open notebook servers, to no avail. Restarted BernhardHumm's notebook pod T217010 [tools]