2019-03-07 §
23:31 <bd808> Updated DNS to point login.tools.wmflabs.org at (Stretch bastion) [tools]
04:15 <bd808> Killed 3 orphan processes on Trusty grid [tools]
04:01 <bd808> Cleared error state on a large number of Stretch grid queues which had been disabled by LDAP and/or NFS hiccups (T217280) [tools]
00:49 <zhuyifei1999_> clushed misctools 1.37 upgrade on @bastion,@cron,@bastion-stretch T217406 [tools]
00:38 <zhuyifei1999_> published misctools 1.37 T217406 [tools]
00:34 <zhuyifei1999_> begin building misctools 1.37 using debuild T217406 [tools]
2019-03-06 §
13:57 <gtirloni> fixed SSH warnings in tools-clushmaster-02 [tools]
2019-03-04 §
19:07 <bstorm_> unmounted /mnt/nfs/dumps-labstore1006.wikimedia.org for T217473 [tools]
14:05 <gtirloni> rebooted tools-docker-registry-{03,04}, tools-puppetmaster-02 and tools-puppetdb-01 (load avg >45, not accessible) [tools]
2019-03-03 §
20:54 <andrewbogott> cleaning out /tmp on tools-exec-1412 [tools]
2019-02-28 §
19:36 <zhuyifei1999_> built with debuild instead T217297 [tools]
19:08 <zhuyifei1999_> test failures during build, see ticket [tools]
18:55 <zhuyifei1999_> start building jobutils 1.36 T217297 [tools]
2019-02-27 §
20:41 <andrewbogott> restarting nginx on tools-checker-01 [tools]
19:34 <andrewbogott> uncordoning tools-worker-1028, 1002 and 1005, now in eqiad1-r [tools]
16:20 <zhuyifei1999_> regenerating k8s creds for tools.whichsub & tools.permission-denied-test T176027 [tools]
15:40 <andrewbogott> moving tools-worker-1002, 1005, 1028 to eqiad1-r [tools]
01:35 <bd808> Shut down tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs via horizon (T217152) [tools]
01:29 <bd808> Depooled tools-webgrid-lighttpd-1419.tools.eqiad.wmflabs (T217152) [tools]
01:26 <bd808> Disabled job queues and rescheduled continuous jobs away from tools-exec-14{33,34,35,36,37,38,39,40,41,42}.tools.eqiad.wmflabs (T217152) [tools]
2019-02-26 §
20:51 <gtirloni> reboot tools-package-builder-02 (unresponsive) [tools]
19:01 <gtirloni> pushed updated docker images [tools]
17:30 <andrewbogott> draining and cordoning tools-worker-1027 for a region migration test [tools]
2019-02-25 §
23:20 <bstorm_> Depooled tools-sgeexec-0914 and tools-sgeexec-0915 for T217066 [tools]
21:41 <andrewbogott> depooling tools-sgeexec-0911, tools-sgeexec-0912, tools-sgeexec-0913 to test T217066 [tools]
13:11 <chicocvenancio> PAWS: Stopped AABot notebook pod T217010 [tools]
12:54 <chicocvenancio> PAWS: Restarted Criscod notebook pod T217010 [tools]
12:21 <chicocvenancio> PAWS: killed proxy and hub pods in an attempt to get it to see routes to open notebook servers, to no avail. Restarted BernhardHumm's notebook pod T217010 [tools]
09:50 <gtirloni> rebooted tools-sgeexec-09{16,22,40} (T216988) [tools]
09:41 <gtirloni> rebooted tools-sgeexec-09{16,22,40} [tools]
08:37 <zhuyifei1999_> uncordon tools-worker-1015.tools.eqiad.wmflabs [tools]
08:34 <legoktm> hard rebooted tools-worker-1015 via horizon [tools]
07:48 <zhuyifei1999_> systemd stuck in D state. :( [tools]
07:44 <zhuyifei1999_> I saved dmesg and process list to a few files in /root if that helps debugging [tools]
07:43 <zhuyifei1999_> D states are not responding to SIGKILL. Will reboot. [tools]
07:37 <zhuyifei1999_> tools-worker-1015.tools.eqiad.wmflabs having severe NFS issues (all NFS accessing processes are stuck in D state). Draining. [tools]
2019-02-22 §
16:29 <gtirloni> upgraded and rebooted tools-puppetmaster-01 (new kernel) [tools]
15:59 <gtirloni> started tools-puppetmaster-01 (new size: m1.large) [tools]
15:13 <gtirloni> shut down tools-puppetmaster-01 [tools]
2019-02-21 §
09:59 <gtirloni> upgraded all packages on all Stretch nodes [tools]
00:12 <zhuyifei1999_> forcing puppet run on tools-k8s-master-01 [tools]
00:08 <zhuyifei1999_> running /usr/local/bin/git-sync-upstream on tools-puppetmaster-01 to speed puppet changes up [tools]
2019-02-20 §
23:30 <zhuyifei1999_> begin rebuilding all docker images T178601 T193646 T215683 [tools]
23:25 <zhuyifei1999_> upgraded toollabs-webservice on tools-bastion-02 to 0.44 (newly-built version) [tools]
23:19 <zhuyifei1999_> this was built for stretch. hopefully it works for all distros [tools]
23:17 <zhuyifei1999_> begin building new tools-webservice package T178601 T193646 T215683 [tools]
21:57 <andrewbogott> moving tools-static-13 to a new virt host [tools]
21:34 <andrewbogott> moving the tools-static IP from tools-static-13 to tools-static-12 [tools]
19:17 <andrewbogott> moving tools-bastion-02 to labvirt1004 [tools]
16:56 <andrewbogott> moving tools-paws-worker-1003 [tools]