2018-10-07 §
21:48 <zhuyifei1999_> maintain-kubeusers on tools-k8s-master-01 seems to be in an infinite loop of 10 seconds. installed python3-dbg [tools]
21:44 <zhuyifei1999_> journal on tools-k8s-master-01 is full of etcd failures, did a puppet run, nothing interesting happened [tools]
2018-09-21 §
12:35 <arturo> cleanup stale apt preference files (pinning) in tools-clushmaster-01 [tools]
12:14 <arturo> T205078 same for {jessie,stretch}-wikimedia [tools]
12:12 <arturo> T205078 upgrade trusty-wikimedia packages (git-fat, debmonitor) [tools]
11:57 <arturo> T205078 purge packages smbclient libsmbclient libwbclient0 python-samba samba-common samba-libs from trusty machines [tools]
2018-09-17 §
09:13 <arturo> T204481 aborrero@tools-mail:~$ sudo exiqgrep -i | xargs sudo exim -Mrm [tools]
2018-09-14 §
11:22 <arturo> T204267 stop the corhist tool (k8s) because it is hammering the wikidata API [tools]
10:51 <arturo> T204267 stop the openrefine-wikidata tool (k8s) because it is hammering the wikidata API [tools]
2018-09-08 §
10:35 <gtirloni> restarted cron and truncated /var/log/exim4/paniclog (T196137) [tools]
2018-09-07 §
05:07 <legoktm> uploaded/imported toollabs-webservice_0.42_all.deb [tools]
2018-08-27 §
23:39 <bd808> `# exec-manage repool tools-webgrid-generic-1402.eqiad.wmflabs` T202932 [tools]
23:28 <bd808> Restarted down instance tools-webgrid-generic-1402 & ran apt-upgrade [tools]
22:36 <zhuyifei1999_> `# exec-manage depool tools-webgrid-generic-1402.eqiad.wmflabs` T202932 [tools]
2018-08-22 §
13:02 <arturo> I used this command: `sudo exim -bp | sudo exiqgrep -i | xargs sudo exim -Mrm` [tools]
13:00 <arturo> remove all emails in tools-mail.eqiad.wmflabs queue, 3378 bounce msgs, mostly related to @qq.com [tools]
2018-08-19 §
09:12 <legoktm> rebuilding python/base k8s images for https://gerrit.wikimedia.org/r/453665 (T202218) [tools]
2018-08-14 §
21:02 <legoktm> rebuilt php7.2 docker images for https://gerrit.wikimedia.org/r/452755 [tools]
01:08 <legoktm> switched tools.coverme and tools.wikiinfo to use PHP 7.2 [tools]
2018-08-13 §
23:31 <legoktm> rebuilding docker images for webservice upgrade [tools]
23:16 <legoktm> published toollabs-webservice_0.41_all.deb [tools]
23:06 <legoktm> fixed permissions of tools-package-builder-01:/srv/src/tools-webservice [tools]
2018-08-09 §
10:40 <arturo> T201602 upgrade packages from jessie-backports (excluding python-designateclient) [tools]
10:30 <arturo> T201602 upgrade packages from jessie-wikimedia [tools]
10:27 <arturo> T201602 upgrade packages from trusty-updates [tools]
2018-08-08 §
10:00 <zhuyifei1999_> building & publishing toollabs-webservice 0.40 deb, and all Docker images T156626 T148872 T158244 [tools]
2018-08-06 §
12:33 <arturo> T197176 installing texlive-full in toolforge [tools]
2018-08-01 §
14:31 <andrewbogott> temporarily depooling tools-exec-1409, 1410, 1414, 1419, 1427, 1428 to try to give labvirt1009 a break [tools]
2018-07-30 §
20:33 <bd808> Started rebuilding all Kubernetes Docker images to pick up latest apt updates [tools]
04:47 <legoktm> added toollabs-webservice_0.39_all.deb to stretch-tools [tools]
2018-07-27 §
04:52 <zhuyifei1999_> rebuilding python/base docker container T190274 [tools]
2018-07-25 §
19:02 <chasemp> tools-worker-1004 reboot [tools]
19:01 <chasemp> ifconfig eth0:fakenfs netmask up on tools-worker-1004 (late log) [tools]
2018-07-18 §
13:24 <arturo> upgrading packages from `stretch-wikimedia` T199905 [tools]
13:18 <arturo> upgrading packages from `stable` T199905 [tools]
12:51 <arturo> upgrading packages from `oldstable` T199905 [tools]
12:31 <arturo> upgrading packages from `trusty-updates` T199905 [tools]
12:16 <arturo> upgrading packages from `jessie-wikimedia` T199905 [tools]
12:08 <arturo> upgrading packages from `trusty-wikimedia` T199905 [tools]
2018-06-30 §
18:15 <chicocvenancio> pushed new config to PAWS to fix dumps nfs mountpoint [tools]
16:40 <zhuyifei1999_> rebooted because tools-paws-master-01 had ~1000 loadavg due to NFS issues and processes stuck in D state [tools]
16:39 <zhuyifei1999_> reboot tools-paws-master-01 [tools]
16:35 <zhuyifei1999_> `root@tools-paws-master-01:~# sed -i 's/^labstore1006.wikimedia.org/#labstore1006.wikimedia.org/' /etc/fstab` [tools]
16:34 <andrewbogott> "sed -i '/labstore1006/d' /etc/fstab" everywhere [tools]
2018-06-29 §
17:41 <bd808> Rescheduling continuous jobs away from tools-exec-1408 where load is high [tools]
17:11 <bd808> Rescheduled jobs away from tools-exec-1404 where linkwatcher is currently stealing most of the CPU (T123121) [tools]
16:46 <bd808> Killed orphan tool-owned processes running on the job grid. Mostly jembot and wsexport php-cgi processes stuck in deadlock following an OOM. T182070 [tools]
2018-06-28 §
19:50 <chasemp> tools-clushmaster-01:~$ clush -w @all 'sudo umount -fl /mnt/nfs/dumps-labstore1006.wikimedia.org' [tools]
18:02 <chasemp> tools-clushmaster-01:~$ clush -w @all "sudo umount -fl /mnt/nfs/dumps-labstore1007.wikimedia.org" [tools]
17:53 <chasemp> tools-clushmaster-01:~$ clush -w @all "sudo puppet agent --disable 'labstore1007 outage'" [tools]