2018-02-15 §
12:51 <arturo> aborrero@tools-webgrid-generic-1401:~$ sudo apt-upgrade -u upgrade trusty-wikimedia [tools]
2018-02-14 §
13:09 <arturo> the reboot was OK, the server seems to be working and kubectl sees all the pods running in the deployment (T187315) [tools]
13:04 <arturo> reboot tools-paws-master-01 for T187315 [tools]
2018-02-11 §
01:28 <zhuyifei1999_> `# find /home/ -maxdepth 1 -perm -o+w \! -uid 0 -exec chmod -v o-w {} \;` Affected: only /home/tr8dr, mode 0777 -> 0775 [tools]
01:21 <zhuyifei1999_> `# find /data/project/ -maxdepth 1 -perm -o+w \! -uid 0 -exec chmod -v o-w {} \;` Affected tools: wikisource-tweets, gsociftttdev, dow, ifttt-testing, elobot. All mode 2777 -> 2775 [tools]
2018-02-09 §
10:35 <arturo> deploy https://gerrit.wikimedia.org/r/#/c/409226/ T179343 T182562 T186846 [tools]
06:15 <bd808> Killed orphan processes owned by iabot, dupdet, and wsexport scattered across the webgrid nodes [tools]
05:07 <bd808> Killed 4 orphan php-fcgi processes from jembot that were running on tools-webgrid-lighttpd-1426 [tools]
05:06 <bd808> Killed 4 orphan php-fcgi processes from jembot that were running on tools-webgrid-lighttpd-1411 [tools]
05:05 <bd808> Killed 1 orphan php-fcgi process from jembot that was running on tools-webgrid-lighttpd-1409 [tools]
05:02 <bd808> Killed 4 orphan php-fcgi processes from jembot that were running on tools-webgrid-lighttpd-1421 and pegging the cpu there [tools]
04:56 <bd808> Rescheduled 30 of the 60 tools running on tools-webgrid-lighttpd-1421 (T186830) [tools]
04:39 <bd808> Killed 4 orphan php-fcgi processes from jembot that were running on tools-webgrid-lighttpd-1417 and pegging the cpu there [tools]
2018-02-08 §
18:38 <arturo> aborrero@tools-k8s-master-01:~$ sudo kubectl uncordon tools-worker-1002.tools.eqiad.wmflabs [tools]
18:35 <arturo> aborrero@tools-worker-1002:~$ sudo apt-upgrade -u upgrade jessie-wikimedia -v [tools]
18:33 <arturo> aborrero@tools-worker-1002:~$ sudo apt-upgrade -u upgrade oldstable -v [tools]
18:28 <arturo> cordon & drain tools-worker-1002.tools.eqiad.wmflabs [tools]
18:10 <arturo> uncordon tools-paws-worker-1019. Package upgrades were OK. [tools]
18:08 <arturo> aborrero@tools-paws-worker-1019:~$ sudo apt-upgrade upgrade stable -v [tools]
18:06 <arturo> aborrero@tools-paws-worker-1019:~$ sudo apt-upgrade upgrade stretch-wikimedia -v [tools]
18:02 <arturo> cordon tools-paws-worker-1019 to do some package upgrades [tools]
17:29 <arturo> repool tools-exec-1401.tools.eqiad.wmflabs. Package upgrades were OK. [tools]
17:20 <arturo> aborrero@tools-exec-1401:~$ sudo apt-upgrade upgrade trusty-updates -vy [tools]
17:15 <arturo> aborrero@tools-exec-1401:~$ sudo apt-upgrade upgrade trusty-wikimedia -vy [tools]
17:11 <arturo> depool tools-exec-1401.tools.eqiad.wmflabs to do some package upgrades [tools]
14:22 <arturo> it was some kind of transient error. After a second puppet run across the fleet, all seems fine [tools]
13:53 <arturo> deploy https://gerrit.wikimedia.org/r/#/c/407465/ which is causing some puppet issues. Investigating. [tools]
2018-02-06 §
13:15 <arturo> deploy https://gerrit.wikimedia.org/r/#/c/408529/ to tools-services-01 [tools]
13:05 <arturo> unpublish/publish trusty-tools repo [tools]
13:03 <arturo> install aptly v0.9.6-1 in tools-services-01 for T186539 after adding it to trusty-tools repo (self contained) [tools]
2018-02-05 §
17:58 <arturo> publishing/unpublishing trusty-tools repo in tools-services-01 to address T186539 [tools]
13:27 <arturo> for the record, not a single warning or error (orange/red messages) in puppet in the toolforge cluster [tools]
13:06 <arturo> deploying fix for T186230 using clush [tools]
2018-02-03 §
01:04 <chicocvenancio> killed I/O-intensive process in bastion-03 "vltools python3 ./broken_ref_anchors.py" [tools]
2018-01-31 §
22:54 <chasemp> add bstorm to sudoers as root [tools]
2018-01-29 §
20:02 <chasemp> add zhuyifei1999_ tools root for T185577 [tools]
20:01 <chasemp> blast a puppet run to see if any errors are persistent [tools]
2018-01-28 §
22:49 <chicocvenancio> killed compromised session generating miner processes [tools]
22:48 <chicocvenancio> killed miner processes in tools-bastion-03 [tools]
2018-01-27 §
00:55 <arturo> at tools-static-11 the kernel OOM killer stopped git gc at about 20% :-( [tools]
00:25 <arturo> (/srv is almost full) aborrero@tools-static-11:/srv/cdnjs$ sudo git gc --aggressive [tools]
2018-01-25 §
23:47 <arturo> fix the last deprecation warnings in tools-elastic-03, tools-elastic-02, tools-proxy-01 and tools-proxy-02 by manually replacing configtimeout with http_configtimeout in /etc/puppet/puppet.conf [tools]
23:20 <arturo> T179386 aborrero@tools-clushmaster-01:~$ clush -w @all 'sudo puppet agent -t -v' [tools]
05:25 <arturo> deploying misctools and jobutils 1.29 for T179386 [tools]
2018-01-23 §
19:41 <madhuvishy> Add bstorm to project admins [tools]
15:48 <bd808> Admin clean up; removed Coren, Ryan Lane, and Springle. [tools]
14:17 <chasemp> add me, arturo, chico to sudoers and removed marc [tools]
2018-01-22 §
18:32 <arturo> T181948 T185314 deploying jobutils and misctools v1.28 in the cluster [tools]
11:21 <arturo> puppet in the cluster is mostly fine, except for a couple of deprecation warnings, a connection timeout to services-01 and https://phabricator.wikimedia.org/T181948#3916790 [tools]
10:31 <arturo> aborrero@tools-clushmaster-01:~$ clush -w @all 'sudo puppet agent -t -v' <--- check again how the cluster is doing with puppet [tools]