2016-03-18 §
15:47 <chasemp> had to kill stalkboten as it was logging constant errors filling logs to the tune of hundreds of gigs [tools]
15:36 <chasemp> cleanup huge log collection for broken bot: /srv/project/tools/project/betacommand-dev/tspywiki/irc/logs# rm -fR SpamBotLog.log\.* [tools]
2016-03-11 §
20:57 <mutante> reverted font changes - puppet runs recovering [tools]
20:37 <mutante> more puppet issues due to font dependencies on trusty, on it [tools]
19:39 <mutante> should a tools-exec server be influenced by font packages on an mw appserver? [tools]
19:39 <mutante> fixed puppet runs on tools-exec (gerrit 276792) [tools]
2016-03-02 §
14:56 <chasemp> qdel 3956069 and 3758653 for abusing auth [tools]
2016-02-28 §
20:08 <bd808> Removed unwanted NFS mounts from tools-elastic-01.tools.eqiad.wmflabs [tools]
2016-02-26 §
19:08 <bd808> Upgraded Elasticsearch on tools-elastic-0[123] to 1.7.5 [tools]
2016-02-24 §
19:46 <chasemp> runonce deployed for https://gerrit.wikimedia.org/r/#/c/272891/ [tools]
2016-02-19 §
15:58 <chasemp> rerollout tools nfs shaping pilot for sanity in anticipation of formalization [tools]
09:21 <_joe_> killed cluebot3 instance on tools-exec-1207, writing 20 M/s to the error log [tools]
00:50 <yuvipanda> failover services to services-02 [tools]
2016-02-18 §
22:57 <valhallasw`cloud> restarted gridengine-master on tools-grid-master, otherwise all webservices will stay down [tools]
20:37 <yuvipanda> failover proxy back to tools-proxy-01 [tools]
19:46 <chasemp> repool labvirt1003 and depool labvirt1004 [tools]
18:19 <chasemp> draining nodes from labvirt1001 [tools]
2016-02-16 §
21:33 <chasemp> reboot of bastion-1002 [tools]
2016-02-12 §
19:56 <chasemp> nfs traffic shaping pilot round 2 [tools]
2016-02-05 §
22:01 <chasemp> throttle some vm nfs write speeds [tools]
2016-02-03 §
03:00 <YuviPanda> upgraded flannel on all hosts running it [tools]
2016-01-29 §
21:25 <YuviPanda> restarted image-resize-calc manually, no service.manifest file [tools]
2016-01-27 §
23:07 <YuviPanda> removed all members of templatetiger, added self instead, removed active shell sessions [tools]
20:24 <chasemp> master stop, truncate accounting log to accounting.01272016, master start [tools]
19:34 <chasemp> master start grid master [tools]
19:23 <chasemp> stopped master [tools]
19:11 <YuviPanda> depooled tools-webgrid-1405 to prep for restart, lots of stuck processes [tools]
18:29 <valhallasw`cloud> job 2551539 is ifttt, which is also running as 2700629. Killing 2551539. [tools]
18:26 <valhallasw`cloud> messages repeatedly reports "01/27/2016 18:26:17|worker|tools-grid-master|E|execd@tools-webgrid-generic-1405.tools.eqiad.wmflabs reports running job (2551539.1/master) in queue "webgrid-generic@tools-webgrid-generic-1405.tools.eqiad.wmflabs" that was not supposed to be there - killing". SSH'ing there to investigate [tools]
18:24 <valhallasw`cloud> 'sleep' test job also seems to work without issues [tools]
18:23 <valhallasw`cloud> no errors in log file, qstat works [tools]
18:23 <chasemp> master sge restarted post dump and restart for jobs db [tools]
18:22 <valhallasw`cloud> messages file reports 'Wed Jan 27 18:21:39 UTC 2016 db_load_sge_maint_pre_jobs_dump_01272016' [tools]
18:20 <chasemp> master db_load -f /root/sge_maint_pre_jobs_dump_01272016 sge_job [tools]
18:19 <valhallasw`cloud> dumped jobs database to /root/sge_maint_pre_jobs_dump_01272016, 4.6M [tools]
18:17 <valhallasw`cloud> SGE Configuration successfully saved to /root/sge_maint_01272016 directory. [tools]
18:14 <chasemp> grid master stopped [tools]
2016-01-26 §
21:28 <YuviPanda> qstat -u '*' | grep E | awk '{print $1}' | xargs -L1 qmod -cj [tools]
21:16 <chasemp> reboot tools-exec-1217.tools.eqiad.wmflabs [tools]
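Note on the 21:28 entry: the pipeline there selects jobs with `grep E`, which matches an `E` anywhere on the line (for example in a job name), not only in the state column. A minimal sketch of a stricter filter follows, assuming the default `qstat` column layout in which the state is the fifth whitespace-separated field. The function name `error_state_job_ids` and the sample rows are hypothetical and exist only for illustration; they are not taken from the log.

```python
from typing import List

# Made-up sample in the assumed default qstat layout (not real log data).
SAMPLE = """\
job-ID  prior   name       user         state submit/start at     queue        slots ja-task-ID
------------------------------------------------------------------------------------------------
1000001 0.30000 EditBot    tools.demo   r     01/26/2016 21:00:00 task@host-a      1
1000002 0.30000 webserver  tools.demo   Eqw   01/26/2016 21:05:00                  1
"""


def error_state_job_ids(qstat_output: str) -> List[str]:
    """Return IDs of jobs whose state column (assumed 5th field) contains 'E'."""
    job_ids = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric job ID; this skips the header,
        # the dashed separator, and blank lines.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        if "E" in fields[4]:
            job_ids.append(fields[0])
    return job_ids


if __name__ == "__main__":
    # Each returned ID could then be passed to `qmod -cj`, as in the log entry.
    print(error_state_job_ids(SAMPLE))
```

In the sample, the job named `EditBot` is running normally and would be caught by a plain `grep E`, while only the `Eqw` job is returned by the column-based check.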
2016-01-25 §
20:30 <YuviPanda> switched over cron host to tools-cron-01, manually copied all old cron files from tools-submit to tools-cron-01 [tools]
19:06 <chasemp> kill python merge/merge-unique.py tools-exec-1213 as it seemed to be overwhelming nfs [tools]
2016-01-21 §
22:24 <YuviPanda> deleted tools-redis-01 and -02 (are on 1001 and 1002 now) [tools]
21:13 <YuviPanda> repooled exec nodes on labvirt1010 [tools]
21:08 <YuviPanda> gridengine-master started, verified shadow hasn't started [tools]
21:00 <YuviPanda> stop gridengine master [tools]
20:51 <YuviPanda> repooled exec nodes on labvirt1007 was last message [tools]
20:51 <YuviPanda> repooled exec nodes on labvirt1006 [tools]
20:39 <YuviPanda> failover tools-static to tools-web-static-01 [tools]
20:38 <YuviPanda> failover tools-checker to tools-checker-01 [tools]
20:32 <YuviPanda> depooled exec nodes on 1007 [tools]