releng SAL

2016-02-12 §
23:54	<hashar>	beta cluster broken since 20:30 UTC, see https://logstash-beta.wmflabs.org/#/dashboard/elasticsearch/fatalmonitor ; haven't looked into it	[releng]
17:36	<hashar>	salt -v 'slave-trusty' cmd.run 'apt-get -y install texlive-generic-extra' # T126422	[releng]
17:32	<hashar>	adding texlive-generic-extra on CI slaves by cherry picking https://gerrit.wikimedia.org/r/#/c/270322/ - T126422	[releng]
17:19	<hashar>	getting rid of integration-dev, it is broken somehow	[releng]
17:10	<hashar>	Nodepool is back to spawning instances. contintcloud has been migrated in wmflabs	[releng]
16:51	<thcipriani>	running sudo salt '*' -b '10%' deploy.fixurl to fix deployment-prep trebuchet urls	[releng]
16:31	<hashar>	bd808 added support for saltbot to update tasks automagically!!!! T108720	[releng]
16:15	<hashar>	the pool of CI slaves is exhausted, no more jobs running (scheduled labs maintenance)	[releng]
03:10	<yurik>	attempted to sync graphoid from gerrit 270166 from deployment-tin, but it wouldn't sync. Tried to git pull on sca02, but submodules wouldn't pull	[releng]
2016-02-11 §
22:53	<thcipriani>	shutting down deployment-bastion	[releng]
21:28	<hashar>	pooling back slaves 1001 to 1006	[releng]
21:18	<hashar>	re-enabling hhvm service on slaves ( https://phabricator.wikimedia.org/T126594 ). Some symlink is missing and only provided by the upstart script grrrrrrr https://phabricator.wikimedia.org/T126658	[releng]
20:52	<legoktm>	deploying https://gerrit.wikimedia.org/r/270098	[releng]
20:35	<hashar>	depooling the six recent slaves: /usr/lib/x86_64-linux-gnu/hhvm/extensions/current/luasandbox.so: cannot open shared object file	[releng]
20:29	<hashar>	pooling integration-slave-trusty-1004 integration-slave-trusty-1005 integration-slave-trusty-1006	[releng]
20:14	<hashar>	pooling integration-slave-trusty-1001 integration-slave-trusty-1002 integration-slave-trusty-1003	[releng]
19:35	<marxarelli>	modifying deployment server node in jenkins to point to deployment-tin	[releng]
19:27	<thcipriani>	running sudo salt -b '10%' '*' cmd.run 'puppet agent -t' from deployment-salt	[releng]
19:27	<twentyafterfour>	Keeping notes on the ticket: https://phabricator.wikimedia.org/T126537	[releng]
19:24	<thcipriani>	moving deployment-bastion to deployment-tin	[releng]
17:59	<hashar>	recreated instances with proper names: integration-slave-trusty-{1001-1006}	[releng]
17:52	<hashar>	Created integration-slave-trusty-{1019-1026} as m1.large (note 1023 is an exception, it is for Android). Applied role::ci::slave, let's wait for puppet to finish	[releng]
17:42	<Krinkle>	Currently testing https://gerrit.wikimedia.org/r/#/c/268802/ in Beta Labs	[releng]
17:27	<hashar>	Depooling all the ci.medium slaves and deleting them.	[releng]
17:27	<hashar>	I tried. The ci.medium instances are too small and MediaWiki tests really need 1.5 GB of memory :-(	[releng]
16:00	<hashar>	rebuilding integration-dev https://phabricator.wikimedia.org/T126613	[releng]
15:27	<Krinkle>	Deploy Zuul config change https://gerrit.wikimedia.org/r/269976	[releng]
11:46	<hashar>	salt -v '*' cmd.run '/etc/init.d/apache2 restart' might help with the Wikidata browser tests failing	[releng]
11:31	<hashar>	disabling hhvm service on CI slaves ( https://phabricator.wikimedia.org/T126594 , cherry picked both patches )	[releng]
10:50	<hashar>	reenabled puppet on CI. All transitioned to a 128MB tmpfs (was 512MB)	[releng]
10:16	<hashar>	pooling back integration-slave-trusty-1009 and integration-slave-trusty-1010 (tmpfs shrunk)	[releng]
10:06	<hashar>	disabling puppet on all CI slaves. Trying to lower tmpfs from 512MB to 128MB ( https://gerrit.wikimedia.org/r/#/c/269880/ )	[releng]
02:45	<legoktm>	deploying https://gerrit.wikimedia.org/r/269853 https://gerrit.wikimedia.org/r/269893	[releng]
2016-02-10 §
23:54	<hashar_>	depooling Trusty slaves that only have 2GB of RAM, which is not enough. https://phabricator.wikimedia.org/T126545	[releng]
22:55	<hashar_>	gallium: find /var/lib/jenkins/config-history/config -type f -wholename '/2015' -delete ( https://phabricator.wikimedia.org/T126552 )	[releng]
22:34	<Krinkle>	Zuul is back up and processing Gerrit events, but jobs are still queued indefinitely. Jenkins is not accepting new jobs	[releng]
22:31	<Krinkle>	Full restart of Zuul. Seems Gearman/Zuul got stuck. All executors were idling. No new Gerrit events processed either.	[releng]
21:22	<legoktm>	cherry-picking https://gerrit.wikimedia.org/r/#/c/269370/ on integration-puppetmaster again	[releng]
21:16	<hashar>	CI dust has settled. Krinkle and I have pooled a lot more Trusty slaves to accommodate the overload caused by switching to php55 (jobs run on Trusty)	[releng]
21:08	<hashar>	pooling trusty slaves 1009, 1010, 1021, 1022 with 2 executors (they are ci.medium)	[releng]
20:38	<hashar>	cancelling mediawiki-core-jsduck-publish and mediawiki-core-doxygen-publish jobs manually. They will catch up on next merge	[releng]
20:34	<Krinkle>	Pooled integration-slave-trusty-1019 (new)	[releng]
20:28	<Krinkle>	Pooled integration-slave-trusty-1020 (new)	[releng]
20:24	<Krinkle>	created integration-slave-trusty-1019 and integration-slave-trusty-1020 (ci1.medium)	[releng]
20:18	<hashar>	created integration-slave-trusty-1009 and 1010 (trusty ci.medium)	[releng]
20:06	<hashar>	creating integration-slave-trusty-1021 and integration-slave-trusty-1022 (ci.medium)	[releng]
19:48	<greg-g>	that cleanup was done by apergos	[releng]
19:48	<greg-g>	did cleanup across all integration slaves, some were very close to running out of room. results: https://phabricator.wikimedia.org/P2587	[releng]
19:43	<hashar>	Dropping Precise m1.large slaves integration-slave-precise-1014 and integration-slave-precise-1013, most load shifted to Trusty (php53 -> php55 transition)	[releng]
18:20	<Krinkle>	Creating a Trusty slave to support increased demand following MediaWiki php53(precise)>php55(trusty) bump	[releng]