2016-02-11
|
22:53 |
<thcipriani> |
shutting down deployment-bastion |
[releng] |
21:28 |
<hashar> |
pooling back slaves 1001 to 1006 |
[releng] |
21:18 |
<hashar> |
re-enabling hhvm service on slaves ( https://phabricator.wikimedia.org/T126594 ). Some symlink is missing and only provided by the upstart script grrrrrrr https://phabricator.wikimedia.org/T126658 |
[releng] |
20:52 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/270098 |
[releng] |
20:35 |
<hashar> |
depooling the six recent slaves: /usr/lib/x86_64-linux-gnu/hhvm/extensions/current/luasandbox.so cannot open shared object file |
[releng] |
20:29 |
<hashar> |
pooling integration-slave-trusty-1004 integration-slave-trusty-1005 integration-slave-trusty-1006 |
[releng] |
20:14 |
<hashar> |
pooling integration-slave-trusty-1001 integration-slave-trusty-1002 integration-slave-trusty-1003 |
[releng] |
19:35 |
<marxarelli> |
modifying deployment server node in jenkins to point to deployment-tin |
[releng] |
19:27 |
<thcipriani> |
running sudo salt -b '10%' '*' cmd.run 'puppet agent -t' from deployment-salt |
[releng] |
19:27 |
<twentyafterfour> |
Keeping notes on the ticket: https://phabricator.wikimedia.org/T126537 |
[releng] |
19:24 |
<thcipriani> |
moving deployment-bastion to deployment-tin |
[releng] |
17:59 |
<hashar> |
recreated instances with proper names: integration-slave-trusty-{1001-1006} |
[releng] |
17:52 |
<hashar> |
Created integration-slave-trusty-{1019-1026} as m1.large (note 1023 is an exception: it is for Android). Applied role::ci::slave , let's wait for puppet to finish |
[releng] |
17:42 |
<Krinkle> |
Currently testing https://gerrit.wikimedia.org/r/#/c/268802/ in Beta Labs |
[releng] |
17:27 |
<hashar> |
Depooling all the ci.medium slaves and deleting them. |
[releng] |
17:27 |
<hashar> |
I tried. The ci.medium instances are too small and MediaWiki tests really need 1.5GBytes of memory :-( |
[releng] |
16:00 |
<hashar> |
rebuilding integration-dev https://phabricator.wikimedia.org/T126613 |
[releng] |
15:27 |
<Krinkle> |
Deploy Zuul config change https://gerrit.wikimedia.org/r/269976 |
[releng] |
11:46 |
<hashar> |
salt -v '*' cmd.run '/etc/init.d/apache2 restart' might help for Wikidata browser tests failing |
[releng] |
11:31 |
<hashar> |
disabling hhvm service on CI slaves ( https://phabricator.wikimedia.org/T126594 , cherry picked both patches ) |
[releng] |
10:50 |
<hashar> |
reenabled puppet on CI. All transitioned to a 128MB tmpfs (was 512MB) |
[releng] |
10:16 |
<hashar> |
pooling back integration-slave-trusty-1009 and integration-slave-trusty-1010 (tmpfs shrunken) |
[releng] |
10:06 |
<hashar> |
disabling puppet on all CI slaves. Trying to lower tmpfs 512MB to 128MB ( https://gerrit.wikimedia.org/r/#/c/269880/ ) |
[releng] |
02:45 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/269853 https://gerrit.wikimedia.org/r/269893 |
[releng] |
2016-02-10
|
23:54 |
<hashar_> |
depooling Trusty slaves that only have 2GB of RAM, which is not enough. https://phabricator.wikimedia.org/T126545 |
[releng] |
22:55 |
<hashar_> |
gallium: find /var/lib/jenkins/config-history/config -type f -wholename '*/2015*' -delete ( https://phabricator.wikimedia.org/T126552 ) |
[releng] |
22:34 |
<Krinkle> |
Zuul is back up and processing Gerrit events, but jobs are still queued indefinitely. Jenkins is not accepting new jobs |
[releng] |
22:31 |
<Krinkle> |
Full restart of Zuul. Seems Gearman/Zuul got stuck. All executors were idling. No new Gerrit events processed either. |
[releng] |
21:22 |
<legoktm> |
cherry-picking https://gerrit.wikimedia.org/r/#/c/269370/ on integration-puppetmaster again |
[releng] |
21:16 |
<hashar> |
CI dust has settled. Krinkle and I have pooled a lot more Trusty slaves to accommodate the overload caused by switching to php55 (jobs run on Trusty) |
[releng] |
21:08 |
<hashar> |
pooling trusty slaves 1009, 1010, 1021, 1022 with 2 executors (they are ci.medium) |
[releng] |
20:38 |
<hashar> |
cancelling mediawiki-core-jsduck-publish and mediawiki-core-doxygen-publish jobs manually. They will catch up on next merge |
[releng] |
20:34 |
<Krinkle> |
Pooled integration-slave-trusty-1019 (new) |
[releng] |
20:28 |
<Krinkle> |
Pooled integration-slave-trusty-1020 (new) |
[releng] |
20:24 |
<Krinkle> |
created integration-slave-trusty-1019 and integration-slave-trusty-1020 (ci1.medium) |
[releng] |
20:18 |
<hashar> |
created integration-slave-trusty-1009 and 1010 (trusty ci.medium) |
[releng] |
20:06 |
<hashar> |
creating integration-slave-trusty-1021 and integration-slave-trusty-1022 (ci.medium) |
[releng] |
19:48 |
<greg-g> |
that cleanup was done by apergos |
[releng] |
19:48 |
<greg-g> |
did cleanup across all integration slaves, some were very close to out of room. results: https://phabricator.wikimedia.org/P2587 |
[releng] |
19:43 |
<hashar> |
Dropping Precise m1.large slaves integration-slave-precise-1014 and integration-slave-precise-1013 ; most load shifted to Trusty (php53 -> php55 transition) |
[releng] |
18:20 |
<Krinkle> |
Creating a Trusty slave to support increased demand following MediaWiki php53(precise)>php55(trusty) bump |
[releng] |
16:06 |
<jzerebecki> |
reloading zuul for 41a92d5..5b971d1 |
[releng] |
15:42 |
<jzerebecki> |
reloading zuul for 639dd40..41a92d5 |
[releng] |
14:12 |
<jzerebecki> |
recover a bit of disk space: integration-saltmaster:~# salt --show-timeout '*slave*' cmd.run 'rm -rf /mnt/jenkins-workspace/workspace/*WikibaseQuality*' |
[releng] |
13:46 |
<jzerebecki> |
reloading zuul for 639dd40 |
[releng] |
13:15 |
<jzerebecki> |
reloading zuul for 3be81c1..e8e0615 |
[releng] |
08:07 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/269619 |
[releng] |
08:03 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/269613 and https://gerrit.wikimedia.org/r/269618 |
[releng] |