2016-02-09
§
|
12:46 |
<hashar> |
Mass testing php loop of death: salt -v '*slave*' cmd.run 'timeout 2s /srv/deployment/integration/slave-scripts/bin/php --version' |
[releng] |
12:40 |
<hashar> |
mass rebooting CI slaves from wikitech |
[releng] |
12:39 |
<hashar> |
salt -v '*' cmd.run "bash -c 'cd /srv/deployment/integration/slave-scripts; git pull'" |
[releng] |
12:33 |
<hashar> |
all slaves dying due to PHP looping |
[releng] |
12:02 |
<legoktm> |
re-enabling puppet on all trusty/precise slaves |
[releng] |
11:20 |
<legoktm> |
cherry-picked https://gerrit.wikimedia.org/r/#/c/269370/ on integration-puppetmaster |
[releng] |
11:20 |
<legoktm> |
enabling puppet just on integration-slave-trusty-1012 |
[releng] |
11:13 |
<legoktm> |
disabling puppet on all *(trusty|precise)* slaves |
[releng] |
10:25 |
<hashar> |
pooling in integration-slave-trusty-1018 |
[releng] |
03:19 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/269359 |
[releng] |
02:53 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/238988 |
[releng] |
00:39 |
<hashar> |
gallium: edited /usr/share/python/zuul/local/lib/python2.7/site-packages/zuul/trigger/gerrit.py and changed replication_timeout = 300 -> replication_timeout = 10 |
[releng] |
00:37 |
<hashar> |
live hacking Zuul code to have it stop sleeping() on force merge |
[releng] |
00:36 |
<hashar> |
killing zuul |
[releng] |
2016-02-08
§
|
23:48 |
<legoktm> |
finally deploying https://gerrit.wikimedia.org/r/269327 |
[releng] |
23:14 |
<hashar> |
zuul promote --pipeline gate-and-submit --changes 269065,2 https://gerrit.wikimedia.org/r/#/c/269065/ |
[releng] |
23:10 |
<hashar> |
pooling integration-slave-precise-1001 1002 1004 |
[releng] |
22:47 |
<hashar> |
Err, need to reboot newly provisioned instances before adding them to Jenkins (kernel upgrade, Apache restart, etc.) |
[releng] |
22:45 |
<hashar> |
Pooled https://integration.wikimedia.org/ci/computer/integration-slave-precise-1003/ |
[releng] |
22:25 |
<hashar> |
integration-slave-precise-{1001-1004}: applied role::ci::slave::labs, running puppet on the slaves. I have added the instances as Jenkins slaves and put them offline. Once puppet is done, we can mark them online in Jenkins, then check that the jobs running on them work properly |
[releng] |
22:15 |
<hashar> |
Provisioning integration-slave-precise-{1001-1004} https://phabricator.wikimedia.org/T126274 (need more php53 slots) |
[releng] |
22:13 |
<hashar> |
Deleted cache-rsync instance superseded by castor instance |
[releng] |
22:10 |
<hashar> |
Deleting pmcache.integration.eqiad.wmflabs (was to investigate various kinds of central caches). |
[releng] |
20:14 |
<marxarelli> |
aborting pending mediawiki-extensions-php53 job for CheckUser |
[releng] |
20:08 |
<bd808> |
toggled "Enable Gearman" off and on in Jenkins to wake up deployment-bastion workers |
[releng] |
14:54 |
<hashar> |
nodepool: refreshed snapshot image, Image ci-jessie-wikimedia-1454942958 in wmflabs-eqiad is ready |
[releng] |
14:47 |
<hashar> |
regenerated nodepool reference image (got rid of grunt-cli https://gerrit.wikimedia.org/r/269126 ) |
[releng] |
09:41 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/269093 https://gerrit.wikimedia.org/r/269094 |
[releng] |
09:36 |
<hashar> |
restarting integration puppetmaster (out of memory / cannot fork) |
[releng] |
06:11 |
<bd808> |
tgr set $wgAuthenticationTokenVersion on beta cluster (test run for T124440) |
[releng] |
02:09 |
<legoktm[NE]> |
deploying https://gerrit.wikimedia.org/r/268047 |
[releng] |
00:57 |
<legoktm[NE]> |
deploying https://gerrit.wikimedia.org/r/268031 |
[releng] |
2016-02-04
§
|
22:08 |
<jzerebecki> |
reloading zuul for bed7be1..f57b7e2 |
[releng] |
21:51 |
<hashar> |
salt-key -d integration-slave-jessie-1001.eqiad.wmflabs |
[releng] |
21:50 |
<hashar> |
salt-key -d integration-slave-precise-1011.eqiad.wmflabs |
[releng] |
20:11 |
<hashar> |
ping |
[releng] |
20:08 |
<hashar> |
All wikis to 1.27.0-wmf.12. No troubles so far, congratulations to everyone involved @wikimedia #wikimedia |
[releng] |
18:37 |
<marxarelli> |
Reloading Zuul to deploy Iccf4f48fe5bf964a4c4e6db3f404f152628a4a24 |
[releng] |
10:04 |
<hashar> |
beta: nuking the whole l10n cache ( https://phabricator.wikimedia.org/T123366 ) |
[releng] |
10:03 |
<hashar> |
beta-scap-eqiad fails with AttributeError: 'bool' object has no attribute 'encode' |
[releng] |
10:02 |
<hashar> |
https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/ is broken :( |
[releng] |
00:57 |
<bd808> |
Got deployment-bastion processing Jenkins jobs again via instructions left by my past self at https://phabricator.wikimedia.org/T72597#747925 |
[releng] |
00:43 |
<bd808> |
Jenkins agent on deployment-bastion.eqiad is again doing the trick where it doesn't pick up jobs |
[releng] |