2016-05-09
|
20:57 |
<hashar> |
Unbroke puppet on integration-raita.integration.eqiad.wmflabs. Puppet was blocked because role::ci::raita no longer existed. Fixed by rebasing https://gerrit.wikimedia.org/r/#/c/208024 T115330 |
[releng] |
20:13 |
<hashar> |
beta: salt -v '*' cmd.run 'dpkg --purge libganglia1 ganglia-monitor; rm -fR /etc/ganglia' # T134808 |
[releng] |
20:06 |
<hashar> |
CI, removing ganglia configuration entirely via: salt -v '*' cmd.run 'rm -fRv /etc/ganglia' # T134808 |
[releng] |
20:04 |
<hashar> |
CI, removing ganglia configuration entirely via: salt -v '*' cmd.run 'dpkg --purge ganglia-monitor' # T134808 |
[releng] |
16:32 |
<jzerebecki> |
reloading zuul for 3e2ab56..d663fd0 |
[releng] |
15:39 |
<andrewbogott> |
migrating deployment-flourine to labvirt1009 |
[releng] |
15:39 |
<hashar> |
Adding label contintLabsSlave to integration-slave-jessie1001 and integration-slave-jessie1002 |
[releng] |
15:26 |
<hashar> |
Creating integration-slave-jessie-1001 T95545 |
[releng] |
2016-05-04
|
21:28 |
<cscott> |
deployed puppet FQDN domain patch for OCG: https://gerrit.wikimedia.org/r/286068 and restarted ocg on deployment-pdf0[12] |
[releng] |
15:03 |
<hashar> |
beta-scap: deployment-tin.deployment-prep.eqiad.wmflabs Name or service not known |
[releng] |
15:03 |
<hashar> |
beta-scap: deployment-tin.deployment-prep.eqiad.wmflabs |
[releng] |
12:24 |
<hashar> |
deleting Jenkins job mediawiki-core-phpcs, replaced by Nodepool version mediawiki-core-phpcs-trusty T133976 |
[releng] |
12:11 |
<hashar> |
beta: restarted nginx on varnish caches (systemctl restart nginx.service) since they were not listening on port 443 T134362 |
[releng] |
11:07 |
<hashar> |
restarted CI puppetmaster (out of memory due to a leak) |
[releng] |
10:57 |
<hashar> |
CI: mass upgrading deb packages |
[releng] |
10:53 |
<hashar> |
beta: clearing out leftover apt conf that points to unreachable web proxy: salt -v '*' cmd.run "find /etc/apt -name '*-proxy' -delete" |
[releng] |
10:48 |
<hashar> |
Manually fixing nginx upgrade on deployment-cache-text04 and deployment-cache-upload04; see T134362 for details |
[releng] |
09:27 |
<hashar> |
deployment-cache-text04: systemctl stop varnish-frontend.service to clear out all the stuck CLOSE_WAIT connections T134346 |
[releng] |
08:33 |
<hashar> |
fixed puppet on deployment-cache-text04 (race condition generating puppet.conf) |
[releng] |
2016-05-03
|
23:21 |
<bd808> |
Changed "Maximum Number of Retries" for ssh agent launch in jenkins for deployment-tin from "0" to "10" |
[releng] |
23:01 |
<twentyafterfour> |
rebooting deployment-tin |
[releng] |
23:00 |
<bd808> |
Jenkins agent on deployment-tin not spawning; investigating |
[releng] |
20:02 |
<hashar> |
Restarting Jenkins |
[releng] |
16:49 |
<hashar> |
Notice: /Stage[main]/Contint::Packages::Python/Package[pypy]/ensure: ensure changed 'purged' to 'present' | T134235 |
[releng] |
16:46 |
<hashar> |
Refreshing Nodepool Jessie image to have it include pypy | T134235 poke @jayvdb |
[releng] |
14:49 |
<mobrovac> |
rebooting deployment-tin |
[releng] |
14:25 |
<hashar> |
beta: salt -v '*' pkg.upgrade |
[releng] |
14:19 |
<hashar> |
beta: added unattended upgrade to Hiera::deployment-prep |
[releng] |
13:30 |
<hashar> |
Restarted nslcd on deployment-tin; pam was refusing authentication for some reason |
[releng] |
13:29 |
<hashar> |
beta: got rid of a leftover Wikidata/Wikibase patch that broke scap: salt -v 'deployment-tin*' cmd.run 'sudo -u jenkins-deploy git -C /srv/mediawiki-staging/php-master/extensions/Wikidata/ checkout -- extensions/Wikibase/lib/maintenance/populateSitesTable.php' |
[releng] |
13:23 |
<hashar> |
deployment-tin force upgraded HHVM from 3.6 to 3.12 |
[releng] |
09:42 |
<hashar> |
adding puppet class contint::slave_scripts to deployment-sca01 and deployment-sca02. Ships multigit.sh T134239 |
[releng] |
09:31 |
<hashar> |
Deleting CI slave deployment-cxserver03, added deployment-sca01 and deployment-sca02 in Jenkins. T134239 |
[releng] |
09:28 |
<hashar> |
deployment-sca01 removing puppet lock /var/lib/puppet/state/agent_catalog_run.lock and running puppet again |
[releng] |
09:26 |
<hashar> |
Applying puppet class role::ci::slave::labs::common on deployment-sca01 and deployment-sca02 (cxserver and parsoid being migrated T134239) |
[releng] |
03:33 |
<kart_> |
Deleted deployment-cxserver03, replaced by deployment-sca0x |
[releng] |
2016-05-02
|
21:27 |
<cscott> |
updated OCG to version b775e612520f9cd4acaea42226bcf34df07439f7 |
[releng] |
21:26 |
<hashar> |
Nodepool is acting just fine: Demand from gearman: ci-trusty-wikimedia: 457 | <AllocationRequest for 455.0 of ci-trusty-wikimedia> |
[releng] |
21:25 |
<hashar> |
restarted qa-morebots "2016-05-02 21:22:23,599 ERROR: Died in main event loop" |
[releng] |
21:23 |
<hashar> |
gallium: enqueued 488 jobs directly in Gearman. That is to test https://gerrit.wikimedia.org/r/#/c/286462/ (mediawiki/extensions to hhvm/zend5.5 on Nodepool). Progress: /home/hashar/gerrit-286462.log |
[releng] |
21:20 |
<hashar> |
gallium: enqueued 488 jobs directly in Gearman. That is to test https://gerrit.wikimedia.org/r/#/c/286462/ (mediawiki/extensions to hhvm/zend5.5 on Nodepool). Progress: /home/hashar/gerrit-286462.log |
[releng] |
21:19 |
<cscott> |
updated OCG to version b775e612520f9cd4acaea42226bcf34df07439f7 |
[releng] |
20:14 |
<hashar> |
MediaWiki phpunit jobs to run on Nodepool instances \O/ |
[releng] |
16:41 |
<urandom> |
Forcing puppet run and restarting Cassandra on deployment-restbase0[1-2]: T126629 |
[releng] |
16:40 |
<urandom> |
Cherry-picking https://gerrit.wikimedia.org/r/operations/puppet refs/changes/78/284078/12 to deployment-puppetmaster: T126629 |
[releng] |
16:24 |
<urandom> |
Restarting Cassandra on deployment-restbase0[1-2]: T126629 |
[releng] |