2015-01-13
|
17:43 |
<hashar> |
Restarted deadlocked Zuul, which drops ALL events. The cause is that Gerrit lost the connection to its database, which Zuul does not handle. See https://wikitech.wikimedia.org/wiki/Incident_documentation/20150106-Zuul |
[releng] |
17:32 |
<James_F> |
No effect from restarting Gearman. Getting Timo to restart Zuul. |
[releng] |
17:30 |
<James_F> |
No effect. Restarting Gearman. |
[releng] |
17:27 |
<James_F> |
Trying a shutdown/re-enable of Jenkins. |
[releng] |
13:59 |
<YuviPanda> |
running scap via jenkins, hitting buttons on https://integration.wikimedia.org/ci/job/beta-scap-eqiad/ |
[releng] |
13:58 |
<YuviPanda> |
scap failed |
[releng] |
13:58 |
<YuviPanda> |
running scap, because why not |
[releng] |
13:58 |
<YuviPanda> |
modified PrivateSettings.php to make it use wikiadmin user rather than mw user |
[releng] |
13:51 |
<YuviPanda> |
created user wikiadmin on deployment-db1 |
[releng] |
04:31 |
<James_F> |
Zuul now appears fixed. |
[releng] |
04:29 |
<marktraceur> |
FORCE RESTART ZUUL (James_F told me to) |
[releng] |
04:28 |
<marktraceur> |
Attempting graceful zuul restart |
[releng] |
04:26 |
<marktraceur> |
Reloaded zuul to see if it will help |
[releng] |
04:24 |
<James_F> |
Took the gallium Jenkins slave offline, disconnected and relaunched; no effect. |
[releng] |
04:19 |
<James_F> |
Disabled and re-enabled Gearman, no effect. |
[releng] |
04:15 |
<James_F> |
Flagged and unflagged Jenkins for restart, no effect. |
[releng] |
04:10 |
<James_F> |
Jenkins/zuul/whatever not working, investigating. |
[releng] |
01:12 |
<marxarelli> |
Added twentyafterfour as an admin to the integration project |
[releng] |
01:08 |
<bd808> |
Added Dduvall as an admin in the integration project |
[releng] |
00:55 |
<bd808> |
zuul is plugged up because a gate-and-submit job failed on integration-slave1006 (ZeroBanner clone problem) and then the patch was force merged |
[releng] |
00:48 |
<bd808> |
deleted integration-slave1006:/mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/extensions/ZeroBanner to try and clear the git clone problem there |
[releng] |
00:35 |
<bd808> |
git clone failure in https://integration.wikimedia.org/ci/job/mediawiki-extensions-hhvm/131/console blocking merge of core patch |
[releng] |
2015-01-07
|
16:25 |
<YuviPanda> |
added milimetric to NDA sudo’ers groups |
[releng] |
10:57 |
<hashar> |
Taught Jenkins configuration about Java 8. Name: "Ubuntu - OpenJdk 8", JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64/ (only available on Trusty slaves, though) |
[releng] |
10:56 |
<hashar> |
installed openjdk 8 on CI Trusty labs slaves https://phabricator.wikimedia.org/T85964 |
[releng] |
10:34 |
<hashar> |
varnish text cache is back up. Had to delete /etc/varnish and reinstall varnish from scratch + rerun puppet. |
[releng] |
10:25 |
<hashar> |
deleting /etc/varnish on deployment-cache-text02 and running puppet |
[releng] |
10:24 |
<hashar> |
beta varnish text cache is broken. The vcl refuses to load because of undefined probes |
[releng] |
10:01 |
<hashar> |
restarted deployment-cache-mobile03 and deployment-cache-text02 |
[releng] |
09:50 |
<hashar> |
rebooting deployment-cache-bits01 |
[releng] |
00:41 |
<Krinkle> |
rm -rf slave-scripts and re-cloning from integration/jenkins.git on all slaves (under sudo, just like puppet originally did) - git-status and jshint both work fine now |
[releng] |
00:40 |
<Krinkle> |
Permissions of deployment/integration/slave-scripts on labs slave are all screwed up (git-status says files are dirty, but when run as root git-status is clean and jshint also works fine via sudo) |
[releng] |
00:29 |
<Krinkle> |
Tried reconnecting Gearman, relaunching slave agents. Force-restarting Zuul now. |
[releng] |
00:15 |
<Krinkle> |
Permissions in deployment/integration/slave-scripts on integration-slave1003 are screwed up as well |
[releng] |