2017-07-20
|
13:29 |
<ema> |
cp1050 stuck rebooting, power-cycling |
[production] |
13:28 |
<zfilipin@tin> |
Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:366546|Revert "Revert "Stop RelatedArticles A/B test and clean up config"" (T169948)]] (duration: 00m 47s) |
[production] |
13:12 |
<zfilipin@tin> |
Synchronized wmf-config/throttle.php: SWAT: [[gerrit:366529|Add new throttle rule (T171146)]] (duration: 00m 48s) |
[production] |
12:58 |
<ema@neodymium> |
conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org |
[production] |
12:55 |
<ema@neodymium> |
conftool action : set/pooled=no; selector: name=acamar.wikimedia.org |
[production] |
12:37 |
<ema@neodymium> |
conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org |
[production] |
12:25 |
<ema@neodymium> |
conftool action : set/pooled=no; selector: name=acamar.wikimedia.org |
[production] |
10:20 |
<zeljkof> |
Reloading Zuul to deploy 80b9d855443a2f572d877b280783110684344c5d |
[releng] |
09:17 |
<hashar> |
Spawning and pooling integration-slave-docker-1003 as replacement to integration-slave-docker-1000 (broken) - T150502 |
[releng] |
09:04 |
<ema> |
eqiad cache_text/upload: upgrade to varnish 4.1.7-1wm1 and reboot for kernel updates |
[production] |
09:04 |
<hashar> |
Restored CI cache storage (castor) on a fresh new instance. Cache is empty though so jobs will be a bit slower until the cache is populated - T171148 |
[production] |
09:03 |
<hashar> |
Restoring castor by updating all jobs to point to castor02 ( https://gerrit.wikimedia.org/r/366524 ) Starts with a cold cache :( - T171148 |
[releng] |
09:02 |
<moritzm> |
uploaded apache2 2.4.10-10+deb8u10+wmf1 (rebase of WMF-specific patches on top of latest DSA) to apt.wikimedia.org/jessie |
[production] |
08:53 |
<hashar> |
Created castor02.integration.eqiad.wmflabs with puppet role role::ci::castor::server and adding it to Jenkins. Will then update the Jenkins jobs to point to it - T171148 |
[releng] |
08:34 |
<marostegui> |
Force a BBU relearn on db1016 - T166344 |
[production] |
08:29 |
<ema@neodymium> |
conftool action : set/pooled=yes; selector: name=cp3048.esams.wmnet |
[production] |
08:25 |
<hashar> |
CI is restored albeit in degraded mode (lack of Castor cache) - T171148 |
[production] |
08:01 |
<marostegui> |
Stop replication on labsdb1011 for maintenance - T153743 |
[production] |
08:00 |
<hashar> |
Disabled castor entirely via https://gerrit.wikimedia.org/r/366520 . The instance is broken - T171148 |
[releng] |
07:55 |
<marostegui> |
Start importing s2 into labsdb1011 - T153743 |
[production] |
07:55 |
<hashar> |
Refreshing all Jenkins jobs defined in JJB in order to then disable castor entirely for T171148 |
[releng] |
07:48 |
<godog> |
restart diamond on serpens/seaborgium to pick up the updated CA |
[production] |
07:41 |
<elukey> |
powercycle cp3048 - mgmt reachable - T171145 |
[production] |
07:09 |
<_joe_> |
rebooting castor, jobs are failing, and no one seems able to log in |
[releng] |
07:05 |
<_joe_> |
adding myself to projectadmins for integration, trying to troubleshoot castor |
[releng] |
06:54 |
<marostegui> |
Force a BBU relearn on db1016 - T166344 |
[production] |
06:24 |
<mutante> |
netmon1002 - librenms: fix permissions on /srv/librenms/rrd data after rsyncing, mismatching UIDs vs netmon1001 and rsyncd in chroot-issue |
[production] |
06:00 |
<oblivian@puppetmaster1001> |
conftool action : set/pooled=inactive; selector: name=cp3048.esams.wmnet |
[production] |
05:46 |
<TimStarling> |
on contint1001 restarted zuul and zuul-merger |
[production] |
05:30 |
<TimStarling> |
on contint1001 restarted jenkins |
[production] |
05:05 |
<marostegui> |
Configure replication for s2 on labsdb1009 and labsdb1010 - T153743 |
[production] |
04:42 |
<mutante> |
netmon1002 - restarted Apache for LDAP issue - librenms.wm.org switched back to it, after rsyncing rrd data, re-enabling puppet |
[production] |
04:05 |
<andrewbogott> |
restarting rabbitmq-server on labcontrol1001 |
[production] |
04:00 |
<chasemp> |
tools-webgrid-lighttpd-1402:~# service nslcd restart && service nscd restart |
[tools] |
03:57 |
<chasemp> |
tools-exec-1428:~# service nslcd restart && service nscd restart |
[tools] |
03:57 |
<bd808> |
Restarted cron, nscd, nslcd on tools-cron-01 |
[tools] |
03:45 |
<chasemp> |
tools-puppetmaster-01:~# service nslcd restart && service nscd restart |
[tools] |
03:44 |
<chasemp> |
tools-puppetmaster-01:~# service nslcd restart && service nscd restart |
[tools] |
03:37 |
<bd808> |
Restarted apache on tools-puppetmaster-01 |
[tools] |
03:34 |
<andrewbogott> |
service nova-network restart on labnet1001 |
[production] |
03:32 |
<andrewbogott> |
service uwsgi-labspuppetbackend restart on labcontrol1001 |
[production] |
03:02 |
<l10nupdate@tin> |
LocalisationUpdate failed: git pull of extensions failed |
[production] |
02:22 |
<mutante> |
netmon1001 - rsyncing librenms rrd data to netmon1002 - T159756 |
[production] |
01:38 |
<thcipriani> |
scap on beta was failing because during the ldap downtime puppet created a shadow mwdeploy user, fixed using vipw and vigr |
[releng] |
01:17 |
<andrewbogott> |
restarting keystone on labcontrol1001 |
[production] |
01:14 |
<twentyafterfour> |
phabricator upgrade complete |
[production] |
01:10 |
<twentyafterfour> |
begin (belated) phabricator upgrade, expect momentary downtime. |
[production] |
00:09 |
<dereckson@tin> |
Synchronized php-1.30.0-wmf.9/resources/src/mediawiki.widgets/mw.widgets.SearchInputWidget.js: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (3/3) (duration: 00m 46s) |
[production] |
00:08 |
<dereckson@tin> |
Synchronized php-1.30.0-wmf.9/resources/Resources.php: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (2/3) (duration: 00m 46s) |
[production] |
00:08 |
<dereckson@tin> |
Synchronized php-1.30.0-wmf.9/includes/widget/SearchInputWidget.php: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (1/3) (duration: 00m 46s) |
[production] |