2017-07-20
13:42 <ema> cp1050 stuck at 'Initializing firmware interfaces...', trying to powerdown/powerup [production]
13:37 <zeljkof> EU SWAT finished [production]
13:29 <zfilipin@tin> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:366546|Revert "Revert "Stop RelatedArticles A/B test and clean up config"" (T169948)]] (duration: 00m 46s) [production]
13:29 <cmjohnson1> downtimed restbase-dev100[1-3] to power off and move ssds to newly racked restbase-dev100[4-6] phab task: T166181 [production]
13:29 <ema> cp1050 stuck rebooting, power-cycling [production]
13:28 <zfilipin@tin> Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit:366546|Revert "Revert "Stop RelatedArticles A/B test and clean up config"" (T169948)]] (duration: 00m 47s) [production]
13:12 <zfilipin@tin> Synchronized wmf-config/throttle.php: SWAT: [[gerrit:366529|Add new throttle rule (T171146)]] (duration: 00m 48s) [production]
12:58 <ema@neodymium> conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org [production]
12:55 <ema@neodymium> conftool action : set/pooled=no; selector: name=acamar.wikimedia.org [production]
12:37 <ema@neodymium> conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org [production]
12:25 <ema@neodymium> conftool action : set/pooled=no; selector: name=acamar.wikimedia.org [production]
10:20 <zeljkof> Reloading Zuul to deploy 80b9d855443a2f572d877b280783110684344c5d [releng]
09:17 <hashar> Spawning and pooling integration-slave-docker-1003 as replacement to integration-slave-docker-1000 (broken) - T150502 [releng]
09:04 <ema> eqiad cache_text/upload: upgrade to varnish 4.1.7-1wm1 and reboot for kernel updates [production]
09:04 <hashar> Restored CI cache storage (castor) on a fresh new instance. Cache is empty though so jobs will be a bit slower until the cache is populated - T171148 [production]
09:03 <hashar> Restoring castor by updating all jobs to point to castor02 ( https://gerrit.wikimedia.org/r/366524 ) Starts with a cold cache :( - T171148 [releng]
09:02 <moritzm> uploaded apache2 2.4.10-10+deb8u10+wmf1 (rebase of WMF-specific patches on top of latest DSA) to apt.wikimedia.org/jessie [production]
08:53 <hashar> Created castor02.integration.eqiad.wmflabs with puppet role role::ci::castor::server and adding it to Jenkins. Will then update the Jenkins jobs to point to it - T171148 [releng]
08:34 <marostegui> Force a BBU relearn on db1016 - T166344 [production]
08:29 <ema@neodymium> conftool action : set/pooled=yes; selector: name=cp3048.esams.wmnet [production]
08:25 <hashar> CI is restored albeit in degraded mode (lack of Castor cache) - T171148 [production]
08:01 <marostegui> Stop replication on labsdb1011 for maintenance - T153743 [production]
08:00 <hashar> Disabled castor entirely via https://gerrit.wikimedia.org/r/366520 . The instance is broken - T171148 [releng]
07:55 <marostegui> Start importing s2 into labsdb1011 - T153743 [production]
07:55 <hashar> Refreshing all Jenkins jobs defined in JJB in order to then disable castor entirely for T171148 [releng]
07:48 <godog> restart diamond on serpens/seaborgium to pick up the updated CA [production]
07:41 <elukey> powercycle cp3048 - mgmt reachable - T171145 [production]
07:09 <_joe_> rebooting castor, jobs are failing, and no one seems able to login [releng]
07:05 <_joe_> adding myself to projectadmins for integration, trying to troubleshoot castor [releng]
06:54 <marostegui> Force a BBU relearn on db1016 - T166344 [production]
06:24 <mutante> netmon1002 - librenms: fix permissions on /srv/librenms/rrd data after rsyncing, mismatching UIDs vs netmon1001 and rsyncd in chroot-issue [production]
06:00 <oblivian@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=cp3048.esams.wmnet [production]
05:46 <TimStarling> on contint1001 restarted zuul and zuul-merger [production]
05:30 <TimStarling> on contint1001 restarted jenkins [production]
05:05 <marostegui> Configure replication for s2 on labsdb1009 and labsdb1010 - T153743 [production]
04:42 <mutante> netmon1002 - restarted Apache for LDAP issue - librenms.wm.org switched back to it, after rsyncing rrd data, re-enabling puppet [production]
04:05 <andrewbogott> restarting rabbitmq-server on labcontrol1001 [production]
04:00 <chasemp> tools-webgrid-lighttpd-1402:~# service nslcd restart && service nscd restart [tools]
03:57 <chasemp> tools-exec-1428:~# service nslcd restart && service nscd restart [tools]
03:57 <bd808> Restarted cron, nscd, nslcd on tools-cron-01 [tools]
03:45 <chasemp> tools-puppetmaster-01:~# service nslcd restart && service nscd restart [tools]
03:44 <chasemp> tools-puppetmaster-01:~# service nslcd restart && service nscd restart [tools]
03:37 <bd808> Restarted apache on tools-puppetmaster-01 [tools]
03:34 <andrewbogott> service nova-network restart on labnet1001 [production]
03:32 <andrewbogott> service uwsgi-labspuppetbackend restart on labcontrol1001 [production]
03:02 <l10nupdate@tin> LocalisationUpdate failed: git pull of extensions failed [production]
02:22 <mutante> netmon1001 - rsyncing librenms rrd data to netmon1002 - T159756 [production]
01:38 <thcipriani> scap on beta was failing because during the ldap downtime puppet created a shadow mwdeploy user, fixed using vipw and vigr [releng]
01:17 <andrewbogott> restarting keystone on labcontrol1001 [production]
01:14 <twentyafterfour> phabricator upgrade complete [production]