2016-11-28 §
07:38 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1092 - T151272 (duration: 00m 47s) [production]
07:18 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Added comments to db1044 status - T150802 (duration: 00m 45s) [production]
07:08 <marostegui> Stop MySQL on db1095 - maintenance T150802 [production]
07:03 <marostegui> Stop MySQL on db1044 - (depooled) maintenance - T150802 [production]
02:05 <Reedy> fixed localisationupdate clone of mw core on tin due to T151676 [production]
02:00 <l10nupdate@tin> LocalisationUpdate failed: git pull of core failed [production]
2016-11-27 §
21:47 <legoktm> created wmf/1.29.0-wmf.3 branch pointing at master for mediawiki/extensions/ElectronPdfService to work around T151725 [production]
09:35 <elukey> removed all the files not used in /tmp on stat1002 after a follow up with the owner [production]
06:20 <ori@tin> Synchronized php-1.29.0-wmf.3/api.php: Bandaid: make API reqs fail fast if User-Agent ~= Parsoid and Host ~= eu.wikipedia.org (duration: 00m 50s) [production]
05:36 <ori> Commented-out live-hack from mw1290; if we see memory growth now, Parsoid would be strongly implicated. [production]
05:33 <ori> With Parsoid requests hacked to fail fast, mw1290 is not showing the kind of aggressive growth in memory usage we're seeing on other API servers [production]
05:30 <godog> roll restarting hhvm across api_cluster when hhvm uses more than 40% of memory [production]
05:21 <ori> Live-hacked api.php on mw1290 to die if request user-agent contains 'Parsoid'; restarted HHVM. [production]
05:17 <godog> roll restarting hhvm across api_cluster when hhvm uses more than 40% of memory [production]
04:57 <godog> roll-restart hhvm on api_appcluster on machines with hhvm leaking memory [production]
03:22 <godog> roll-restart hhvm across api_appserver [production]
02:41 <godog> dumping hhvm backtraces and roll-restart on affected api machines [production]
02:00 <l10nupdate@tin> LocalisationUpdate failed: git pull of core failed [production]
2016-11-26 §
15:35 <elukey> deleted tmp files on stat1002's /tmp partition because of disk space consumption. Will follow up with the owner. [production]
13:36 <Krenair> ran refreshLinks on angwiki for T151584, it ran into issues with the EventBus extension at the links tables step [production]
12:29 <volans> manually fixed the checkout of mediawiki core on stat1002 and stat1003 that was causing Puppet to fail [production]
02:22 <l10nupdate@tin> ResourceLoader cache refresh completed at Sat Nov 26 02:22:26 UTC 2016 (duration 4m 18s) [production]
02:18 <l10nupdate@tin> scap sync-l10n completed (1.29.0-wmf.3) (duration: 06m 28s) [production]
2016-11-25 §
20:09 <Krinkle> mwscript deleteEqualMessages.php --wiki angwiki (T45917) [production]
17:15 <jynus> drop database vewikimedia (deleted wiki) from sanitarium and its slaves [production]
14:22 <Reedy> delete oathauth row on wikitech for user Liuxinyu970226 per T144805 [production]
14:16 <Reedy> delete oathauth row on wikitech for user Shoichi per T144805 [production]
11:05 <ema> uploaded libvmod-{netmapper,tbf,vslp} to carbon main component (T150660) [production]
10:20 <_joe_> upgrading HHVM across codfw [production]
09:23 <_joe_> upgraded hhvm on the debug hosts [production]
08:58 <_joe_> uploading hhvm_3.12.7+dfsg-1+wmf4 to apt [production]
08:53 <volans> restarting zotero on sca1003, almost out of RAM, puppet failing [production]
08:52 <elukey> restarting Yarn and HDFS masters on analytics100[12] (Hadoop cluster) to complete the openjdk update [production]
07:51 <marostegui> Stopping replication db1052 for maintenance - T151607 [production]
02:22 <l10nupdate@tin> ResourceLoader cache refresh completed at Fri Nov 25 02:22:40 UTC 2016 (duration 4m 20s) [production]
02:18 <l10nupdate@tin> scap sync-l10n completed (1.29.0-wmf.3) (duration: 06m 48s) [production]
2016-11-24 §
17:25 <_joe_> turned off additional workers for htmlcacheupdate on commonswiki as the queue has reduced to acceptable sizes (T151196) [production]
15:03 <ema> uploaded varnish 4.1.3-1wm4 to carbon main component, replacing version 3.0.6plus-wm9 (T150660) [production]
14:47 <ema> uploaded varnishkafka 1.0.12-1 to carbon main component, replacing version 1.0.7-1 (T150660) [production]
13:31 <akosiaris> balance the load between thumbor1001 and thumbor1002 evenly [production]
13:31 <akosiaris@puppetmaster1001> conftool action : set/weight=10; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor']) [production]
13:20 <akosiaris@puppetmaster1001> conftool action : set/weight=5; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor']) [production]
13:04 <akosiaris@puppetmaster1001> conftool action : set/weight=20; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor']) [production]
12:54 <gilles> restarting thumbor on thumbor1001 [production]
12:49 <akosiaris> lower thumbor1001 load by 50% to ease debugging [production]
12:48 <gilles> restarting thumbor on thumbor1001 [production]
12:48 <akosiaris@puppetmaster1001> conftool action : set/weight=5; selector: thumbor1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=thumbor', 'service=thumbor']) [production]
12:36 <elukey> launched preferred-replica-election to re-add kafka1022 among the Topic partition leader brokers of the Analytics Kafka cluster (all metrics look good) [production]
11:41 <hoo> Killed the Wikidata JSON dump creation on snapshot1007: Won't succeed before Monday, due to T151356 [production]
10:13 <_joe_> running commonswiki htmlCacheUpdate jobs on terbium to catch up with the backlog, monitoring caches for vhtcpd queue overflows T151196 [production]