2017-08-14 §
16:40 <elukey> analytics1034 back in service after swapping the eth cable - T172633 [analytics]
2017-08-10 §
20:06 <milimetric> stopped Wikimetrics web and queue on wikimetrics-01.eqiad.wmflabs because the queue ran into errors connecting to the database (max 10 connections limit reached) [analytics]
08:59 <elukey> updated librdkafka1 to 0.9.4.1 on eventlog1001 [analytics]
2017-08-08 §
18:39 <elukey> restart projectview-hourly-wf-2017-8-8-14, pageview-druid-hourly-wf-2017-8-8-14, pageview-hourly-wf-2017-8-8-14 via Hue (analytics1055 disk failure) [analytics]
14:20 <elukey> restart varnishkafka statsv/eventlogging instances to pick up https://gerrit.wikimedia.org/r/#/c/370637/ (kafka protocol explicitly set to 0.9.0.1) [analytics]
2017-08-06 §
11:02 <elukey> stop yarn on analytics1034 to reload the tg3 driver - T172633 [analytics]
2017-08-03 §
16:15 <ottomata> druid cluster restarted with 0.9.2 mysql-metadata-storage extension, un-suspending oozie druid jobs [analytics]
14:11 <ottomata> pausing oozie druid jobs and doing a cluster upgrade/restart again to make sure updated version of mysql-metadata-storage jar is properly loaded [analytics]
09:56 <elukey> set piwik in maintenance mode to allow mysql updates [analytics]
08:08 <elukey> restarted Druid jobs that failed overnight (drud_loader.py error) and because of the Hive metastore restart [analytics]
08:03 <elukey> restart hive-metastore to pick up new JVM Xms settings [analytics]
2017-08-02 §
14:34 <ottomata> beginning druid upgrade to 0.9.2 (take 2 :) ) [analytics]
14:23 <elukey> restart hive-server to pick up JVM Xms4g change [analytics]
14:22 <ottomata> suspending druid oozie jobs [analytics]
2017-08-01 §
18:57 <madhuvishy> Bumped instance quota to 24 instances (nova quota-update analytics --instances 24) [analytics]
17:24 <ottomata> beginning druid upgrade to 0.9.2 http://druid.io/docs/0.9.2/operations/rolling-updates.html [analytics]
17:10 <ottomata> pausing all druid oozie coordinators [analytics]
12:49 <elukey> restart hive daemons on analytics1003 to pick up new jvm settings (bigger Xmx, JMX ports) [analytics]
10:05 <elukey> suspended again webrequest-load-bundle as prep step to restart the hive daemons [analytics]
07:58 <elukey> suspended webrequest-load-bundle as prep step to restart the hive daemons [analytics]
07:03 <elukey> restarted mobile_apps-session_metrics-coord-global-30days failed job via Hue [analytics]
2017-07-31 §
13:45 <elukey> suspended webrequest-load-bundle as prep step to restart hive metastore/server [analytics]
10:34 <elukey> restart hive-server on an1003 - beeline not connecting, thrift errors [analytics]
2017-07-28 §
07:55 <elukey> update nodejs to 6.11 on aqs1004 (testing prod node after beta qa) [analytics]
07:54 <elukey> re-run webrequest-load-wf-upload-2017-7-28-6 from Hue (was playing with eth0 issues on an1034) [analytics]
02:08 <ottomata> stat1002: disabled puppet, umounted /tmp, /home and /a, poweroff [analytics]
2017-07-26 §
21:01 <mforns> Deployed refinery using scap, then deployed onto hdfs [analytics]
18:57 <mforns> Deployed refinery-source using jenkins [analytics]
2017-07-25 §
17:43 <bd808> Forced puppet run on zk1-1.analytics.eqiad.wmflabs after elukey fixed hiera settings [analytics]
17:34 <bd808> Puppet broken on zk1-1.analytics.eqiad.wmflabs with "$clusters[$cluster_name] is :undef, not a hash or array at /etc/puppet/modules/profile/manifests/zookeeper/server.pp:22" [analytics]
15:24 <elukey> restart cassandra loading after maintenance via Hue [analytics]
13:06 <elukey> stop cassandra load bundle, restarting AQS for jvm updates [analytics]
12:13 <elukey> executed sudo apt-get remove openjdk-8-jre openjdk-8-jre-headless on druid nodes [analytics]
2017-07-24 §
14:24 <ottomata> restarted mysql-eventbus eventlogging consumer with new consumer group [analytics]
2017-07-20 §
20:31 <nuria_> restarting eventlogging on eventlog1001 [analytics]
20:30 <nuria_> deploying eventlogging c1c2c39411ccd002ff8cea197bc535155213f5fb and restarting [analytics]
18:18 <ottomata> deleted instance deployment-eventlogging03 in favor of new instance deployment-eventlog02 [analytics]
17:14 <ottomata> killed tranquility instances tranq-banners and tranq-netflow running on druid1003 in joal's screen sessions [analytics]
2017-07-18 §
13:04 <ottomata> adding unique index on meta_id and index on meta_dt to mediawiki_page_{create,delete,move,undelete}_1 on db1046 MySQL eventlogging master [analytics]
2017-07-17 §
16:27 <elukey> set innodb_flush_log_at_trx_commit on bohrium to 2 and sync_binlog=300 to reduce iowait - T164073 [analytics]
14:31 <elukey> set innodb_flush_log_at_trx_commit on bohrium to 1 (default value) - T164073 [analytics]
2017-07-12 §
13:48 <fdans> updated pageview whitelist with din.wikipedia [analytics]
2017-07-11 §
05:24 <elukey> drop _Edit_11448630_old from dbstore1002 [analytics]
2017-07-10 §
16:14 <nuria_> deploying eventlogging 5e16da16e3f5ce287829390a76b9f5b0c7715ee5 [analytics]
2017-07-08 §
07:55 <elukey> re-run wikidata-specialentitydata_metrics-wf-2017-7-7 in Hue (failed Spark job) [analytics]
2017-07-06 §
10:37 <elukey> taking mysqldump for Piwik and storing it on stat1002:/a/backup/bohrium/mysqldump_20170706.sql [analytics]
2017-07-04 §
11:21 <joal> Redeploying refinery with scap [analytics]
11:10 <joal> Restart unique_devices-per_project_family-monthly-coord after correction deployed [analytics]
11:03 <joal> Deploying refinery onto hdfs [analytics]
10:57 <joal> Deploying refinery with scap [analytics]