analytics SAL

2017-08-14 §
16:40	<elukey>	analytics1034 back in service after swapping the eth cable - T172633	[analytics]
2017-08-10 §
20:06	<milimetric>	stopped Wikimetrics web and queue on wikimetrics-01.eqiad.wmflabs because the queue ran into errors connecting to the database (max 10 connections limit reached)	[analytics]
08:59	<elukey>	updated librdkafka1 to 0.9.4.1 on eventlog1001	[analytics]
2017-08-08 §
18:39	<elukey>	restart projectview-hourly-wf-2017-8-8-14, pageview-druid-hourly-wf-2017-8-8-14, pageview-hourly-wf-2017-8-8-14 via Hue (analytics1055 disk failure)	[analytics]
14:20	<elukey>	restart varnishkafka statsv/eventlogging instances to pick up https://gerrit.wikimedia.org/r/#/c/370637/ (kafka protocol explicitly set to 0.9.0.1)	[analytics]
2017-08-06 §
11:02	<elukey>	stop yarn on analytics1034 to reload the tg3 driver - T172633	[analytics]
2017-08-03 §
16:15	<ottomata>	druid cluster restarted with 0.9.2 mysql-metadata-storage extension, un-suspending oozie druid jobs	[analytics]
14:11	<ottomata>	pausing oozie druid jobs and doing a cluster upgrade/restart again to make sure updated version of mysql-metadata-storage jar is properly loaded	[analytics]
09:56	<elukey>	set piwik in maintenance mode to allow mysql updates	[analytics]
08:08	<elukey>	restarted Druid jobs that failed overnight due to a druid_loader.py error and the Hive metastore restart	[analytics]
08:03	<elukey>	restart hive-metastore to pick up new JVM Xms settings	[analytics]
2017-08-02 §
14:34	<ottomata>	beginning druid upgrade to 0.9.2 (take 2 :) )	[analytics]
14:23	<elukey>	restart hive-server to pick up JVM Xms4g change	[analytics]
14:22	<ottomata>	suspending druid oozie jobs	[analytics]
2017-08-01 §
18:57	<madhuvishy>	Bumped instance quota to 24 instances (nova quota-update analytics --instances 24)	[analytics]
17:24	<ottomata>	beginning druid upgrade to 0.9.2 http://druid.io/docs/0.9.2/operations/rolling-updates.html	[analytics]
17:10	<ottomata>	pausing all druid oozie coordinators	[analytics]
12:49	<elukey>	restart hive daemons on analytics1003 to pick up new jvm settings (bigger Xmx, JMX ports)	[analytics]
10:05	<elukey>	suspended again webrequest-load-bundle as prep step to restart the hive daemons	[analytics]
07:58	<elukey>	suspended webrequest-load-bundle as prep step to restart the hive daemons	[analytics]
07:03	<elukey>	restarted failed job mobile_apps-session_metrics-coord-global-30days via Hue	[analytics]
2017-07-31 §
13:45	<elukey>	suspended webrequest-load-bundle as prep step to restart hive metastore/server	[analytics]
10:34	<elukey>	restart hive-server on an1003 - beeline not connecting, thrift errors	[analytics]
2017-07-28 §
07:55	<elukey>	update nodejs to 6.11 on aqs1004 (testing prod node after beta qa)	[analytics]
07:54	<elukey>	re-run webrequest-load-wf-upload-2017-7-28-6 from Hue (was playing with eth0 issues on an1034)	[analytics]
02:08	<ottomata>	stat1002: disabled puppet, umounted /tmp, /home and /a, poweroff	[analytics]
2017-07-26 §
21:01	<mforns>	Deployed refinery using scap, then deployed onto hdfs	[analytics]
18:57	<mforns>	Deployed refinery-source using jenkins	[analytics]
2017-07-25 §
17:43	<bd808>	Forced puppet run on zk1-1.analytics.eqiad.wmflabs after elukey fixed hiera settings	[analytics]
17:34	<bd808>	Puppet broken on zk1-1.analytics.eqiad.wmflabs with "$clusters[$cluster_name] is :undef, not a hash or array at /etc/puppet/modules/profile/manifests/zookeeper/server.pp:22"	[analytics]
15:24	<elukey>	restart cassandra loading after maintenance via hue	[analytics]
13:06	<elukey>	stop cassandra load bundle, restarting AQS for jvm updates	[analytics]
12:13	<elukey>	executed sudo apt-get remove openjdk-8-jre openjdk-8-jre-headless on druid nodes	[analytics]
2017-07-24 §
14:24	<ottomata>	restarted mysql-eventbus eventlogging consumer with new consumer group	[analytics]
2017-07-20 §
20:31	<nuria_>	restarting eventlogging on eventlog1001	[analytics]
20:30	<nuria_>	deploying eventlogging c1c2c39411ccd002ff8cea197bc535155213f5fb and restarting	[analytics]
18:18	<ottomata>	deleted instance deployment-eventlogging03 in favor of new instance deployment-eventlog02	[analytics]
17:14	<ottomata>	killed tranquility instances tranq-banners and tranq-netflow running on druid1003 in joal's screen sessions	[analytics]
2017-07-18 §
13:04	<ottomata>	adding unique index on meta_id and index on meta_dt to mediawiki_page_{create,delete,move,undelete}_1 on db1046 MySQL eventlogging master	[analytics]
2017-07-17 §
16:27	<elukey>	set innodb_flush_log_at_trx_commit on bohrium to 2 and sync_binlog=300 to reduce iowait - T164073	[analytics]
14:31	<elukey>	set innodb_flush_log_at_trx_commit on bohrium to 1 (default value) - T164073	[analytics]
2017-07-12 §
13:48	<fdans>	updated pageview whitelist with din.wikipedia	[analytics]
2017-07-11 §
05:24	<elukey>	drop _Edit_11448630_old from dbstore1002	[analytics]
2017-07-10 §
16:14	<nuria_>	deploying eventlogging 5e16da16e3f5ce287829390a76b9f5b0c7715ee5	[analytics]
2017-07-08 §
07:55	<elukey>	re-run wikidata-specialentitydata_metrics-wf-2017-7-7 in Hue (failed Spark job)	[analytics]
2017-07-06 §
10:37	<elukey>	taking mysqldump for Piwik and storing it on stat1002:/a/backup/bohrium/mysqldump_20170706.sql	[analytics]
2017-07-04 §
11:21	<joal>	Redeploying refinery with scap	[analytics]
11:10	<joal>	Restart unique_devices-per_project_family-monthly-coord after correction deployed	[analytics]
11:03	<joal>	Deploying refinery onto hdfs	[analytics]
10:57	<joal>	Deploying refinery with scap	[analytics]