2020-09-15
12:30 <elukey> stop timers on an-launcher1002 to drain the cluster and restart an-coord1001's daemons (hive/oozie/presto) [analytics]
06:48 <elukey> run systemctl reset-failed monitor_refine_eventlogging_legacy_failure_flags.service on an-launcher1002 [analytics]
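For reference, the reset-failed entry above clears a failed-unit state so monitoring stops alerting on it. The likely invocation on an-launcher1002 (the sudo prefix and the status check are assumptions):
    sudo systemctl reset-failed monitor_refine_eventlogging_legacy_failure_flags.service
    systemctl status monitor_refine_eventlogging_legacy_failure_flags.service    # unit should no longer show "failed"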
2020-09-14
14:36 <milimetric> deployed eventstreams with new KafkaSSE version on staging, eqiad, codfw [analytics]
2020-09-11
15:41 <milimetric> restarted data quality stats bundles [analytics]
01:32 <milimetric> deployed a small fix for the HQL of the editors_bycountry load job [analytics]
00:46 <milimetric> deployed refinery-source 0.0.136 and refinery, and synced to HDFS [analytics]
2020-09-09
10:11 <klausman> Rebooting stat1005 to clear the GPU status and test the new DKMS driver (T260442) [analytics]
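A sketch of the reboot sequence implied by the stat1005 entry; the dkms check and the post-reboot GPU check are assumptions about how the new driver was verified, not steps taken from the log:
    sudo dkms status    # confirm the new GPU driver module built against the running kernel
    sudo reboot
    # once the host is back, check the GPU is healthy again (e.g. rocm-smi, assuming the AMD/ROCm stack on stat1005)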
07:25 <elukey> restart varnishkafka-webrequest on cp5010 and cp5012, delivery report errors happening since yesterday's network outage [analytics]
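The varnishkafka restart above was presumably a plain unit restart on each affected cache host; the .service suffix and the journalctl check are assumptions:
    # on cp5010 and cp5012
    sudo systemctl restart varnishkafka-webrequest.service
    sudo journalctl -u varnishkafka-webrequest.service --since "10 min ago"    # confirm the delivery report errors have stopped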
2020-09-04
18:11 <milimetric> aqs deploy went well! Geoeditors endpoint is live internally, data load job was successful, will submit pull request for public endpoint. [analytics]
06:54 <joal> Manually restart mediawiki-history-drop-snapshot after hive-partitions/hdfs-folders mismatch fix [analytics]
06:08 <elukey> reset-failed mediawiki-history-drop-snapshot on an-launcher1002 to clear icinga errors [analytics]
01:52 <milimetric> aborted aqs deploy due to cassandra error [analytics]
2020-09-03
19:15 <milimetric> finished deploying refinery and refinery-source, restarting jobs now [analytics]
13:59 <milimetric> edit-hourly-druid-wf-2020-08 fails consistently [analytics]
13:56 <joal> Kill-restart mediawiki-history-reduced oozie job into production queue [analytics]
13:56 <joal> rerun edit-hourly-druid-wf-2020-08 after failed attempt [analytics]
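The two 13:56 entries above are routine Oozie operations: kill-and-resubmit a coordinator so it lands in the production queue, and rerun a failed coordinator action. A rough sketch; the job IDs, properties file, and queue property name are placeholders, not from the log:
    # kill the running coordinator, then resubmit it with the production queue
    oozie job -kill <coordinator_id>
    oozie job -run -config coordinator.properties -Dqueue_name=production
    # rerun the failed edit-hourly-druid-wf-2020-08 action on its coordinator
    oozie job -rerun <coordinator_id> -action <action_number>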
2020-09-02
18:24 <milimetric> restarting mediawiki history denormalize coordinator in production queue, due to failed 2020-08 run [analytics]
08:37 <elukey> run kafka preferred-replica-election on jumbo after jumbo1003's reimage to buster [analytics]
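The preferred-replica-election runs (here and again in the 2020-08-31 and 2020-08-28 entries below) rebalance partition leadership back onto the reimaged broker. On the Kafka Jumbo brokers this is normally done with the local kafka wrapper; the rough upstream equivalent is shown second, with the ZooKeeper connect string as a placeholder:
    # via the wrapper installed on the brokers
    kafka preferred-replica-election
    # roughly equivalent upstream invocation
    kafka-preferred-replica-election.sh --zookeeper <zookeeper-connect-string>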
2020-08-31
13:43 <elukey> run kafka preferred-replica-election on Jumbo after jumbo1001's reimage [analytics]
07:13 <elukey> run kafka preferred-replica-election on Jumbo after jumbo1005's reimage [analytics]
2020-08-28
14:25 <mforns> deployed pageview whitelist with new wiki: ja.wikivoyage [analytics]
14:18 <elukey> run kafka preferred-replica-election on jumbo after the reimage of jumbo1006 [analytics]
07:21 <joal> Manually add ja.wikivoyage to pageview allowlist to prevent alerts [analytics]
2020-08-27
19:05 <mforns> finished refinery deploy (ref v0.0.134) [analytics]
18:41 <mforns> starting refinery deploy (ref v0.0.134) [analytics]
18:30 <mforns> deployed refinery-source v0.0.134 [analytics]
13:29 <elukey> restart jvm daemons on analytics1042, aqs1004, kafka-jumbo1001 to pick up new openjdk upgrades (canaries) [analytics]
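The canary restarts above touch one host per service class so long-running JVMs pick up the new openjdk before a wider rollout; the unit names below are assumptions based on the host roles, not from the log:
    sudo systemctl restart hadoop-hdfs-datanode hadoop-yarn-nodemanager   # analytics1042 (Hadoop worker)
    sudo systemctl restart cassandra                                      # aqs1004 (may be a multi-instance unit in practice)
    sudo systemctl restart kafka                                          # kafka-jumbo1001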
2020-08-25
15:47 <elukey> restart mariadb@analytics_meta on db1108 to apply a replication filter (exclude superset_staging database from replication) [analytics]
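The db1108 restart applies a replication filter that must be present in the instance's config before the daemon starts; the exact directive is not in the log, so the ignore-db line below is an assumption:
    # assumed filter in the analytics_meta instance config (puppet-managed):
    #   replicate-ignore-db = superset_staging
    sudo systemctl restart mariadb@analytics_meta
    # then check SHOW SLAVE STATUS on the instance to confirm the ignore list includes superset_staging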
06:35 <elukey> restart mediawiki-history-drop-snapshot on an-launcher1002 to check that it works [analytics]
2020-08-24
06:50 <joal> Dropping wikitext-history snapshots 2020-04 and 2020-05, keeping two (2020-06 and 2020-07), to free space in HDFS [analytics]
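The snapshot drop above frees HDFS space by deleting whole snapshot partitions; the path below is illustrative only (not from the log), and -skipTrash is an assumption so the space is reclaimed immediately:
    # one per snapshot to drop (2020-04, 2020-05); the path is a hypothetical example of the layout
    hdfs dfs -rm -r -skipTrash /wmf/data/wmf/mediawiki/wikitext/history/snapshot=2020-04
    # then drop the matching Hive partitions so table metadata and HDFS folders stay in sync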
2020-08-23
19:34 <nuria> deleted 1.2 TB from hdfs://analytics-hadoop/user/analytics/.Trash/200811000000 [analytics]
19:31 <nuria> deleted 1.2 TB from hdfs://analytics-hadoop/user/nuria/.Trash/* [analytics]
19:26 <nuria> deleted 300G from hdfs://analytics-hadoop/user/analytics/.Trash/200814000000 [analytics]
19:25 <nuria> deleted 1.2 TB from hdfs://analytics-hadoop/user/analytics/.Trash/200808000000 [analytics]
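The 2020-08-23 deletions above are permanent removals of .Trash checkpoints. A sketch of one of them; the -skipTrash flag is an assumption, but without it the data would just move to a new trash checkpoint instead of freeing space:
    hdfs dfs -du -s -h hdfs://analytics-hadoop/user/analytics/.Trash/200811000000    # confirm what is about to go (1.2 TB here)
    hdfs dfs -rm -r -skipTrash hdfs://analytics-hadoop/user/analytics/.Trash/200811000000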
2020-08-20
16:49 <joal> Kill-restart webrequest-load bundle to move it to the production queue [analytics]
2020-08-14
09:13 <fdans> restarting refine to apply T257860 [analytics]
2020-08-13
16:13 <fdans> restarting webrequest bundle [analytics]
14:44 <fdans> deploying refinery [analytics]
14:13 <fdans> updating refinery source symlinks [analytics]
2020-08-11
17:36 <ottomata> refine with refinery-source 0.0.132 and merge_with_hive_schema_before_read=true - T255818 [analytics]
14:52 <ottomata> scap deploy refinery to an-launcher1002 to get camus wrapper script changes [analytics]
2020-08-06
14:47 <fdans> deploying refinery [analytics]
08:07 <elukey> roll restart druid-brokers (on both clusters) to pick up new monitoring changes [analytics]
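The broker roll restart above is done one host at a time so queries keep being served; a minimal sketch using cumin from a cluster management host, where the host aliases, batch/sleep values, and the druid-broker unit name are assumptions:
    sudo cumin -b 1 -s 30 'A:druid-analytics or A:druid-public' 'systemctl restart druid-broker'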
2020-08-05
13:04 <elukey> restart yarn resource managers on an-master100[12] to pick up new Yarn settings - https://gerrit.wikimedia.org/r/c/operations/puppet/+/618529 [analytics]
13:03 <elukey> set yarn_scheduler_minimum_allocation_mb = 1 (was zero) in Hadoop to work around a Flink 1.1 issue (it doesn't work if the value is <= 0) [analytics]
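The two 2020-08-05 13:0x entries above go together: the scheduler minimum allocation is changed in yarn-site.xml (via puppet) and only the ResourceManagers need a restart to pick it up. A sketch, assuming the standard property and unit names:
    # puppet-rendered yarn-site.xml on the masters (value was 0 before):
    #   <property>
    #     <name>yarn.scheduler.minimum-allocation-mb</name>
    #     <value>1</value>
    #   </property>
    sudo systemctl restart hadoop-yarn-resourcemanager   # on an-master1001, then an-master1002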
09:32 <elukey> set ticket max renewable lifetime to 7d on all kerberos clients (was zero, the default) [analytics]
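The kerberos change above is a client-side krb5.conf setting rolled out by puppet; the stanza below shows the assumed form, and the kinit/klist pair is just one way to confirm the new renewable lifetime on a fresh ticket:
    # assumed client-side entry in /etc/krb5.conf under [libdefaults]:
    #   renew_lifetime = 7d
    kinit <user>    # get a fresh ticket
    klist           # the "renew until" timestamp should now be ~7 days out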
2020-08-04
08:30 <elukey> resume druid-related oozie coordinator jobs via Hue (after druid upgrade) [analytics]
08:28 <elukey> started netflow kafka supervisor on Druid Analytics (after upgrade) [analytics]
08:19 <elukey> restore systemd timers for druid jobs on an-launcher1002 (after druid upgrade) [analytics]
07:33 <elukey> stop systemd timers related to druid on an-launcher1002 [analytics]
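The 2020-08-04 stop/restore pair above brackets the druid upgrade; the timers themselves are not named in the log, so the sketch below only shows the pattern with placeholder unit names:
    systemctl list-timers | grep -i druid        # find the druid-related timers on an-launcher1002
    sudo systemctl stop <unit>.timer             # 07:33, before the upgrade, for each timer found
    sudo systemctl start <unit>.timer            # 08:19, restore once the upgrade is done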