2018-11-18 §
08:37 <elukey> restart yarn on analytics1039 - not clear why the process failed (nothing in the logs, no other disks failed) [analytics]
2018-11-15 §
14:51 <fdans> testing load of new uniques fields in test keyspace in cassandra [analytics]
14:07 <elukey> re-run mediacounts-load-wf-2018-11-15-8 - died due to issues on an1039 (happened this morning, broken disk) [analytics]
2018-11-12 §
19:30 <ottomata> running oozie-setup sharelib create and then spark2_oozie_sharelib_install [analytics]
15:40 <fdans> Restarting per project family unique generation jobs (daily and monthly) [analytics]
13:18 <joal> Suspend discovery coordinator 0060527-180705103628398-oozie-oozi-C so that it does not block the upgrade [analytics]
2018-11-05 §
10:20 <joal> Create hive tables wmf.webrequest_subset and wmf.webrequest_subset_tags [analytics]
10:02 <joal> Start mediawiki-history-wikitext job [analytics]
09:58 <joal> create wmf.mediawiki_wikitext_history table [analytics]
09:46 <joal> Alter wmf.pageview_whitelist renaming insertion_ts field to insertion_dt for convention [analytics]
09:43 <joal> restart mediawiki-load oozie bundle to pick new deploy [analytics]
09:39 <joal> Restart mediawiki-history-load oozie job to pick new deploy [analytics]
09:37 <joal> Create table wmf_raw.mediawiki_change_tag [analytics]
09:24 <joal> deploying refinery onto HDFS [analytics]
09:04 <joal> Deploy refinery from scap [analytics]
08:55 <joal> Refinery-source released on archiva [analytics]
2018-10-30 §
16:55 <mforns> Finished AQS deployment using scap [analytics]
16:45 <mforns> Starting AQS deployment using scap [analytics]
15:34 <ottomata> kafka topics --alter --topic eventlogging_VirtualPageView --partitions 12 [analytics]
2018-10-29 §
22:55 <ottomata> groceryheist killed a long running hive query that is now allowing backlogged production yarn jobs to finally execute [analytics]
16:37 <ottomata> reassigning eventlogging_ReadingDepth partition 0 from 1002,1004,1006 to 1003,1001,1005 to move preferred leadership from 1002 to 1003 [analytics]
14:27 <ottomata> ran kafka-preferred-replica-election on kafka jumbo-eqiad cluster (this successfully rebalanced webrequest_text partition leadership) T207768 [analytics]
10:23 <joal> Kill yarn application application_1540747790951_1429 to prevent more cluster errors (eating too many resources) [analytics]
08:56 <elukey> bounce yarn resource managers to pick up new zookeeper session timeout settings [analytics]
2018-10-28 §
17:30 <elukey> restart yarn resource manager on an-master1002 to force failover to an-master1001 [analytics]
2018-10-26 §
18:33 <andrewbogott> region migration finished [analytics]
13:36 <andrewbogott> migrating project to eqiad1 [analytics]
11:49 <joal> Rerun failed oozie jobs (pageview and projectview) [analytics]
06:18 <elukey> add AAAA DNS records for aqs and matomo1001 [analytics]
05:55 <elukey> reportupdater hadoop migrated to stat1007 [analytics]
2018-10-25 §
21:06 <ottomata> bouncing eventlogging-processor client side* to pick up mysql whitelist change for ContentTranslationAbuseFilter (https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/469419/) [analytics]
18:14 <joal> Manually resume the bunch of suspended jobs (mostly from ebernhardson and chelsyx - our apologies for not noticing earlier) [analytics]
18:13 <joal> Manually copy /etc/hive/conf/hive-site.xml to hdfs:///user/hive and set permissions to 644 to allow all users to run oozie jobs [analytics]
15:36 <elukey> shutdown aqs1006 to replace one broken disk [analytics]
14:28 <elukey> upgrade druid on druid100[4-6] to Druid 0.12.3 [analytics]
14:24 <elukey> added AAAA DNS records to all the druid nodes [analytics]
10:36 <joal> Resuming oozie webrequest and pageview druid hourly indexation jobs [analytics]
10:35 <elukey> upgraded Druid on druid100[1-3] to 0.12.3-1 [analytics]
09:16 <elukey> upgrade turnilo to 1.8.1 [analytics]
08:56 <elukey> restart hive-server on an-coord1001 to pick up new prometheus settings [analytics]
08:10 <joal> Suspend webrequest-druid-hourly and pageview-druid-hourly oozie jobs [analytics]
07:52 <joal> Manually add za.wikimedia to pageview-whitelist (patch merged: https://gerrit.wikimedia.org/r/469557) [analytics]
2018-10-23 §
16:25 <ottomata> altering topic eventlogging_ReadingDepth to increase partitions from 1 to 12 [analytics]
06:42 <elukey> restart yarn and hdfs daemon on analytics1068 to pick up correct config (the host was down since before we swapped the Hadoop masters due to hw failures) [analytics]
2018-10-22 §
17:24 <elukey> upgraded camus jar version in an-coord1001's crontab (via puppet) [analytics]
17:21 <elukey> deploy refinery to hdfs (via stat1005) [analytics]
17:12 <elukey> deploy refinery (new version of camus) [analytics]
15:09 <mforns> Finished deployment of refinery using scap and refinery-deploy-to-hdfs [analytics]
14:51 <mforns> Starting deployment of refinery using scap and refinery-deploy-to-hdfs [analytics]
14:50 <mforns> Finished deployment of refinery-source using jenkins [analytics]