Questions tagged [hortonworks-data-platform]

1 vote · 2 answers · 138 views

How to wait for GenerateTableFetch queries to finish

My use case is like this. I have some X tables to be pulled from MySQL. I am splitting them using SplitText to put each table in an individual flow file, then pulling with GenerateTableFetch and ExecuteSQL. I want to be notified, or to trigger some other action, when the import is done for all the tables. At Spl...
pratpor
1 vote · 2 answers · 1.1k views

Hive View Not Opening

In the Ambari UI of the Hortonworks sandbox, I was trying to open Hive View through the maria_dev account. However, I was getting the following error: Service Hive check failed: Cannot open a hive connection with connect string jdbc:hive2://sandbox-hdp.hortonworks.com:2181/;serviceDiscovery...
Witty Counsel
1 vote · 1 answer · 1.2k views

Cannot connect to ZooKeeper/Hive from host to Hortonworks HDP Sandbox VM

I downloaded the HDP Sandbox (in an Oracle VirtualBox VM) a while ago, never used it much, and I'm now trying to access its data from the outside world using Hive JDBC. I use hive-jdbc 1.2.2 from Apache, which I got from mvnrepository, with all the dependencies on the classpath, or the Hortonworks JDBC got fr...
Sxilderik
1 vote · 1 answer · 815 views

Get HDP version through Spark

We installed a new Spark version, so all folder names look like: ls /etc/hadoop/ → 2.6.4.0-91 conf conf.backup. And spark-submit --version prints the Spark welcome banner with: version 2.2.0.2.6.4...
enodmilvado
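The directory name under /etc/hadoop (2.6.4.0-91 in the excerpt) matches the suffix of the Spark version string. A minimal sketch of pulling the HDP build id out of `sc.version`, assuming the convention that the first three dotted components are the upstream Spark version and the remainder is the HDP stack/build identifier:

```python
def spark_version_parts(spark_version):
    """Split a Hortonworks Spark version string like '2.2.0.2.6.4.0-91'
    into (upstream Spark version, HDP build id).

    Assumption: the first three dotted components are upstream Spark;
    everything after them is the HDP build, which matches the folder
    name under /etc/hadoop.
    """
    parts = spark_version.split(".")
    return ".".join(parts[:3]), ".".join(parts[3:])

# In a pyspark shell one would pass sc.version here instead of a literal.
upstream, hdp_build = spark_version_parts("2.2.0.2.6.4.0-91")
```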
1 vote · 1 answer · 772 views

Hive is not working on HDP with Ambari

I have installed HDP with Ambari 2.6.1. It mostly did everything automatically, but Hive is unable to start properly. I saw a post somewhere, so I deleted the pid and killed the process as well, hoping that I could restart it and it would work, but now it's showing heartbeat lost on the machine. P...
Nikita Patil
1 vote · 1 answer · 90 views

How to create a script that stops the Hive services on my cluster?

I need to create a shell script that stops the Hive Metastore and HiveServer2 services from any node of the cluster, and I need to know where the Hive services are hosted in my cluster in order to run this command: ssh nodename:ps aux | awk '{print $1,$2}' | grep hive | awk '{print $2}' | xargs kill >/dev/nul...
HISI
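One issue in the excerpt's command is the `ssh nodename:cmd` form, which is not valid ssh syntax: the remote pipeline must be a quoted argument. A minimal Python sketch that builds a safer per-node kill command (the node names are hypothetical; with Ambari, the hosts running HiveServer2 and the Metastore could be looked up via its REST API):

```python
def build_stop_hive_cmd(node):
    """Build an ssh invocation that kills Hive processes on a remote node.

    The remote pipeline is passed as a single argument to ssh.
    The '[h]ive' pattern keeps grep from matching its own process,
    and 'xargs -r' avoids running kill with no arguments.
    """
    remote = "ps aux | grep '[h]ive' | awk '{print $2}' | xargs -r kill"
    return ["ssh", node, remote]

# Hypothetical host list; in practice these would come from the cluster
# inventory (e.g. Ambari's HIVE service components).
hive_nodes = ["node1.example.com", "node2.example.com"]
cmds = [build_stop_hive_cmd(n) for n in hive_nodes]
```

Each entry in `cmds` could then be handed to `subprocess.run`.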
1 vote · 0 answers · 307 views

install hadoop_2_6_1_0_129-hdfs

Tried to install a Hadoop cluster. OS: Red Hat Enterprise Linux Server release 7.4 (Maipo), Ambari version 2.5.1.0, HDP 2.6. The App Timeline Server install returned an error: 2018-02-26 19:31:49,406 - Installing package hadoop_2_6_1_0_129-hdfs ('/usr/bin/yum -d 0 -e 0 -y install hadoop_2_6_1_0_129-hdfs') 2018-02...
Nikolay Baranenko
1 vote · 0 answers · 47 views

Phoenix on HBase wrong results when region server is down

I have a 4-node cluster. When one of the nodes is down, I see that HBase runs some compactions. During this time, if I execute Apache Phoenix queries, I get wrong results. Once the compactions are completed, the results are correct. My replication factor is 3. I am using HDP 2.6...
dsr301
1 vote · 0 answers · 257 views

How to concatenate varchar column values in a GROUP BY?

I want to do a SELECT where, for one column, I concatenate the values. For example, if I have the rows (ID, NAME, FRIEND): (1, Joe, Fred), (2, Jeff, Fred), (3, Joe, Jack), (4, Joe, Sally), and I group by name, I would get: Joe → Fred,Jack,Sally; Jeff → Fred. I...
Don Rhummy
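In Hive this aggregation is commonly written with `collect_list` plus `concat_ws` (available in the Hive versions shipped with HDP); a pure-Python sketch of the same grouping, using the rows from the excerpt:

```python
rows = [
    (1, "Joe", "Fred"),
    (2, "Jeff", "Fred"),
    (3, "Joe", "Jack"),
    (4, "Joe", "Sally"),
]

# Rough HiveQL equivalent (table/column names from the excerpt):
#   SELECT name, concat_ws(',', collect_list(friend))
#   FROM t GROUP BY name;
groups = {}
for _id, name, friend in rows:
    groups.setdefault(name, []).append(friend)  # preserve row order per name

result = {name: ",".join(friends) for name, friends in groups.items()}
```

Note that, like `collect_list`, this keeps duplicates; a set-based variant would mirror `collect_set` instead.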
1 vote · 1 answer · 121 views

Dynamically Import XML files to HIVE

How do I create a Hive table from an XML file, with only a few specific fields? For example, I have an XML file with 1000 fields but I need only 100 of them in my Hive table. And apart from that, how do I store the 100 fields in different databases and different tables?
Harish
1 vote · 0 answers · 468 views

Can't connect to Hive via JDBC using Zookeeper connection string

val driverClassName = "org.apache.hive.jdbc.HiveDriver"; Class.forName(driverClassName); // The below connection string works // val jdbcUrl = String.format("jdbc:hive2://sandbox-hdp.hortonworks.com:10000/%s", database); // The next one doesn't... val jdbcUrl = String.format("jdbc:hive2://sandbox-hd...
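A ZooKeeper-based HiveServer2 URL differs from a direct one: the host list points at the ZooKeeper quorum and must carry `serviceDiscoveryMode=zooKeeper` plus the znode namespace, or the driver treats the quorum address as a HiveServer2 endpoint and fails. A small sketch assembling such a URL (the hostname and the default `hiveserver2` namespace are assumptions; the actual namespace is a HiveServer2 config value):

```python
def zk_hive_jdbc_url(zk_quorum, namespace="hiveserver2", database="default"):
    """Assemble a HiveServer2 JDBC URL resolved via ZooKeeper discovery.

    zk_quorum: list of 'host:port' ZooKeeper endpoints (2181 by default
    in HDP). The namespace must match hive.server2.zookeeper.namespace.
    """
    return (
        "jdbc:hive2://" + ",".join(zk_quorum) + "/" + database
        + ";serviceDiscoveryMode=zooKeeper"
        + ";zooKeeperNamespace=" + namespace
    )

url = zk_hive_jdbc_url(["sandbox-hdp.hortonworks.com:2181"])
```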
1 vote · 1 answer · 1.6k views

How to install libraries to python in zeppelin-spark2 in HDP

I am using HDP version 2.6.4. Can you provide step-by-step instructions on how to install libraries into the following Python directory used by spark2? sc.version (the Spark version) returns res0: String = 2.2.0.2.6.4.0-91. The spark2 interpreter property name and value are as follows: zeppelin.pyspark.python...
1 vote · 0 answers · 262 views

Azure. Connection to 40.117.211.214 closed by remote host

I am currently trying to access the VM on Azure for my Big Data course. I have already set up Hortonworks on Azure. When I try to access the VM from my Ubuntu command line with my password, I get the following error(s): [email protected]:~$ ssh [email protected] [email protected]
Alexander Brown
1 vote · 0 answers · 49 views

Cloudera to HDP SOLR(version 5.5.2) Data Migration | Failed to Update solr indexes after restoration on solr cloud

Solr version: 5.5.2. My project requirement is to transfer SolrCloud indexes from a Cloudera cluster to an HDP cluster. The data is huge (1 billion indexed records in production), hence re-indexing is not an option. We have tried the Solr backup and restore APIs but the data is not visible on the cloud. Please check i...
Prachi Singh
1 vote · 0 answers · 212 views

Connectivity issue with Kerberized HBase via Java application running outside HDP Cluster

We have a Java application running on the IBM WebSphere Liberty server, trying to connect to HBase on the HDP cluster to persist some data. We are now facing issues connecting to the (Kerberized) HBase on the HDP cluster. We have been able to connect to HBase via Spark, Storm, or an application running...
Puneet Babbar
1 vote · 1 answer · 256 views

Spark unable to read Kafka topic and gives error “unable to connect to zookeeper server within timeout 6000”

I'm trying to execute the example program '/spark2/examples/src/main/python/streaming/kafka_wordcount.py' from the Spark directory on the HDP cluster, which tries to read a Kafka topic but gives a ZooKeeper server timeout error. Spark is installed on the HDP cluster and Kafka is running on an HDF cluster; both ar...
Mahesh
1 vote · 1 answer · 464 views

JDBC hive connection error in beeline through knox

I'm a newbie to HDP and Knox. My HDP environment: HDP version 2.6, HS2 enabled, Hive transport mode HTTP, Knox installed via Ambari, SSL not enabled, non-Kerberized instance. Issue: I'm trying to connect to Hive via beeline. The connection string is '!connect jdbc:hive2://:8443/;tran...
1 vote · 0 answers · 114 views

HDP's HDFS replication process really slow

I am currently working with both CDH and HDP. My CDH system's replication process works very well, but HDP's doesn't. For example: when I set the replication factor for a large (20 TB) directory in HDFS to 2, HDFS needs to delete 2 million blocks. When I set the replication factor for the above directory again...
Ha Pham
1 vote · 1 answer · 159 views

Nifi putHiveStreaming Failed to connect to metastore uri

I'm facing issues with the PutHiveStreaming processor, as it is not connecting to the Hive metastore. I am using kylo-cloudera-sandbox-0.9.1; please help me with this, as I'm not able to figure out the issue.
Karthik Mannava
1 vote · 1 answer · 204 views

Unable to register AWS host to Ambari server

While registering a host to the Ambari server's cluster, I am getting the following error: 'Host checks were skipped on 1 hosts that failed to register.' I'm trying to install HDP 2.5 on an AWS instance. I have tried to follow the Hortonworks documentation. https://docs.hortonworks.c...
Deya
1 vote · 1 answer · 92 views

ACLs not supported on at least one file system: Distcp HDFS

As per the distcp documentation -> If -pa is specified, DistCp preserves the permissions also, because ACLs are a super-set of permissions. But hadoop distcp -pa -delete -update /src/path /dest/path/ is failing with 'ACLs not supported on at least one file system'. Complete logs below. The above command e...
satish sidnakoppa
1 vote · 3 answers · 94 views

Decreasing max.spout.pending value leads to failed messages in Kafka Spout in Storm UI?

We are trying to benchmark the performance of our Storm topology. We are ingesting around 1000 messages/second into a Kafka topic. When we put max.spout.pending=2000 in our KafkaSpout we don't see any failed messages in the Storm UI, but when we decrease the max.spout.pending value to 500 or 100, then...
Satyajit Das
1 vote · 0 answers · 40 views

hive3 - hiveserver2 process crashes within 2 minutes

Running the hiveserver2 process (Hive v3.0.0) on EC2 (not EMR), the process starts up and for the first minute everything is OK (I can make a beeline connection, create/repair/select external Hive tables), but then the hiveserver2 process crashes. If I restart the process, even doing nothing, the hiveserve...
tooptoop4
1 vote · 0 answers · 56 views

Using unicode in CSVRecordSetWriter escape character

Is it possible to use a unicode character in the CSVRecordSetWriter controller service's Escape Character? I used '\u0003' as delimiter and it didn't throw any error, but on using '\u0004' as Escape Character, it throws an error. Update: along the same lines, the CSVReader controller service doesn't have a Record Separ...
pratpor
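Both escapes in the excerpt denote single control characters; whether a given NiFi property accepts them depends on that property's validator, not on the characters themselves. A quick sketch confirming what '\u0003' and '\u0004' actually are:

```python
# '\u0003' and '\u0004' are ordinary one-character strings: the ETX and
# EOT control characters. Any rejection by CSVRecordSetWriter would come
# from the property's validation rules, not from the escape syntax.
delimiter = "\u0003"   # END OF TEXT (ETX), used as the delimiter above
escape = "\u0004"      # END OF TRANSMISSION (EOT), the rejected escape

codepoints = (ord(delimiter), ord(escape))
```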
1 vote · 0 answers · 17 views

How to get and build only storm tar from hdp?

I want to get only the Storm tar from HDP rather than the entire bundle, which includes the Hadoop tar and many other tars. Is there any way to get only the Storm tar? The only way I have found so far is to clone the repository https://github.com/hortonworks/storm-release/tree/HDP-2.6.5.3002-10-tag and...
naksh9619
1 vote · 1 answer · 241 views

How to create parquet table in Hive 3.1 through Spark 2.3 (pyspark)

Facing issues while creating/loading a parquet table from Spark. Environment details: Hortonworks HDP 3.0, Spark 2.3.1, Hive 3.1. 1. When trying to create a parquet table in Hive 3.1 through Spark 2.3, Spark throws the error below. df.write.format('parquet').mode('overwrite').saveAsTable('database_name.test1')...
XEngineer
1 vote · 0 answers · 108 views

Hortonworks HDP 3 : Error starting ResourceManager

I have installed a new HDP 3 cluster using Ambari 2.7. The problem is that the ResourceManager service is not starting. I get the following error: Traceback (most recent call last): File '/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/resourcemanager.py', line 275, in Res...
Prashant Gupta
1 vote · 0 answers · 526 views

Tez VS Spark - huge performance diffs

I'm using HDP 2.6.4 and am seeing huge differences between Spark SQL and Hive on Tez. Here's a simple query on a table of ~95M rows: SELECT DT, SUM(1) FROM mydata GROUP BY DT. DT is the partition column, a string that marks the date. In spark-shell, with 15 executors, 10G memory for the driver and 15G per executor, the qu...
hummingBird
1 vote · 1 answer · 30 views

Hortonworks HDP How to set up a Kerberos enabled Kafka

I have recently downloaded the Hortonworks HDP VM. I am able to run Kafka on it. I can produce/consume messages through security-protocol=PLAINTEXT. However, I now want to consume through security-protocol=SASL_PLAINTEXT and Kerberos. I know that I can set up SASL_PLAINTEXT through Ambari (screenshot a...
Fawad Shah
1 vote · 0 answers · 71 views

How to convert a Hive (ORC format) table to a TFRecord file and store in HDFS?

I’m working on deploying a TensorFlow model to production and would like to update my data pipeline to consume TFRecord file(s) with tf.data. How can I convert a Hive table stored in ORC format (Hortonworks distribution) into a TFRecord file format that is stored in HDFS?
user10859416
1 vote · 0 answers · 21 views

How to reduce turn around time of Livy

In my application I submit spark job through livy and get back the result by uploading the jar file every time to the cluster, but the problem is it takes 20 seconds to give the results back. Is there any way I could reduce the time taken by livy server to give back the job results ? Kindly shed so...
Vignesh
1 vote · 0 answers · 18 views

How to connect PutHiveQL in Apache NiFi

I am trying to connect to Hive on HDInsight. These are the steps that I have followed: 1. created a text file and inserted a CREATE TABLE statement; 2. converted it to a flow file using the GetFile processor. This flow file will be input to the PutHiveQL processor. When I try to execute this dataflow it throws err...
siddharth kotkar
1 vote · 1 answer · 1.3k views

Connecting to HDP 2.0 (Hortonworks Hadoop) with yarn client

I downloaded and launched HDP 2.0 in VirtualBox and then tried to connect from Java using YarnClient: YarnClient client = YarnClient.createYarnClient(); client.init(new Configuration()); client.start(); client.createApplication(); But I came across the following error: 1311 [IPC Client (1943692956) c...
Genadii Ganebnyi
1 vote · 2 answers · 965 views

Hortonworks HDP2.0 + giraph

I have Hortonworks HDP 2.0 running in a sandbox (recently installed) on a Windows 8.1 platform. I need to learn how to get Giraph working with HDP 2.0. I think Giraph is not currently installed with HDP 2.0 by default. Can someone help me install Giraph, as well as point me to some sources on hands-on...
Varun Gupta
1 vote · 1 answer · 297 views

Is there any way to open a file from the virtual machine's HDFS in the Windows environment?

Maybe my question is a bit stupid, but I want to access an HDFS file from the host Windows environment, specifically in Eclipse. Hadoop and all related stuff is installed in VirtualBox (using the Hortonworks Sandbox environment with CentOS). On the virtual machine I can work with HDFS without issues; I tried to access hdf...
elkoo
1 vote · 1 answer · 5.5k views

How to create an ORC file in Hive CDH?

I can easily create an ORC file format in Apache Hadoop or Hortonworks' HDP: CREATE TABLE ... STORED AS ORC However this doesn't work in Cloudera's CDH 4.5. (Surprise!) I get: FAILED: SemanticException Unrecognized file format in STORED AS clause: ORC So as an alternative, I tried to download and in...
matthieu lieber
1 vote · 2 answers · 4.2k views

Select statement error with Application exitCode 1

I am working on Hortonworks Hive. I have seen the same type of errors, but the underlying MapReduce error seems to be different here: an application error with exitCode 1. In Hive, the statement Select * from SomeTable; is working fine, but Select colName from SomeTable; is not working. Appli...
sio2deep
1 vote · 1 answer · 2.4k views

Hortonworks Data node install: Exception in secureMain

I am trying to install a Hortonworks Hadoop single-node cluster. I am able to start the namenode and secondary namenode, but the datanode failed with the following error. How do I solve this issue? 2014-04-04 18:22:49,975 FATAL datanode.DataNode (DataNode.java:secureMain(1841)) - Exception in secureMain java.la...
user3350280
1 vote · 1 answer · 362 views

hCatalog page gives error

I am using the Hortonworks sandbox to try out a few samples. The following page is displaying 'Error' in the UI (time out): http://:8000/hcatalog/ Detailed server logs: [25/Apr/2014 13:07:49 +0000] middleware INFO Processing exception: timed out (code THRIFTSOCKET): None: Traceback (most recent call last): File...
Madhup Srivastava
1 vote · 2 answers · 2.8k views

Sqoop job through oozie

I have created a sqoop job called TeamMemsImportJob which basically pulls data from SQL Server into Hive. I can execute the sqoop job through the unix command line by running the following command: sqoop job --exec TeamMemsImportJob. If I create an oozie job with the actual sqoop import command in i...
Colman
