Questions tagged [hiveql]

1

votes
1

answer
57

Views

Why is a boolean field not working in Hive?

I have a column in my Hive table whose datatype is boolean. When I tried to import data from a CSV file, it was stored as NULL. This is my sample table: CREATE TABLE if not exists Engineanalysis( EngineModel String, EnginePartNo String , Location String, Position String, InspectionReq boolean) ROW FORMAT DELI...
1

votes
2

answer
85

Views

How to identify repeated occurrences of a string column in Hive?

I have a view like this in Hive: id sequencenumber appname 242539622 1 A 242539622 2 A 242539622 3 A 242539622 4 B 242539622 5 B 242539622 6 C 242539622...
Isaac
0

votes
0

answer
3

Views

Hive to escape Null or blank strings with concat_ws

Is there a way to escape the null separator while using concat_ws? I have data that is populating like ,20000 and I want to remove the comma when only a single value is populated. Eg: ID value 1 AAA 1 BBBB 2 2 CCCC 3 AAA 4 CCCD 4 DEDED 4 Current Result: After using concat_ws with , as separator and c...
Seshi Kumar
1

votes
0

answer
6

Views

Oozie solution to execute a query and get results back from sql & Hive

I am trying to solve the problem below using Oozie. Any suggestions for a solution are much appreciated. Background: I developed code to import data from a SQL database using (oozie - Sqoop import), did some transformation, and loaded the data into Hive. Now I need to do a count check bet...
vinu.m.19
1

votes
1

answer
5k

Views

How to compute the intersections and unions of two arrays in Hive?

For example, the intersection select intersect(array('A','B'), array('B','C')) should return ['B'] and the union select union(array('A','B'), array('B','C')) should return ['A','B','C'] What's the best way to make this in Hive? I have checked the hive documentation, but cannot find any relevant info...
Osiris
1

votes
1

answer
22

Views

Merging records by 1+ common elements

I have a hive table with the following schema: int key1 # this is unique array key2_list Now I want to merge records here if their key2_lists have any common element(s). For example, if record A has (10, [1,3,5]) and B has (12, [1,2,4]), I want to merge it as ([10,12], [1,2,3,4,5]) or ([1,2,3,4,5])....
kee
1

votes
0

answer
466

Views

How to fix HIVE_CURSOR_ERROR on several columns in Athena

I am trying to execute the following select statement in AWS Athena: SELECT col_1, col_2 FROM 'my_database'.'my_table' WHERE partition_1='20171130' AND partition_2='Y' LIMIT 10 And I got an error: Your query has the following error(s): HIVE_CURSOR_ERROR: Can not read value at 0 in block 0 in file s3...
Cherry
1

votes
1

answer
26

Views

How to compare two hive table records

I have two tables, Core and BKP. Table BKP contains data with duplicates, but Core contains the same data without duplicates. The primary key for both tables is a combination of 4 fields (1,2,3,4). But after running some script, some records went missing from the Core table. How can I find the missing records from Core t...
Lekshmi
1

votes
1

answer
30

Views

Initcap of word

I have a table x that contains the column resource_name, with data like NASRI(SRI). Applying initcap to this column gives the output Nasri(sri), but my expected output is Nasri(Sri). How can I achieve the desired result? Thank you
Naseer Sd
1

votes
0

answer
69

Views

Pivot a table in HiveSQL - not a key/value pair situation

I need to pivot a long table using Hive SQL. The table looks like this: and I want it to look like this: where N is some user-defined cutoff. I found examples of how to do this when the original table contains columns of id, keys, and values, but nothing where it's just id's and values and there's...
user3490622
1

votes
1

answer
70

Views

How to aggregate event for denormalization?

A user clickstream is represented by events with type and event_timestamp properties. For example: userid type event_timestamp (yyyy-MM-ddThh:mm:ss.SSS) 01 install 2018-01-01T00:00:00.000 01 level_up 2018-01-15T00:00:00.000 01 new_item 2018-02-03T00:00:00.000 All inp...
Cherry
1

votes
2

answer
2k

Views

Hive - SELECT inside WHEN clause of CASE function gives an error

I am trying to write a query in Hive with a Case statement in which the condition depends on one of the values in the current row (whether or not it is equal to its predecessor). I want to evaluate it on the fly, this way, therefore requiring a nested query, not by making it another column first and...
Blew my stack
1

votes
0

answer
322

Views

Variable substitution in hive

Can you please help me with the 'define' namespace in Hive (2.2.0)? Below is what I am doing: $ hive -d foo=eg_test; hive>set foo; foo=eg_test hive>select * from ${foo}; OK Time taken: 5.13 seconds hive> set hivevar:foo; hivevar:foo=eg_test It seems that 'define' namespace by default initializes the var...
Debasish Dutta
1

votes
0

answer
273

Views

Does hive.groupby.skewindata depend on hive.optimize.skewjoin?

According to hive template : hive.optimize.skewjoin : Whether to enable skew join optimization. The algorithm is as follows: At runtime, detect the keys with a large skew. Instead of processing those keys, store them temporarily in an HDFS directory. In a follow-up map-reduce job, process those skew...
Ashish Doneriya
1

votes
0

answer
170

Views

Spark 2.2.1 HQL to ALTER Hive Table with partitions fails with InvalidOperationException

I have an application where I'm sending HQL using the SparkSession.sql() method. First I create a table with partitions: CREATE TABLE table_name (Id BigInt) PARTITIONED BY(Age BigInt) After this I have the following ALTER TABLE statement: ALTER TABLE table_name ADD COLUMNS(Name String) The ALTE...
kaysush
1

votes
1

answer
206

Views

Query metadata from HIVE using MySQL as metastore

I am looking for a way to query the metadata of my HIVE data with a HiveQL command. I configured a MySQL metastore, but I need to query the metadata via a HIVE command because I then want to access the data over an ODBC connection to the HIVE system. Thanks in advance.
E. Lueken
1

votes
1

answer
147

Views

Getting Error when Trying to Drop Database

I am stuck with an issue. I created an external Hive table with a wrong HDFS path and then populated the data in HDFS. Now I am trying to drop the table and am getting the error below: 18/02/15 08:35:02 [HiveServer2-Background-Pool: Thread-54]: ]: ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.Hive...
Anupam Alok
1

votes
2

answer
30

Views

How to return the groups from a double group by that have all categories on them in HiveQL

I have this code in Hiveql and I want to return only the groups that have both Female and Male select first_name, gender, count(*) from attributes group by first_name, gender for example name gender count MICHAEL FEMALE 10000 MALE 11200 and not: name gender count BILLY MALE...
Billy Bonaros
1

votes
0

answer
314

Views

Loading json data into hive tables using spark sql

I am trying to load a dataframe into json data. Here is my sample data: import org.apache.spark.sql._ import org.apache.spark.sql.types._ import org.apache.spark.sql.functions.lit val df = Seq((2012, 8, 'Batman', 9.8), (2012, 8, 'Hero', 8.7), (2012, 7, 'Robot', 5.5), (2011, 7, 'Git', 2.0)).toDF('year'...
Srinivas
1

votes
1

answer
96

Views

How to write the output of each Hive query to a separate file?

I have a .sql file containing hundreds of Hive queries, and I want their output in multiple files: for the 1st query an abc.txt file gets created, for the 2nd query an xyz.txt file gets created, and so on, so that 100 queries produce 100 output files with their respective results.
user9185088
1

votes
1

answer
100

Views

hive scan and select in one query

I have a hive table, say emp_details(name, dept). In this table, I need to check whether any records exist with dept = ‘a’ and, if so, select those records. Only if no such record is found will I choose records with dept = ‘b’. The source data has either 'a' or 'b' as dept value and my result set...
Sandeep
1

votes
2

answer
1.2k

Views

Hive column casting from decimal to double is resulting in NULL

I have a Hive table table1 whose schema looks like this: [CREATE TABLE table1(p_decimal1 DECIMAL(38,5)) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE] and I have the value below in the table: row : col(p_decimal1) row1 : 12345123451234512345123.45123 At a later stage, if I e...
user2862709
1

votes
1

answer
147

Views

Removing string characters from multiple columns

I have a table like this: YSQ YSQR ys Y 12 12 55 11 abc 22 qrs 2 # def @ aaa I need to remove all String characters and special characters from all columns in a single hive query which would look like this: YSQ YSQR ys Y 12 12...
Naveed Navaz
1

votes
1

answer
1.4k

Views

Hive - Select count(*) not working with Tez but works with MR

I have a Hive external table with parquet data. When I run select count(*) from table1, it fails with Tez. But when execution engine is changed to MR it works. Any idea why it's failing with Tez? I'm getting the following error with Tez: Error: org.apache.hive.service.cli.HiveSQLException: Error w...
kunrazor
1

votes
1

answer
537

Views

Hive - insert into table partition throwing error

I am trying to create a partitioned table in Hive on spark and load it with data available in other table in Hive. I am getting following error while loading the data: Error: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.Table.ValidationFailureSemanticException: Partiti...
kapil
1

votes
0

answer
46

Views

Safely set a STRING variable in hive?

Problem I can set variables in hive at run time: $ cat my_query.sql select '${hiveconf:my_string_variable}' $ hive -f my_query.sql -hiveconf my_string_variable=foobar (returns) foobar But if I run the query and forget to set the variable (no -hiveconf argument), hive treats the call signature as a s...
combinatorist
1

votes
1

answer
33

Views

SQL window excluding current group?

I'm trying to provide rolled up summaries of the following data including only the group in question as well as excluding the group. I think this can be done with a window function, but I'm having problems with getting the syntax down (in my case Hive SQL). I want the following data to be aggregate...
Semaj
1

votes
1

answer
34

Views

how to remove matched data using hive query? [closed]

I have 2 tables, table 'S' and table 'A'. I need to remove the data that is present in table A from table S using a Hive query. Is there any method for this?
ambiga
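
One common pattern for this kind of request is an anti-join: keep only the rows of S that have no match in A. The sketch below assumes both tables share a key column; the column name `id` is hypothetical, since the question gives no schema.

```sql
-- Assumption: S and A both have a key column named id (hypothetical).
-- Return the rows of S that have no matching row in A.
SELECT src.*
FROM S src
LEFT JOIN A rm
  ON src.id = rm.id
WHERE rm.id IS NULL;   -- no match found in A
```

If the filtered rows need to be persisted, the result can be written to a new table with CREATE TABLE ... AS SELECT rather than modifying S in place.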
1

votes
0

answer
343

Views

Cross Product In HIVE

While running a Hive query on MapReduce, my job gets stuck at a particular stage. I don't have any idea why it's running so slowly. I can't post the whole query, but will post just part of it. I already have tables called TICKET_V and TICKET_R. Now my query is the following... INSERT OVERWRITE TABLE TICKET_V SEL...
sshah
1

votes
4

answer
70

Views

A HiveQL query with an output for each hour of the day

I want to write a HiveQL query that returns the number of equipments each time an event /live//activate occurs, for each hour of the day. Here is what my table looks like: The issue is that I have to change and rewrite my query 24 times, once for each one-hour interval. For examp...
Iriel
1

votes
1

answer
254

Views

Finding the First & Last of Array struct

I have an array of structs in a file like below: [{'A':'1','B':'2','C':'3'},{'A':'4','B':'5','C':'6'},{'A':'7','B':'8','C':'9'}] How can I get the first & last value of column 'A' ('1','7')? I need to write this in Hive SQL. Thanks in advance.
sidhartha swain
1

votes
0

answer
84

Views

Hive Error- while copying data from one DB table to another DB table

I want to copy data from one DB table to another DB table using hive on EMR. Below is the HQL using which I'm copying data along with the date partition. insert into Target.exttbl_user_identification_details PARTITION(load_date='2018-04-23') select * from Source.exttbl_user_identification_details;...
Ganesh
1

votes
1

answer
373

Views

How to check whether a partition exists with hive

I have a HiveQL script that performs some operations on a Hive table. Before doing these operations, I want to check whether the required partition exists and, if not, terminate the script. How can I achieve this?
Ssong
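
As a minimal sketch of the existence check itself, `SHOW PARTITIONS` accepts a partition spec and returns no rows when that partition is absent. The table name `my_table`, the partition column `dt`, and the date value are all hypothetical; the question does not name them.

```sql
-- Assumption: my_table is partitioned by a column named dt (hypothetical names).
-- Lists the partition if it exists; returns an empty result otherwise.
SHOW PARTITIONS my_table PARTITION (dt = '2018-04-23');
```

Terminating the script on an empty result is not something this statement does on its own; that decision would have to be made by whatever wrapper launches the script.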
1

votes
0

answer
48

Views

IN and NOT IN HiveQL

I am new to HiveQL. Are IN and NOT IN supported in it, especially when using Qubole? Here is my query: SELECT DISTINCT vId FROM table1 WHERE d.columnOne = '123' AND NOT d.columnTwo AND timestamp between 1523550000000 AND 1523930000000 AND NOT h.columnThree regexp '000.000.000.00|111.111.111.11|2...
noobeerp
1

votes
2

answer
123

Views

How to filter table based on percentile and then random sample in HQL?

I'm trying to random sample 200 rows from a table, but first I want to filter it to pick only top 1 percent values from a variable. I'm getting the following error - Error while compiling statement: FAILED: ParseException line 3:31 cannot recognize input near 'select' 'percentile_approx' '(' in expr...
Parth Shiras
1

votes
1

answer
36

Views

Academic Puzzle : Derive Proportions Without Self Join

We have data arriving in the following structure entity_id entity_value category_id category_weight group_id group_weight 1 100 11 6 101 4 1 100 11 6 102 3 1 100...
MatBailie
1

votes
1

answer
38

Views

Ranking for duplicate data

I have the data set below. id val q1 abc q2 abc q4 qwe q3 xyz I want to assign a number like below using SQL in Hive. id val ranking q1 abc 1 q2 abc 1 q4 qwe 2 q3 xyz 3 Conventional functions like row_number and rank are not giving me the desired result.
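
The numbering described here, where ties share a number and no numbers are skipped, matches what the `DENSE_RANK()` window function produces. A minimal sketch, assuming the data sits in a table called `my_table` (a hypothetical name) with the `id` and `val` columns from the question:

```sql
-- Assumption: the table name my_table is hypothetical.
-- DENSE_RANK gives equal values the same number and leaves no gaps,
-- so abc, abc, qwe, xyz are numbered 1, 1, 2, 3.
SELECT id,
       val,
       DENSE_RANK() OVER (ORDER BY val) AS ranking
FROM my_table;
```

By contrast, `RANK()` would number the same values 1, 1, 3, 4 and `ROW_NUMBER()` would give 1, 2, 3, 4, which is likely why neither produced the desired output.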
1

votes
3

answer
76

Views

creating a date from an existing one in HiveQL

I'm completely new to Hive and I would really appreciate some help. I have a date column in my table and I would like to keep the month and year of this date. What I would do in Excel is the following: datenew= date(year(old_date),month(old_date),1) My old_date is in YYYY-MM-DD format. Thanks!!
Chrissie M.
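
One way to get the first day of the month from a yyyy-MM-dd value is Hive's built-in `trunc` function with the `'MM'` format. The sketch below assumes a table named `my_table` (hypothetical) holding the `old_date` column from the question.

```sql
-- Assumption: the table name my_table is hypothetical.
-- trunc(date, 'MM') returns the first day of that date's month,
-- e.g. 2015-03-17 becomes 2015-03-01.
SELECT old_date,
       trunc(old_date, 'MM') AS datenew
FROM my_table;
```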
1

votes
1

answer
363

Views

Hive: Conditionally truncate and load the table

I am trying to resolve an issue where, if all categories of the source table are available in the target, the target table should be truncated and loaded; otherwise nothing should be done. I haven't found any solution using just Hive and ended up using a shell script as well to resolve this issue. Is it possible to avoid shell sc...
Gaurang Shah
1

votes
1

answer
577

Views

Hive - How to insert in a hive table an array of struct

So I learnt from here how to insert values into an array column: INSERT INTO table SELECT ARRAY('line1', 'line2', 'line3') as myArray FROM source1; And from here how to insert values into a struct column: INSERT INTO table SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','Lon...
Ignacio Alorre
