Six Stars

Big Data Spark Job - Load data into Hive

I am creating a Big Data Spark job and want to load data into dynamically partitioned Hive tables.

 

Which component can I use to load data into Hive, and what would the workflow be?

Five Stars

Re: Big Data Spark Job - Load data into Hive

tHDFSConnection --> tHiveConnection --> tHiveRow --> tFileInputDelimited --> tHDFSOutput

 

[Screenshot: Hive load data job design]

tHiveRow configuration 

[Screenshot: tHiveRow1 settings]

 

tHiveRow2

[Screenshot: tHiveRow2 settings]

 

tFileInputDelimited

 

[Screenshot: tFileInputDelimited settings]

tHDFSOutput

 

[Screenshot: tHDFSOutput settings]

1. tHDFSConnection: set up the connection to your Hadoop cluster.

2. tHiveConnection: set up the connection to Hive.

3. tHiveRow: drop the table if it already exists.

4. tHiveRow: create the external table (see the HiveQL sketch after this list).

5. tFileInputDelimited: read the source file to be loaded.

6. tHDFSOutput: write the data to the external table's HDFS location.
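
For steps 3 and 4, each tHiveRow component runs one Hive QL statement. A minimal sketch of what those statements could look like; the table name, columns, field separator, and HDFS path below are placeholders, not values from the screenshots:

    -- tHiveRow1: drop the table if it already exists
    DROP TABLE IF EXISTS sales_staging;

    -- tHiveRow2: recreate it as an external table; the LOCATION path must
    -- match the target directory used by tHDFSOutput in step 6
    CREATE EXTERNAL TABLE sales_staging (
      id     INT,
      name   STRING,
      amount DOUBLE
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ';'
    LOCATION '/user/talend/sales_staging';

Because the table is external, dropping and recreating it does not delete the files that tHDFSOutput writes to that path; Hive simply reads whatever data is sitting there.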

Six Stars

Re: Big Data Spark Job - Load data into Hive

Which version are you using? With 6.2.1, none of these tHive components are available in a Spark job; only tHiveConfiguration, tHiveInput, and tHiveOutput are there.

 

Data loads fine using the Spark job, but I am facing a problem when the Hive table is dynamically partitioned.
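
For comparison, in plain Hive QL a dynamic partition insert needs two session settings before the INSERT statement; a rough sketch with made-up table and partition column names:

    -- allow dynamic partitions, and allow every partition column to be dynamic
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;

    -- the partition column (country) comes last in the SELECT and is not
    -- hard-coded, so Hive creates the partitions on the fly
    INSERT INTO TABLE sales PARTITION (country)
    SELECT id, name, amount, country
    FROM sales_staging;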

 

Thanks. 

Five Stars

Re: Big Data Spark Job - Load data into Hive

I am using Talend Open Studio for Big Data, version TOS_BD-20150508_1414-V5.6.2. Try downloading those components from https://exchange.talend.com/.