Spark Streaming to hive

Four Stars

Hi All,

I am new to Talend. I am trying to build a job that connects to a MapR stream, consumes the data, and writes it to a Hive table.

I am using tMapRStreamInput to consume the data, and that part works correctly. But inserting the data into Hive with the tHiveOutput component is not working, and I am not even getting an error. Can anybody help?

Thanks,

Ranjit.

Six Stars

Re: Spark Streaming to hive

Are you posting to a partitioned Hive table (in Append mode)? Can you please attach screenshots of the job and the tHiveOutput configuration?

Four Stars

Re: Spark Streaming to hive

Hi,

No, I have not selected the partitioned option, and the save mode is Append. I am using the 6.4 Real-Time Big Data Platform.

The job is a Big Data Streaming job with two components: tMapRStreamInput as the input and tHiveOutput as the output.

I can see the messages being consumed by tMapRStreamInput, but I can't insert them into the Hive tables.

Note: I can't attach a screenshot because of my organisation's security policy.

Thanks,

Ranjit.

Six Stars

Re: Spark Streaming to hive

Were the Hive tables created up front, before appending the data? I had this issue when the Hive tables weren't created up front and I tried to append data to them.

Four Stars

Re: Spark Streaming to hive

Yes, I created the Hive tables beforehand. The problem is that it doesn't give any error either.

Six Stars

Re: Spark Streaming to hive

Are you able to print the data to the console? If so, I don't see why it wouldn't get inserted into a Hive table.

From your explanation, the only thing that can go wrong is a schema mismatch: the schema of the Hive table created up front should match the parsing schema.
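One way to rule out a mismatch is to write both schemas down and compare them column by column. The following is only a minimal sketch in plain Python, not anything Talend generates: the function name `schema_diff` and the sample columns are hypothetical, and it assumes you have copied the job's parsing schema and the Hive table definition by hand into lists of (name, type) pairs.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, type name), both entered by hand


def schema_diff(job_schema: List[Column], table_schema: List[Column]) -> List[str]:
    """Return readable differences, comparing columns by position."""
    problems = []
    if len(job_schema) != len(table_schema):
        problems.append(
            f"column count differs: job has {len(job_schema)}, "
            f"table has {len(table_schema)}"
        )
    # zip stops at the shorter schema; the count check above reports the rest.
    for pos, (job_col, table_col) in enumerate(zip(job_schema, table_schema)):
        job_name, job_type = job_col
        table_name, table_type = table_col
        # Names and types are compared case-insensitively.
        if job_name.lower() != table_name.lower():
            problems.append(f"position {pos}: name {job_name!r} vs {table_name!r}")
        if job_type.lower() != table_type.lower():
            problems.append(
                f"position {pos} ({job_name}): type {job_type!r} vs {table_type!r}"
            )
    return problems


# Hypothetical example: the last column has a different type in the table.
job_schema = [("id", "int"), ("payload", "string"), ("event_ts", "timestamp")]
table_schema = [("id", "int"), ("payload", "string"), ("event_ts", "string")]
issues = schema_diff(job_schema, table_schema)
for line in issues:
    print(line)
```

An empty result means the two lists agree on column count, order, names, and types; anything else points at the column to look at first.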

Four Stars

Re: Spark Streaming to hive

Yes, I am able to print the data to the console. The only thing I am wondering is: if there is a schema mismatch error, shouldn't it show in the console?

Six Stars

Re: Spark Streaming to hive

With a schema mismatch, the data will still print to the console but won't be posted to the Hive table. Make sure the schemas are defined; I would also prefer using lower-case letters. Please post whether that solved the issue after trying it.
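The lower-case suggestion can be checked before touching the job. This is a rough sketch in plain Python with a hypothetical helper name, `to_hive_column_names`, and made-up column names; it assumes the column names have been copied out of the job schema by hand. Because Hive treats column names case-insensitively, it also flags names that would collide once lower-cased.

```python
from collections import Counter
from typing import List


def to_hive_column_names(names: List[str]) -> List[str]:
    """Lower-case column names and reject names that collide afterwards."""
    lowered = [name.strip().lower() for name in names]
    counts = Counter(lowered)
    clashes = sorted(name for name, n in counts.items() if n > 1)
    if clashes:
        raise ValueError(f"names collide after lower-casing: {clashes}")
    return lowered


# Hypothetical column names as they might appear in a job schema.
print(to_hive_column_names(["Id", "Payload", "eventTs"]))
```

The returned names are the ones to use in both the job schema and the table definition, so the two sides cannot differ by case alone.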

Four Stars

Re: Spark Streaming to hive

Do you mean the tHiveOutput component will itself print the data if it is not able to insert it? If so, it's not printing for me either. I was using a different component to print the data.

Six Stars

Re: Spark Streaming to hive

Printing the data to the console is done through tLogRow. Sorry for not being clear enough.
