Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,

@rdubois
I am using version 5.2.

But your statement below, "In 5.2, Talend only supports HortonWorks Data Platform with Oozie. In 5.3, we support much more distributions," does not match my experience: I have also used Oozie with the Apache distribution of Hadoop.
I am also evaluating 5.3. Can you show me how to do it with 5.2 as well?

@esabot
This thread has been open for a long time. If it is not clear what problem I am facing, please let me know and I will send the query again.
Regards,
Shouvanik
Employee

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

@rdubois
I am using version 5.2.

I mean the Hadoop distribution. Do you use HortonWorks, Cloudera, or MapR? And which version of that distribution?
I am also evaluating 5.3. Can you show me how to do it with 5.2 as well?

It's not possible with 5.2.2. You have to upgrade to 5.3.
Cheers,
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Thanks for your reply.
I am now using version 5.3, and I keep getting this annoying error:
Deploying job to Hadoop...
Deployment failed!
The local file can not upload to Hadoop HDFS!
java.lang.reflect.InvocationTargetException

Can you please show me the way?
Regards,
Shouvanik
Community Manager

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi Shouvanik
I mean the Hadoop distribution. Do you use HortonWorks, Cloudera, or MapR? And which version of that distribution?

Have you answered Remy regarding your Hadoop distribution and its version?
Cheers,
Elisa
Employee

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
I agree the Oozie logs are not very explicit. You will find additional logs by browsing the Oozie console and then the JobTracker logs.
Oozie console: http://hostname:11000/oozie
-> Then click on the job that failed.
-> Then browse the logs of this job.
You will find the details there.
The error you get does give a clue, though: the local file can't be sent to HDFS. I suspect a permission issue.
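
If you prefer to fetch the same job log without clicking through the console, here is a minimal sketch. It assumes the Oozie server exposes its v1 Web Services API under the same base URL as the console; the helper name `oozie_job_log_url` and the job ID in the example are hypothetical, for illustration only.

```python
from urllib.parse import quote


def oozie_job_log_url(oozie_base_url: str, job_id: str) -> str:
    """Build the URL that asks Oozie's v1 REST API for one job's log.

    oozie_base_url is the same address used for the console,
    e.g. http://hostname:11000/oozie (host name is a placeholder).
    """
    base = oozie_base_url.rstrip("/")
    # Percent-encode the job ID so it is safe to place in the URL path.
    return f"{base}/v1/job/{quote(job_id, safe='')}?show=log"


if __name__ == "__main__":
    # Hypothetical workflow job ID, only to show the shape of the URL.
    print(oozie_job_log_url("http://hostname:11000/oozie",
                            "0000001-130101000000000-oozie-oozi-W"))
```

Opening the printed URL in a browser (or with any HTTP client) should return the log text for that job, provided the job ID exists on your server.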
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
I am using Hadoop CDH4 and Talend Open Studio for Big Data version 5.2.1.
Regards,
Shouvanik
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
The earlier problem was an access issue: the expected user was "hdfs". The job is now getting deployed. Hurray!
Deploying job to Hadoop...
Deployment complete!
Error submitting workflow job to Oozie.

The new error is below.
Please check if the "Job tracker end point" and "Oozie end point" are valid!
E0901. E0901: Namenode not allowed, not in Oozies whitelist
How do I resolve it?
Please help.
Regards,
Shouvanik
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

I have checked oozie-site.xml for the value. It shows:

<property>
    <name>oozie.service.HadoopAccessorService.nameNode.whitelist</name>
    <value></value>
    <description>Whitelisted job tracker for Oozie service.</description>
</property>

There is no value inside. What should I do?
Please help.
Regards,
Shouvanik
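
To make the E0901 message easier to reason about, here is a rough sketch of how a NameNode whitelist check of this kind could behave. It is not Oozie's actual implementation. It assumes the whitelist value is a comma-separated list of host:port entries, that an empty value means no restriction, and that a NameNode address without a scheme such as hdfs:// has no usable authority. The function names and host names are hypothetical.

```python
from urllib.parse import urlparse


def parse_whitelist(value: str) -> set:
    """Split a comma-separated whitelist value into normalized host:port entries."""
    return {entry.strip().lower() for entry in value.split(",") if entry.strip()}


def namenode_allowed(namenode_uri: str, whitelist_value: str) -> bool:
    """Return True if the NameNode URI passes the (assumed) whitelist rules."""
    # The authority is the host:port part of a URI such as hdfs://host:8020.
    authority = urlparse(namenode_uri).netloc.lower()
    if not authority:
        # Assumption: an endpoint with no host:port authority is rejected.
        return False
    allowed = parse_whitelist(whitelist_value)
    if not allowed:
        # Assumption: an empty whitelist places no restriction.
        return True
    return authority in allowed


if __name__ == "__main__":
    print(namenode_allowed("hdfs://namenode-host:8020", ""))
    print(namenode_allowed("hdfs://namenode-host:9000", "namenode-host:8020"))
```

Under these assumptions, an empty whitelist alone would not block a well-formed endpoint, so it is worth double-checking the exact NameNode address (scheme, host, and port) entered in the Studio.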
Community Manager

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi
Have you checked out the documentation (https://help.talend.com/display/TalendOpenStudioforBigDataGettingStartedGuide53EN/2.1.1+How+to+set+H...) to make sure your job tracker end point and Oozie end point are properly configured?
The screenshot in the documentation link shows some syntax examples.
Elisa
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Yes, I have checked it, but I still have no clue. Can you please help me?
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
I am using Talend 5.3.0, and I am now able to run a basic Talend Job successfully using Oozie. The problem was with the NameNode and JobTracker ports; I have changed them to the correct ones.
To find them, I looked in core-site.xml for the NameNode endpoint and in mapred-site.xml for the JobTracker endpoint. The issue is now solved. Thanks!
Shouvanik
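
For anyone hitting the same problem, here is a small sketch of reading the two endpoints out of the Hadoop configuration files. It assumes an MR1-style cluster where the NameNode address is stored under `fs.default.name` in core-site.xml and the JobTracker address under `mapred.job.tracker` in mapred-site.xml (your distribution may use other property names, such as `fs.defaultFS`). The host names, ports, and the helper `read_property` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents; on a real cluster, read the files from the
# Hadoop configuration directory instead.
CORE_SITE = """
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:8020</value>
  </property>
</configuration>
"""

MAPRED_SITE = """
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker-host:8021</value>
  </property>
</configuration>
"""


def read_property(site_xml: str, name: str):
    """Return the value of a named Hadoop property, or None if it is absent."""
    root = ET.fromstring(site_xml.strip())
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            # A present but empty <value> yields an empty string.
            return (prop.findtext("value") or "").strip()
    return None


if __name__ == "__main__":
    print("Name node end point:  ", read_property(CORE_SITE, "fs.default.name"))
    print("Job tracker end point:", read_property(MAPRED_SITE, "mapred.job.tracker"))
```

The values printed here are the ones to compare against the "Name node end point" and "Job tracker end point" fields in the Studio's Oozie settings.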
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
Is OOZIE applicable to big data jobs ONLY?
Regards,
Shouvanik
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
I am stuck with this error:
Please check if the "Job tracker end point" and "Oozie end point" are valid!
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Now I am getting this weird error. I have a simple job that prints a message.
Deploying job to Hadoop...
Deployment complete!
Job is running ...
Job killed!
Main class , exit code
Job failed, error message, exit code ]

Please help.
I am pasting the job archive.
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

I am not able to run a job with the tMsgBox component. What can be the problem? It's very annoying and frustrating.
Deploying job to Hadoop...
Deployment complete!
Job is running ...
Job killed!
Main class , exit code
Job failed, error message, exit code ]
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

I am sorry, but I don't often get replies to my queries. Can anyone please reply to my last post?
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
I have a flow where I fetch data from MySQL into HDFS using the tSqoopImport_1 component. I can see the part files inside HDFS.
Now I want to load the data into Hive.
I am not able to find a suitable Big Data component with which to load the data.
Can you please help? I am using Talend Open Studio 5.3.0.
Regards,
Shouvanik
One Star

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
Is there any way to load data directly from MySQL into Hive? I saw a component, tSqoopImport, but it only loads data into HDFS. I am using Talend v5.3.
Please help.
Regards,
Shouvanik
Employee

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Hi,
I agree the Oozie logs are not very explicit. You will find additional logs by browsing the Oozie console and then the JobTracker logs.
Oozie console: http://hostname:11000/oozie
-> Then click on the job that failed.
-> Then browse the logs of this job.
You will find the details there.
The error you get does give a clue, though: the local file can't be sent to HDFS. I suspect a permission issue.

Regarding the first problem, could you please do that again? With Oozie, all the interesting logs are on the JobTracker.
Employee

Re: Not able to schedule big-data jobs(e.g.PIG,HIVE,M/R) using OOZIE

Regarding your last question, I can help you, but please create another thread: one question per thread, otherwise it will be hard to follow up.
