Passing Context Variables in BigData Spark Job

Six Stars


Hi Team,

I have designed a job to convert a CSV file to Parquet format, as shown in the picture.

1.PNG

In tFileInputDelimited I am using a context variable to define the file path.

I am getting the following error, stating "No input paths specified".

Can't we parameterise a Big Data Spark job? How can this be resolved? Do we have any other approach?

Please help me out.

 

Error message:

par_dir_name...HR Services Action Report 02.07.2016.csv
[ERROR]: org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand - Aborting job.
java.io.IOException: No input paths specified in job
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:201)
at org.talend.hadoop.mapred.lib.file.TDelimitedFileInputFormat.listStatus(TDelimitedFileInputFormat.java:70)
at org.talend.hadoop.mapred.lib.file.TDelimitedFileInputFormat.getSplits(TDelimitedFileInputFormat.java:96)

 

 

Thanks


Accepted Solutions
Moderator

Re: Passing Context Variables in BigData Spark Job

Hello,

Can you successfully execute your Spark job when you use the file path directly in the tFileInputDelimited component, without context variables?

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.

All Replies
(Duplicate of the accepted solution above.)
Highlighted
Five Stars

Re: Passing Context Variables in BigData Spark Job

Hi Sabrina, 

I have tried the same. The tContextLoad component does not work in a Spark batch job.

Hard-coded paths to files and directories work.

 

Thanks

Badri Nair 

 

Four Stars

Re: Passing Context Variables in BigData Spark Job

Hello,

Is there any news on this topic? I have the same problem when iterating over several files with a variable name in tFileInputDelimited.

Thanks for the help.

Tiago
Two Stars

Re: Passing Context Variables in BigData Spark Job

We have exactly the same issue; I would welcome any feedback on this at all.

 

Hard-coded values in tFileInputDelimited work fine, but the minute I make the filename a context variable, the job fails with a "not found" error.

 

Even if we set the tFileInputDelimited path to:

 

"C:\test\" + context.foldername + "\file.csv"

 

the Spark job fails with:

 

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: C:\test\file.csv
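One possible explanation for that exact error (an assumption, not confirmed in the thread): the context variable is never populated when the Spark job actually runs, so it resolves to an empty string, and the folder segment silently vanishes from the concatenated path. A minimal Java sketch, assuming `foldername` stands in for `context.foldername` arriving empty at runtime:

```java
public class ContextPathDemo {
    public static void main(String[] args) {
        // Assumption: the context variable was never initialized for the
        // Spark run, so it resolves to an empty string instead of a folder.
        String foldername = "";

        String path = "C:\\test\\" + foldername + "\\file.csv";
        System.out.println(path); // prints C:\test\\file.csv

        // Hadoop's Path class collapses the doubled separator, which would
        // explain why the error reports the path as C:\test\file.csv --
        // the folder segment disappeared without any warning.
    }
}
```

If this is the cause, the fix is not in the component but in how the context is supplied: the values must reach the job at execution time rather than only existing at design time.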
