Passing Context Variables in BigData Spark Job

Six Stars


Hi Team

 

I have designed a Job to convert a CSV file to Parquet format, as shown in the attached screenshot:

1.PNG

In tFileInputDelimited I am using a context variable to define the file path, and I am getting the following error stating that no input path was specified.

Can't we parameterise a Big Data Spark Job? How can this be resolved? Is there another approach?

Please help me out.

 

Error message:

par_dir_name...HR Services Action Report 02.07.2016.csv
[ERROR]: org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand - Aborting job.
java.io.IOException: No input paths specified in job
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:201)
at org.talend.hadoop.mapred.lib.file.TDelimitedFileInputFormat.listStatus(TDelimitedFileInputFormat.java:70)
at org.talend.hadoop.mapred.lib.file.TDelimitedFileInputFormat.getSplits(TDelimitedFileInputFormat.java:96)
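Before changing the job design, it can help to confirm the context variable actually carries a value at run time, since an empty value produces exactly this "No input paths specified in job" failure. A minimal sketch in plain Java (not Talend-generated code; the method and the sample path in main are illustrative) of a fail-fast check that could sit in a tJava component ahead of the file input:

```java
// Minimal sketch (plain Java, not Talend-generated code): validate the
// context value before handing it to Spark, so a missing or empty variable
// fails with a clear message instead of "No input paths specified in job".
public class ContextPathCheck {

    // 'parDirName' stands in for context.par_dir_name from the job above.
    static String resolveInputPath(String parDirName) {
        if (parDirName == null || parDirName.trim().isEmpty()) {
            throw new IllegalStateException(
                    "context.par_dir_name is empty - was the context loaded for this run?");
        }
        return parDirName.trim();
    }

    public static void main(String[] args) {
        // Hypothetical path, printed before the file input component runs.
        System.out.println(resolveInputPath("/data/in/report.csv"));
    }
}
```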

 

 

Thanks


Accepted Solutions
Moderator

Re: Passing Context Variables in BigData Spark Job

Hello,

Can you successfully execute your Spark job when you use the file path directly in the tFileInputDelimited component, without context variables?

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.



All Replies

Five Stars

Re: Passing Context Variables in BigData Spark Job

Hi Sabrina, 

I have tried that. The tContextLoad component does not work in a Spark Batch Job; only hard-coded paths to files and directories work.

 

Thanks

Badri Nair 

 

Four Stars

Re: Passing Context Variables in BigData Spark Job

Hello

Is there any news on this topic? I have the same problem when iterating over several files with a variable file name in tFileInputDelimited.

Thanks for the help...

Tiago
Two Stars

Re: Passing Context Variables in BigData Spark Job

We have exactly the same issue; any feedback at all would be welcome.

Hardcoded values in tFileInputDelimited work fine, but the moment the filename is a context variable, the job fails with a "path does not exist" error.

 

Even if we have the tFileInputDelimited path as:

 

"C:\test\" + context.foldername + "\file.csv"

 

the Spark job fails with:

 

org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: C:\test\file.csv
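Notably, the failing path in that error is C:\test\file.csv, with the context.foldername segment missing entirely. That suggests the variable resolved to an empty string at the point where the path was built (the resulting double backslash presumably being collapsed during path normalization). A plain-Java sketch of the same expression, showing what an empty variable produces:

```java
// Sketch (plain Java, not Talend-generated code): the path expression from
// the post, evaluated with a populated and with an empty context value.
public class FolderNamePathDemo {

    // Same expression as in the tFileInputDelimited component:
    // "C:\test\" + context.foldername + "\file.csv"
    static String buildPath(String folderName) {
        return "C:\\test\\" + folderName + "\\file.csv";
    }

    public static void main(String[] args) {
        System.out.println(buildPath("july")); // C:\test\july\file.csv
        System.out.println(buildPath(""));     // C:\test\\file.csv
    }
}
```

If the job prints the variable just before the component runs and it comes out empty, the problem is in how the context reaches the Spark run, not in the path expression itself.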
