Convert Pyspark ETL job to Talend ETL job

Four Stars

Hi,

Is it feasible to convert custom PySpark jobs to native Talend jobs?

I have a requirement to integrate existing PySpark jobs with Talend. Is there a way to trigger PySpark jobs from Talend? If so, what approach should I take?

Nine Stars

Re: Convert Pyspark ETL job to Talend ETL job

Do you mean Spark jobs that you design in Talend?

Sorry, I don't understand the question.
Regards
DGM
Employee

Re: Convert Pyspark ETL job to Talend ETL job

Hi,

 

No Talend Studio component can invoke PySpark directly. However, you can use the tSSH or tSystem component to run the commands through a shell.
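As a rough sketch of what the command behind tSystem could look like: a small Python wrapper that builds a `spark-submit` call for one PySpark script and returns its exit code, so the calling component can react to a failure. This assumes `spark-submit` is on the PATH of the machine that runs the command. The `yarn` master and the function names are illustrative assumptions, not something Talend provides.

```python
import subprocess


def build_spark_submit_command(script_path, master="yarn", app_args=()):
    """Return the argument list for submitting one PySpark script.

    The default master ("yarn") is an assumption; use whatever your cluster needs.
    """
    return ["spark-submit", "--master", master, script_path, *app_args]


def run_command(argv):
    """Run a command, wait for it to finish, and return its exit code.

    Raises FileNotFoundError if the executable (e.g. spark-submit) is not on the PATH.
    """
    completed = subprocess.run(argv)
    return completed.returncode


def main(argv):
    """Entry point: argv[0] is the PySpark script, the rest are its arguments."""
    return run_command(build_spark_submit_command(argv[0], app_args=argv[1:]))
```

To use it as a script, you could end the file with `import sys` and `sys.exit(main(sys.argv[1:]))`, then point tSystem at the wrapper with the path of the PySpark job as its first argument. A non-zero exit code from `spark-submit` is then passed back to the caller.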

 

Warm Regards,
Nikhil Thampi


Four Stars

Re: Convert Pyspark ETL job to Talend ETL job

This is about migrating PySpark jobs to Talend. In other words, can we call PySpark jobs from Talend?

We have 300 PySpark jobs and need to call them from Talend.

Employee

Re: Convert Pyspark ETL job to Talend ETL job

@ShikhaSharma 

 

Talend has no component that calls PySpark directly. But once you grant the necessary execution privileges, you should be able to invoke PySpark from the command line.

 

https://spark.apache.org/docs/0.9.0/python-programming-guide.html

 

You can do this with Talend's terminal components, such as tSSH.
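Since tSSH takes a command line as text, one way to prepare it for many jobs is to generate a safely quoted `spark-submit` line per script. The snippet below is only a sketch under assumptions: the script paths, the `yarn` master, and the helper name are made up for illustration, and `shlex.join` needs Python 3.8 or later.

```python
import shlex


def remote_command_line(script_path, master="yarn", app_args=()):
    """Return one shell-quoted spark-submit command line for a PySpark script."""
    argv = ["spark-submit", "--master", master, script_path, *app_args]
    # shlex.join quotes any argument containing spaces or shell metacharacters.
    return shlex.join(argv)


# Hypothetical job list; in practice this could come from a file or a table.
scripts = ["/opt/jobs/load_orders.py", "/opt/jobs/daily load.py"]

for script in scripts:
    print(remote_command_line(script, app_args=["--date", "2019-01-01"]))
```

Each printed line can then be supplied as the command of a tSSH component, for example by iterating over the list inside the Talend job.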

 

Warm Regards,
Nikhil Thampi

