You can build the Job, which creates a '.bat' file and a '.sh' file. You can schedule a Job run with either the '.bat' or the '.sh' file, and you can use them to access the JAR file as well.
Yes, you can. However, your job isn't optimized for Spark: the Java code itself has to use the Spark libs to make full use of the cluster's processing power.
// an example
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
It's not like "hey, kick this job/jar over to Spark and here we go!" Sorry.
The Talend components and the code generation need to be adjusted.
If you could use Spark components, that would make more sense. Or have a Talend Job that contains some (Spark) Python code and submits that Python code to Spark. I would opt for this.
Spark is designed for distributed computing. If you want to use multithreading (infinite CPUs), the Talend job needs to be designed and developed for multithreading so that Spark can spin up containers/executors. However, multithreading sounds nice but doesn't always make sense: keep the overhead in mind, and also skew (a partitioning example: 70% of the values in a column are null).
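To make the skew point concrete, here is a minimal plain-Java sketch (no Spark needed to run it). It only mimics the usual hash-partitioning rule, where null keys go to partition 0 and other keys go to their non-negative hash modulo the partition count; it is not Spark's own code. The class name `SkewDemo`, the "customer-" keys and the 1000-row dataset are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class SkewDemo {

    // Simplified hash-partitioning rule (an assumption, not Spark's code):
    // null keys land in partition 0, all other keys in hash mod numPartitions.
    static int partitionFor(String key, int numPartitions) {
        if (key == null) {
            return 0;
        }
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Count how many rows end up in each partition.
    static int[] partitionCounts(List<String> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: 1000 rows, 70% of them with a null key.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            keys.add(i < 700 ? null : "customer-" + i);
        }
        // Partition 0 receives all 700 null rows plus its share of the rest.
        System.out.println(Arrays.toString(partitionCounts(keys, 4)));
    }
}
```

With four partitions, one of them holds at least 70% of the rows, so the worker handling it does most of the work while the others sit idle. Adding more executors does not fix that; the null keys would have to be filtered out or spread over several keys first.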
I hope this helps.