What kind of jobs can be scheduled in Oozie?

I have several ETL jobs running in Talend. Because of the amount of data, we would like to schedule these jobs on a Hadoop cluster. What kinds of jobs can Talend schedule to Oozie/Hadoop?
Can I just schedule my current job (a combination of MS SQL inputs, joined with tMap and exported to CSV for Google BigQuery, with some Java code for transforms) to run in the Oozie scheduler? Or do I need to rewrite my joins in Pig Latin?
What would the best strategy be to get data from MS SQL into the Hadoop cluster using Talend?

Re: What kind of jobs can be scheduled in Oozie?

Any Talend job that uses a big data connector (HDFS, Hive, Pig) should be schedulable from the Oozie tab, provided the cluster information has been filled in.
You would need to convert your joins to Pig Latin or use the tPigMap component.
The Sqoop components are probably the best option for writing RDBMS tables to HDFS or Hive.
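
To give an idea of what such an RDBMS-to-HDFS import involves outside of Talend, here is a minimal sketch that builds a plain Sqoop 1 `import` command line for one SQL Server table. The host, database, table, user, and paths are hypothetical placeholders, and `build_sqoop_import` is just an illustrative helper, not anything Talend provides. The flags are standard Sqoop import options; how the Talend Sqoop components map onto them is not something I have verified.

```python
import shlex


def build_sqoop_import(host, database, table, target_dir,
                       username, password_file, mappers=4, port=1433):
    """Return the argument list for a Sqoop 1 import of one SQL Server table into HDFS.

    All values passed in are placeholders chosen by the caller; nothing is executed here.
    """
    # JDBC URL format used by the Microsoft SQL Server driver.
    jdbc_url = f"jdbc:sqlserver://{host}:{port};databaseName={database}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,   # file holding the password, read by Sqoop
        "--table", table,
        "--target-dir", target_dir,         # HDFS directory for the imported files
        "--num-mappers", str(mappers),      # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    argv = build_sqoop_import(
        host="mssql.example.com",
        database="sales",
        table="orders",
        target_dir="/data/raw/orders",
        username="etl_user",
        password_file="/user/etl_user/.mssql_password",
    )
    # Print a shell-quoted command that could be pasted on a node with Sqoop installed.
    print(shlex.join(argv))
```

Once the tables are in HDFS, the joins can be done there with Pig or tPigMap rather than against MS SQL.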
