How to stream ORACLE data to AWS S3
My company is doing a POC on using Talend to load data into AWS (S3, Redshift).
I am completely new to Talend and I am looking for a way to stream Oracle transactional data to S3.
Could anyone advise on the best method (product) for doing this with Talend (any of the products currently available from Talend)?
First of all, since you are new: install Talend and import the Demo project. You will get a lot of examples (not for S3, but that is just a detail).
In the simplest case, you will need only 4 components:
The more complicated the logic, the more complicated the final Job or Project will be.
Check the series of articles Talend Best Practice (parts 1-4).
First of all thanks for your answer.
I have already done some tutorials, but I still consider myself a total beginner, so I will definitely follow up on the best practices you sent me.
As I understand it, those components will build a kind of pull mechanism (with delta detection using tMap), whereas I need the data to be pushed as soon as a change occurs in the source DB, as with Oracle CDC.
Probably I can build a flow with the logic below:
In a loop, read from the CDC components, save to a CSV file and send the changes to S3. But I wonder if there are better ways of doing this using Talend?
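To make the loop idea concrete, here is a minimal sketch of one iteration in plain Python, outside of Talend. Everything in it is an assumption for illustration only: the names `changes_to_csv`, `batch_key` and `ship_once` are hypothetical, the change rows are assumed to arrive as dicts, and `fetch_changes` and `upload` are placeholders for whatever actually reads the CDC changes and writes to S3.

```python
import csv
import io
from datetime import datetime, timezone


def changes_to_csv(changes):
    """Serialize a list of change rows (dicts) to CSV text with a header row."""
    if not changes:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(changes[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(changes)
    return buf.getvalue()


def batch_key(prefix, table, when):
    """Build a time-stamped object key so each batch lands in its own file."""
    return f"{prefix}/{table}/{when:%Y%m%dT%H%M%S}.csv"


def ship_once(fetch_changes, upload, prefix, table, now=None):
    """One loop iteration: read pending changes, write CSV, hand it to upload.

    fetch_changes and upload are hypothetical callables supplied by the caller.
    Returns the object key used, or None when there was nothing to send.
    """
    changes = fetch_changes()
    if not changes:
        return None
    when = now or datetime.now(timezone.utc)
    key = batch_key(prefix, table, when)
    upload(key, changes_to_csv(changes))
    return key
```

Calling `ship_once` repeatedly on a timer would give the loop described above. Note that this is still polling the change table, so the delay depends on how often the loop runs.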
It's not so easy.
If the data is only new (incremental loading), then yes, and not only with CDC.
But CDC means not only new data but also UPDATEs and DELETEs, so the logic would be more complicated.
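To illustrate why updates and deletes complicate things, here is a rough sketch in plain Python (not a Talend job). It assumes each change row carries an operation code in an `op` field ("I", "U" or "D") and a primary key in `id`; both field names and the `apply_changes` function are hypothetical. With inserts only you could simply append rows, but here every change has to be matched against the target by key.

```python
def apply_changes(target, changes, key="id"):
    """Apply insert/update/delete change rows to a target keyed by primary key.

    target  -- dict mapping primary key -> row dict (modified in place)
    changes -- iterable of row dicts, each with an "op" field (assumed codes)
    """
    for change in changes:
        op = change["op"]
        # Keep only the data columns; the op code is metadata about the change.
        row = {k: v for k, v in change.items() if k != "op"}
        pk = row[key]
        if op in ("I", "U"):
            # Insert and update both end up as "latest row wins" for that key.
            target[pk] = row
        elif op == "D":
            # A delete removes the row; ignore it if the key is already gone.
            target.pop(pk, None)
        else:
            raise ValueError(f"unknown operation code: {op!r}")
    return target
```

The order of the changes matters: applying an update before the insert it belongs to, or a delete before a later re-insert, gives a different result, so the changes have to be processed in the order they happened in the source.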