How to stream Oracle data to AWS S3
My company is doing a POC on using Talend to load data into AWS (S3, Redshift).
I am completely new to Talend and I am looking for a way to stream Oracle transactional data to S3.
Could anyone advise on the best method (product) for doing this with Talend (any of the products currently available from Talend)?
First of all, since you are new: install Talend and import the Demo project. You will have a lot of examples (not for S3, but that is just a detail).
In the simplest case, you will need only 4 components:
The more complicated the logic, the more complicated the final Job or Project will be.
Check the series of articles Talend Best Practice (parts 1-4).
First of all thanks for your answer.
I have already done some tutorials, but I still consider myself a total beginner, so I will definitely follow up on the best practices you sent me.
As I understand it, those components will build a kind of pull mechanism (with delta detection using tMap), while I need the data to be pushed as soon as a change occurs in the source database, as with Oracle CDC.
I could probably build a flow with the logic below:
In a loop, read changes with the CDC components, save them to a CSV file, and send the file to S3. But I wonder if there are better ways of doing this using Talend?
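To make the idea concrete, here is a minimal sketch of one pass of that loop, written in plain Python rather than as a Talend job. Everything in it is an assumption for illustration: the column layout in `FIELDS`, the `cdc/batch-…` key naming, and the `upload` callable, which stands in for whatever actually writes to S3 (for example a wrapper around boto3's `put_object`).

```python
import csv
import io

# Hypothetical column layout for a change record; real CDC tables differ.
FIELDS = ["op", "id", "name"]


def changes_to_csv(changes):
    """Serialize a batch of change records (dicts) into CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(changes)
    return buf.getvalue()


def ship_batch(changes, upload, batch_no):
    """Upload one batch as a single CSV object; skip empty batches.

    `upload(key, body)` is supplied by the caller (an assumption of this
    sketch), so the function itself has no dependency on an S3 client.
    Returns the object key, or None if there was nothing to send.
    """
    if not changes:
        return None
    key = f"cdc/batch-{batch_no:06d}.csv"  # hypothetical naming scheme
    upload(key, changes_to_csv(changes))
    return key
```

The surrounding loop would call `ship_batch` each time it has read a new set of changes, increasing `batch_no` so that the files can be replayed in order.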
It is not so easy.
If the data is only new rows (incremental loading), then yes, and not only with CDC.
But CDC means not only new data but also UPDATEs and DELETEs, so the logic would be more complicated.
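A rough sketch of why updates and deletes complicate things: whoever consumes the files has to replay the changes in order against the current state, not just append rows. The example below is plain Python, not Talend, and the record shape (`op`, `id`, `row`) and the operation codes `I`/`U`/`D` are assumptions made for illustration.

```python
def apply_changes(target, changes):
    """Replay ordered change records onto `target`, a dict keyed by primary key.

    Each record is assumed to look like {"op": "I"|"U"|"D", "id": ..., "row": {...}};
    delete records carry no "row".
    """
    for change in changes:
        op, key = change["op"], change["id"]
        if op in ("I", "U"):
            # Insert and update both end with the latest row image in place.
            target[key] = change["row"]
        elif op == "D":
            # Ignore deletes for keys that are already gone.
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target
```

With insert-only data none of this is needed, because each new file can simply be appended. With updates and deletes the order of the changes matters, so batches must be applied in the sequence they were captured.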