How to stream ORACLE data to AWS S3 ?

Four Stars

 

Hi All,

 

My company is doing a POC on using Talend to load data into AWS (S3, Redshift).

I am completely new to Talend and I am looking for a way to stream Oracle transactional data to S3.

 

Could anyone advise on the best method (product) for doing this with Talend (any of the products currently available from Talend)?

 

Regards,

 

Wojtek

Twelve Stars

Re: How to stream ORACLE data to AWS S3

easy

 

First of all, as you are new: install Talend and import the Demo project. You will get a lot of examples (not for S3, but that is just a detail).

 

In the simplest case, you will need only 4 components:

- tOracleInput

- tMap

- tFileOutputDelimited

- tS3Put

The more complicated the logic, the more complicated the final Job or Project will be.
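
For readers who think in code, the four-step flow above (read, map, write delimited file, put to S3) can be sketched roughly as follows. This is only an illustration in Python, not what Talend generates: `map_row`, `rows_to_csv`, `run_job`, the column names and the object key are all hypothetical, and the Oracle read and the S3 upload are passed in as parameters so the sketch runs without either service.

```python
import csv
import io
from typing import Callable, Iterable, Sequence


def map_row(row: Sequence) -> list:
    """Stand-in for the tMap step: trim strings, turn NULLs into empty fields."""
    return ["" if v is None else v.strip() if isinstance(v, str) else v for v in row]


def rows_to_csv(header: Sequence[str], rows: Iterable[Sequence]) -> bytes:
    """Stand-in for the delimited-file output step: serialise rows as CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow(map_row(row))
    return buf.getvalue().encode("utf-8")


def run_job(
    rows: Iterable[Sequence],
    header: Sequence[str],
    upload: Callable[[str, bytes], None],
    key: str,
) -> int:
    """Read -> map -> CSV -> upload. Returns the payload size in bytes."""
    payload = rows_to_csv(header, rows)
    upload(key, payload)  # assumption: upload(key, data) stores the object
    return len(payload)


if __name__ == "__main__":
    # In-memory rows and a dict stand in for the Oracle query and the bucket.
    bucket = {}
    run_job(
        rows=[(1, " Alice ", None), (2, "Bob", 3.5)],
        header=["id", "name", "score"],
        upload=bucket.__setitem__,
        key="orders/part-0001.csv",
    )
    print(bucket["orders/part-0001.csv"].decode("utf-8"))
```

In a real setup, `rows` would come from an Oracle query cursor and `upload` would wrap an S3 client call; both are left out here on purpose.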

 

Check the series of articles, Talend Best Practice (parts 1-4):
https://www.talend.com/blog/2017/05/05/data-model-design-best-practices-part-1/

-----------
Four Stars

Re: How to stream ORACLE data to AWS S3 ?

 

Hi Vapukov,

 

First of all thanks for your answer.

I have already done some tutorials, but I still consider myself a total beginner, so I will definitely follow up on the best practices you sent me.

 

As I understand it, those components build a kind of pull mechanism (with delta detection using tMap), whereas I need changes to be pushed as soon as they occur in the source database, as with Oracle CDC.

 

I could probably build a flow with the logic below:

In a loop, read changes with the CDC components, save them to a CSV file, and send them to S3. But I wonder if there are better ways of doing this with Talend?
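
To make the loop idea concrete, here is a rough Python sketch of it: poll for changes newer than the last sequence number seen, write each batch to CSV, and hand it to an uploader. Everything in it is an assumption for illustration: the `(seq, op, id, value)` record layout, the `fetch_since` and `upload` callables, and the object key pattern are hypothetical and do not reflect how Talend's CDC components expose changes.

```python
import csv
import io
import time
from typing import Callable, List, Tuple

# Hypothetical change record: (sequence number, operation, key, value)
Change = Tuple[int, str, int, str]


def changes_to_csv(changes: List[Change]) -> bytes:
    """Serialise one batch of change records as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["seq", "op", "id", "value"])
    writer.writerows(changes)
    return buf.getvalue().encode("utf-8")


def poll_changes(
    fetch_since: Callable[[int], List[Change]],
    upload: Callable[[str, bytes], None],
    last_seq: int = 0,
    max_polls: int = 3,
    idle_seconds: float = 0.0,
) -> int:
    """Poll for changes after last_seq and upload each batch as one CSV object.

    Returns the last sequence number that was uploaded.
    """
    for _ in range(max_polls):
        batch = fetch_since(last_seq)
        if not batch:
            time.sleep(idle_seconds)  # nothing new: wait before the next poll
            continue
        first, last = batch[0][0], batch[-1][0]
        upload(f"cdc/changes-{first:010d}-{last:010d}.csv", changes_to_csv(batch))
        last_seq = last  # advance only after the upload call has returned
    return last_seq
```

Naming each object after the sequence range it covers is one way to keep batches ordered and to notice gaps; the last uploaded sequence number would need to be persisted somewhere so a restart can resume from it.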

 

Regards, 

 

Wojtek

Twelve Stars

Re: How to stream ORACLE data to AWS S3 ?

I could probably build a flow with the logic below:
In a loop, read changes with the CDC components, save them to a CSV file, and send them to S3. But I wonder if there are better ways of doing this with Talend?

Not so easy.

If the data is only ever new (incremental loading), then yes, and not only with CDC.

But CDC means not only new data but also UPDATEs and DELETEs, so the logic would be more complicated.
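
A small Python sketch of why updates and deletes complicate things: with insert-only data, appending each batch is enough, but once a change stream carries all three operations the consumer has to replay them in order against the current state. The `I`/`U`/`D` operation codes and the `(op, id, value)` layout here are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple


def apply_changes(
    target: Dict[int, str],
    changes: Iterable[Tuple[str, int, str]],
) -> Dict[int, str]:
    """Replay (op, id, value) change records, in order, against target."""
    for op, key, value in changes:
        if op in ("I", "U"):
            target[key] = value    # insert or overwrite the current row
        elif op == "D":
            target.pop(key, None)  # delete; ignore keys that are already gone
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


if __name__ == "__main__":
    state = {1: "a"}
    apply_changes(state, [("I", 2, "b"), ("U", 1, "a2"), ("D", 2, "")])
    print(state)  # only row 1 remains, with its updated value
```

Files on S3 cannot be edited row by row, so in practice this replay step usually happens downstream (for example when loading into Redshift), with the CSV files on S3 acting as an ordered change log.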

-----------