Change Data Capturing PostgreSQL and Synchronizing


Hello guys,

 

My scenario is several databases, possibly with different schemas, whose data should all be synchronized into one main database with low latency. I would not call it a multi-master database architecture, since only specific tables are mutable by specific applications in their own master database, but it goes in that direction. I have some general questions about the CDC capabilities of Talend and hope you can give me some answers.

 

  1. Is it true that Talend cannot subscribe to the WAL of PostgreSQL and would instead use the trigger mechanism? I read that in the documentation and kind of hope that the documentation is outdated.
    1. https://help.talend.com/reader/O~7WPF1NkGXETRxkSxA~uw/Cb8aTDjcDucIMPHrDFMUHQ
  2. The documentation states: "When setting up a CDC environment, make sure that the database connection for CDC is on the same server with the source data to which changes are to be captured."
    1. What does "database connection" mean in this context? Can I or can I not host Talend and PostgreSQL on different physical machines and even in different locations?
      1. If they can be hosted completely separately: what happens if Talend loses the connection to the source database? Would the triggers fire with no one there to capture the changes, so that the data-change event is lost?
      2. If Talend can make use of the WAL, this should not be a problem, since the WAL would wait for the Talend listener to reconnect. Is this assumption correct?
    2. What happens if Talend loses the connection to the target database? Will Talend enqueue all changes and wait for a reconnection?
  3. Can we horizontally scale Talend so every source-database has its own Talend-Instance with the specific CDC mechanism enabled?
    1. If that is possible I don't see a problem regarding point 2.

I hope someone can answer my questions and I did not post this in the wrong section.

 

Thanks in advance and have a nice day =)

Malachi

 


Accepted Solutions

Re: Change Data Capturing PostgreSQL and Synchronizing

1. Yes, it is true: only trigger-based replication is available.

2. This follows from 1: the triggers that collect the changes must be installed on the source database server.

     2.2 What happens if the connection is lost: nothing. The changes are just rows in tables where the triggers store subscribers and data, so Talend reconnects and continues.
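
To illustrate why nothing is lost while the consumer is disconnected, here is a minimal sketch of the trigger-plus-change-table idea. It is not Talend's actual table layout: the table names `customer` and `customer_changes`, the `op` codes, and the helper `fetch_changes` are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs anywhere.

```python
import sqlite3

# In-memory SQLite stands in for the source database (assumption: the real
# source is PostgreSQL, where the triggers would be written in PL/pgSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

-- Hypothetical change table: one row per captured change.
CREATE TABLE customer_changes (
    change_id INTEGER PRIMARY KEY AUTOINCREMENT,
    op        TEXT NOT NULL,      -- 'I', 'U' or 'D'
    id        INTEGER NOT NULL    -- key of the changed row
);

CREATE TRIGGER customer_ins AFTER INSERT ON customer
BEGIN
    INSERT INTO customer_changes (op, id) VALUES ('I', NEW.id);
END;

CREATE TRIGGER customer_upd AFTER UPDATE ON customer
BEGIN
    INSERT INTO customer_changes (op, id) VALUES ('U', NEW.id);
END;

CREATE TRIGGER customer_del AFTER DELETE ON customer
BEGIN
    INSERT INTO customer_changes (op, id) VALUES ('D', OLD.id);
END;
""")


def fetch_changes(connection, last_seen):
    """Return all captured changes with change_id greater than last_seen."""
    cur = connection.execute(
        "SELECT change_id, op, id FROM customer_changes "
        "WHERE change_id > ? ORDER BY change_id",
        (last_seen,),
    )
    return cur.fetchall()


# Changes happen while no consumer is connected.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("UPDATE customer SET name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 1")
conn.commit()

# The consumer "reconnects" later and resumes from its last position.
pending = fetch_changes(conn, last_seen=0)
print(pending)
```

The changes are written by the database itself, inside the same transaction as the original statement, so they wait in the change table until a consumer reads them. The consumer only has to remember the last `change_id` it processed.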

 

There are several alternative solutions for sending CDC from PostgreSQL (and other databases) to Kafka; Talend could then be used to parse the Kafka messages.
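
As a rough sketch of what parsing such messages involves, the snippet below applies a few change events to an in-memory copy of the target table. Everything here is an assumption for illustration: the JSON envelope with `op`, `before` and `after` fields is a hypothetical shape (the real format depends on the CDC tool you pick), the messages are hard-coded strings rather than read from a Kafka consumer, and `apply_change` is a made-up helper. In Talend the same logic would live in a job that reads from Kafka.

```python
import json


def apply_change(target, raw_message):
    """Apply one change event (JSON text) to an in-memory target table."""
    event = json.loads(raw_message)
    op = event["op"]
    if op in ("c", "u"):
        # Create or update: upsert the new row image by primary key.
        row = event["after"]
        target[row["id"]] = row
    elif op == "d":
        # Delete: remove the row identified by the old row image.
        target.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unknown operation: {op!r}")
    return target


# Hypothetical messages, in the order they would arrive on the topic.
messages = [
    '{"op": "c", "before": null, "after": {"id": 1, "name": "Ada"}}',
    '{"op": "u", "before": {"id": 1, "name": "Ada"},'
    ' "after": {"id": 1, "name": "Ada L."}}',
    '{"op": "c", "before": null, "after": {"id": 2, "name": "Bob"}}',
    '{"op": "d", "before": {"id": 2, "name": "Bob"}, "after": null}',
]

main_table = {}
for message in messages:
    apply_change(main_table, message)

print(main_table)
```

Events must be applied in the order they were produced per key, otherwise an older update could overwrite a newer one.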

 

 

-----------

