Change Data Capturing PostgreSQL and Synchronizing

Hello guys,

My scenario involves multiple databases, possibly with different schemas, whose data should all be synchronized to one main database with low latency. I would not call it a multi-master database architecture, since only specific tables are mutable by specific applications in their own master database, but it goes in that direction. I have some general questions about the CDC capabilities of Talend and hope you can give me some answers.

 

  1. Is it true that Talend cannot subscribe to the WAL of PostgreSQL and uses the trigger mechanism instead? I read that in the documentation and rather hope that the documentation is outdated.
    1. https://help.talend.com/reader/O~7WPF1NkGXETRxkSxA~uw/Cb8aTDjcDucIMPHrDFMUHQ
  2. The documentation states: "When setting up a CDC environment, make sure that the database connection for CDC is on the same server with the source data to which changes are to be captured."
    1. What does "database connection" mean in this context? Can I host Talend and PostgreSQL on different physical machines, or even in different locations, or not?
      1. If they can be hosted completely separately: what happens if Talend loses the connection to the source database? Would the triggers fire with no one there to capture the events, so that the data change events are lost?
      2. If Talend could make use of the WAL, this should not be a problem, since the WAL would be retained until the Talend listener reconnects. Is this assumption correct?
    2. What happens if Talend loses the connection to the target database? Will Talend enqueue all changes and wait for a reconnection?
  3. Can we scale Talend horizontally so that every source database has its own Talend instance with the specific CDC mechanism enabled?
    1. If that is possible, I don't see a problem regarding point 2.

I hope someone can answer my questions and I did not post this in the wrong section.

 

Thanks in advance and have a nice day =)

Malachi

 


Accepted Solutions

Re: Change Data Capturing PostgreSQL and Synchronizing

1. Yes, it is true: only trigger-based replication is available.

2. This follows from 1: the triggers that collect the changes must be installed on the source database server.

     2.2 What happens if the connection is lost: nothing. The triggers simply write subscribers and change data into tables, so Talend reconnects and continues.
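
To illustrate why a lost connection does not lose events with a trigger-based approach, here is a minimal sketch in Python. It uses an in-memory SQLite database purely so it runs anywhere; the table `customer_changes`, the trigger names and the helper `drain_changes` are hypothetical and are not the CDC tables Talend actually creates on PostgreSQL. The point is only that the triggers write into an ordinary table, so the changes wait there until a consumer comes back and reads them.

```python
import sqlite3

# Hypothetical schema: one source table plus one change table filled by triggers.
# Talend's real CDC tables on PostgreSQL are named and shaped differently.
SCHEMA = """
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customer_changes (
    seq    INTEGER PRIMARY KEY AUTOINCREMENT,
    op     TEXT    NOT NULL,
    row_id INTEGER NOT NULL
);
CREATE TRIGGER customer_ins AFTER INSERT ON customer
BEGIN
    INSERT INTO customer_changes (op, row_id) VALUES ('I', NEW.id);
END;
CREATE TRIGGER customer_upd AFTER UPDATE ON customer
BEGIN
    INSERT INTO customer_changes (op, row_id) VALUES ('U', NEW.id);
END;
CREATE TRIGGER customer_del AFTER DELETE ON customer
BEGIN
    INSERT INTO customer_changes (op, row_id) VALUES ('D', OLD.id);
END;
"""


def drain_changes(conn, last_seq):
    """Return all changes recorded after last_seq and the new high-water mark."""
    rows = conn.execute(
        "SELECT seq, op, row_id FROM customer_changes WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    new_last = rows[-1][0] if rows else last_seq
    return rows, new_last


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The application keeps writing while no consumer is connected.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("UPDATE customer SET name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 1")
conn.commit()

# The consumer "reconnects" and picks up everything it missed, in order.
changes, last_seq = drain_changes(conn, 0)
for seq, op, row_id in changes:
    print(seq, op, row_id)
```

The consumer only has to remember the last sequence number it processed; a second call to `drain_changes(conn, last_seq)` returns nothing until new changes arrive.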

 

There are several alternative solutions for sending CDC events from PostgreSQL (and other databases) to Kafka; Talend could then be used to parse the Kafka messages.
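
As a rough idea of what "parsing Kafka" means here, the sketch below decodes one change event in Python. It assumes a Debezium-style JSON envelope (fields `op`, `before`, `after`, `source`), which is one common choice but not the only tool that fits this description; the function name `parse_change_event` and the sample record are made up for illustration. Reading from the Kafka topic itself is left out, so the input is just the raw message value.

```python
import json

# Operation codes as used in a Debezium-style envelope (assumed format).
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def parse_change_event(raw_value):
    """Turn one raw Kafka message value into a small, flat change record."""
    event = json.loads(raw_value)
    # When the converter embeds schemas, the envelope sits under "payload".
    envelope = event.get("payload", event)
    op = envelope["op"]
    # A delete carries the old row in "before"; other operations use "after".
    row = envelope["before"] if op == "d" else envelope["after"]
    return {
        "table": envelope.get("source", {}).get("table"),
        "operation": OPERATIONS.get(op, op),
        "row": row,
    }


# Made-up sample message value, shaped like an update event.
sample = json.dumps({
    "payload": {
        "op": "u",
        "before": {"id": 1, "name": "Ada"},
        "after": {"id": 1, "name": "Ada L."},
        "source": {"table": "customer"},
    }
}).encode("utf-8")

parsed = parse_change_event(sample)
print(parsed)
```

In a Talend job the same mapping would be done with the Kafka input and JSON components rather than Python; the sketch only shows which fields matter when applying the change to the main database.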

 

 

-----------

