Capturing Changes In a Postgres Database.

One Star

Capturing Changes In a Postgres Database.

I have a operational system hosted on a Postgres Database and a Data Mart that is also hosted on a Postgres database. I need to know what support Talend provides to capture inserts/updates/deletes in the source system and to apply these changes (as needed based on the transformation rules) into the target database. At this time we need to move the data in batch once per day, this may change to real time or something similar in the future. I suspect the technique to capture changes would be different under this scenario. Questions I have about extracting the database are:
- is the source database required to do anything to support Talend
- is it possible to have one Talend job that "wakes up" at say 11pm and is able to detect the inserts/updates/deletes and move the data over to the target system
- can Talend use the postgres log file to dectect changes since the last run?
If you can provide your experience in this area that would be great, ideally I'm looking for a white paper or technical manual that I can use to create my design.
Thanks.
Community Manager

Re: Capturing Changes In a Postgres Database.

Hello
- is the source database required to do anything to support Talend

In Talend, just connect to Postgres database like you do using other custom tools.
is it possible to have one Talend job that "wakes up" at say 11pm and is able to detect the inserts/updates/deletes and move the data over to the target system

Create your scheduler on Linux or export the java script of job, then use other tool to run the job In a moment on Windows.
can Talend use the postgres log file to dectect changes since the last run?

No, Talend use tLogCatcher to catch and manage error in a job.
Best regards
shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Moderator

Re: Capturing Changes In a Postgres Database.

Hello,
Change Data Capture is available in the subscription version Talend Integration suite (http://www.talend.com/products-data-integration/talend-integration-suite.php) and will enable you to retrieve automatically new, updated or deleted rows.
CDC enable you to retrieve only changes in real-time or in batch-mode.
For more information you can contact salesATtalend.com
regards,
benjamin
One Star

Re: Capturing Changes In a Postgres Database.

Thank you for your reply, I looked into CDC and attend the webinar. It sounds exactly like what we need. I have a few questions I'm hoping you can help with.
How is the insert/update/delete transaction captured by Talend? It was not clear to me how Talend actually found out about this change only that it did. Do I have to add a trigger of some type on the database or does Talend manage capturing the change completely independant of the database.
The webinar made frequent references that this was only available on supported databases for this CDC functionality. Is Postgres one of the supported databases?
Thanks.
Moderator

Re: Capturing Changes In a Postgres Database.

Hi,
Change Data Capture is made with Talend integration Suite using light triggers (only checks tables keys) in two ways:
1- Talend Integration Suite put by himself the required light triggers on the DBMS
or
2- Talend Integration Suite generates the SQL as a script that you can pass to your DBA in order to get validation and apply.
And yes Talend Integration Suite supports CDC with Postgre.
regards,
benjamin
One Star

Re: Capturing Changes In a Postgres Database.

Thanks for your quick reply. What do the triggers do specifically - do they write to a table of some kind - does the trigger fire on each insert/update/delete applied to the table in the source system?
Thanks.
One Star

Re: Capturing Changes In a Postgres Database.

i am able to migrate table from postgress to mysql.
but how to migrate complete postgress database ( all the tables ) to mysql database
One Star

Re: Capturing Changes In a Postgres Database.

Is there more documentation surrounding CDC and what impacts it has on the source database?
Is CDC supported without affecting the source database?
Employee

Re: Capturing Changes In a Postgres Database.

Question : Thanks for your quick reply. What do the triggers do specifically - do they write to a table of some kind - does the trigger fire on each insert/update/delete applied to the table in the source system?
Answer : Talend can use another schema or database to host the information about the CDC (delta); we didn't replicate the records but only the main information to be able to consume the CDC and read the full record information transparently in the JOb using the component tCDCInput.
The Trigger can be set up to be fire for every Insert / Update or Delete, or only on Insert, only on Insert or Update; you have a checkbox options where you decide what action you want to listen from the Trigger.
The mechanism behind the scene is a pure Publisher/Subscriber mechanism; that means that you can create multiple subscribers to consume the CDC.
For instance, you have 2 applications : Financial and CRM; the CRMapplication has to consume the CDC every 10 minutes and the Financial application only every day at 1AM. That's pretty easy in Talend to create two subscribers and each one will consume the DELTA on their own frequency.