I am trying to set up real-time change data capture (CDC) between two different MySQL databases using Talend Studio.
I was able to successfully create a job that uses the Publish/Subscribe model to pick up only the changed data from the source and populate it in the target database.
I could not find documentation on setting up CDC in real time, i.e. as soon as a new row is inserted in the source database, it is picked up by the job and populated in the target database. The Talend job would run continuously, looking for changes in the source.
My question: is scheduling the Talend job with a scheduler at the desired interval the only option in this case? What options are available in Talend Studio to achieve this?
Thanks in advance.
Thanks for your reply. I have designed a job whose source is the tMySqlCDC component. This component keeps track of the changes since the last execution of the job, so essentially it is capturing the changed data. What is missing is that I have to run the job manually for the changes to be reflected in the target database. How do I modify this job so that it continuously looks for changes in the source database, i.e. once started, it keeps running and keeps the source and target databases in sync?
Thanks once again.
You can create a cron job in TAC and schedule it to run every 15 minutes. That way you don't have to worry about triggering the job yourself: once scheduled, it will pull the changed data into your space every 15 minutes. This gives you near-real-time CDC. You can also trigger the CDC job every 5 minutes, depending on the average job completion time.
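To illustrate what interval-based polling amounts to, here is a minimal Python sketch. It is not Talend code and not how tMySqlCDC is implemented; `poll_changes`, `fetch_since`, and `apply` are hypothetical names, and the in-memory lists stand in for the source change table and the target database. It only shows the idea of remembering the last change seen and picking up newer ones on each cycle.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical change record: (change_id, payload).
Change = Tuple[int, str]


def poll_changes(
    fetch_since: Callable[[int], List[Change]],
    apply: Callable[[List[Change]], None],
    interval_seconds: float,
    max_cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for changes newer than the last one seen, once per interval."""
    last_seen = 0  # watermark: highest change_id applied so far
    for cycle in range(max_cycles):
        changes = fetch_since(last_seen)
        if changes:
            apply(changes)
            last_seen = max(change_id for change_id, _ in changes)
        if cycle < max_cycles - 1:
            sleep(interval_seconds)  # the "every 15 minutes" part
    return last_seen


if __name__ == "__main__":
    source = [(1, "row A"), (2, "row B")]  # stand-in for the change table
    target: List[Change] = []              # stand-in for the target database
    poll_changes(
        lambda since: [c for c in source if c[0] > since],
        target.extend,
        interval_seconds=0.1,
        max_cycles=2,
    )
    print(target)
```

The shorter the interval, the closer this gets to real time, but each cycle should finish before the next one starts, which is why the interval depends on the average job completion time.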
Talend CDC is not real-time.
You can look for:
You can push data from CDC to Kafka and then parse the Kafka topic with Talend; this works.
All of these are based on the native replication protocol and work without overloading the server with triggers.
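As a rough sketch of what "parsing the Kafka topic" could look like on the consuming side, the Python below applies one change event at a time to a target. The JSON message layout (`op`, `key`, `row`) is purely an assumption for illustration; the real format depends on whichever CDC tool publishes to Kafka. The dictionary stands in for the target table, and no Kafka client is used here.

```python
import json
from typing import Dict


def apply_event(target: Dict[int, dict], message: str) -> None:
    """Apply one change event to the target, keyed by primary key.

    Assumed (hypothetical) message layout:
    {"op": "insert" | "update" | "delete", "key": <int>, "row": {...}}
    """
    event = json.loads(message)
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        target[key] = event["row"]  # upsert the latest row image
    elif op == "delete":
        target.pop(key, None)       # ignore deletes for unknown keys
    else:
        raise ValueError(f"unknown op: {op}")


if __name__ == "__main__":
    target: Dict[int, dict] = {}
    messages = [
        '{"op": "insert", "key": 1, "row": {"name": "alice"}}',
        '{"op": "update", "key": 1, "row": {"name": "alicia"}}',
        '{"op": "delete", "key": 1}',
    ]
    for message in messages:
        apply_event(target, message)
    print(target)
```

Because events are applied as they arrive rather than on a schedule, the delay is bounded by how fast the topic is consumed, not by a polling interval.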