I want to create a real-time streaming job in Talend that reads data from a SQL Server database and pushes it to Kafka.
I'm using a tMap to join a log file with the SQL Server table to find rows that were added or updated, but I don't know whether I can loop over this subjob.
Sure, you can run a job in a loop, for example with tLoop.
But the approach you chose is about the worst one possible.
Before it can compare anything, tMap has to load all the data from the server, so every iteration re-reads the whole table and the job reads more and more data as the table grows.
Depending on the actual load on your SQL Server, you could choose one of these:
- triggers that write changes to a log table
- created_at and updated_at columns, so each iteration reads only the rows added or updated since the last one (plus a trigger to capture deletes)
- SQL Server CDC (Change Data Capture); search for "SQL Server CDC" or "SQL Server CDC to Kafka"
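
To illustrate the second option, here is a minimal sketch of reading only the rows changed since the last iteration. It is not Talend code: it uses an in-memory SQLite database as a stand-in for SQL Server so it can run anywhere, and the table name customers, its columns, and the helper fetch_changes are all hypothetical. In a Talend job the same query would go into the database input component, with the watermark kept in a context variable or a small state table.

```python
import sqlite3


def fetch_changes(conn, last_seen):
    """Return rows modified after last_seen, plus the new watermark.

    Assumes a hypothetical table customers(id, name, updated_at) where
    updated_at is set on every insert and update.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Keep the old watermark when nothing changed.
    new_watermark = rows[-1][2] if rows else last_seen
    return rows, new_watermark


# SQLite stands in for SQL Server here (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "alice", "2024-01-01 10:00:00"),
        (2, "bob", "2024-01-01 10:05:00"),
    ],
)

# First iteration: an empty watermark sorts before any timestamp,
# so the initial load picks up every row.
watermark = ""
first_batch, watermark = fetch_changes(conn, watermark)

# Simulate one update and one insert happening between iterations.
conn.execute(
    "UPDATE customers SET name = 'bobby', updated_at = '2024-01-01 10:10:00' "
    "WHERE id = 2"
)
conn.execute("INSERT INTO customers VALUES (3, 'carol', '2024-01-01 10:12:00')")

# Second iteration: only the changed and new rows come back.
second_batch, watermark = fetch_changes(conn, watermark)
```

Each batch is what you would send to Kafka. Note that a strict "greater than" comparison can miss a row committed later with the same timestamp as the watermark, and it never sees deletes, which is why the delete trigger or CDC is still needed.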