Return multiple records within cTalendJob call


Dear community,
I'm pretty new to Talend ESB, so please don't judge me too harshly :)
I'm trying to build an integration framework using ESB and DI with the following setup:
  • Event bus: RabbitMQ

  • Source and target DB: SQL Server

  • Mini-batch (15 mins) processing to reduce transaction volume and concurrency

  • Secondary processing jobs run in parallel and are invoked in a data-driven manner

  • Note: DI jobs cannot push messages directly into RabbitMQ, so an ESB mediation route is used instead.



  • Architecture:

    • ESB mediation route (Publisher): invoked every 15 minutes by a timer; it reads data from the Producer job and pushes it into a RabbitMQ queue (cTimer->cTalendJob->cMessagingEndpoint)

    • DI job (Producer): detects changes in CDC-enabled tables on SQL Server by obtaining the list of table names and the LSN range for each CDC entity. It should then generate headers and bodies and push multiple messages onto the event bus (tMSSQLInput->tMap->tRouteOutput)

    • Multiple ESB mediation routes (Consumers): read data from RabbitMQ and invoke different Processor jobs (cMessagingEndpoint->cTalendJob)

    • DI job (Processor): reads messages from the mediation route and implements the transformation logic (tRouteInput->...[transformation]...->tMSSQLOutput)


  • Problem:

    • cTalendJob returns only the first record from the dataset generated in the Producer job.

    • How can I design this data flow so that cTalendJob (in the Publisher) returns multiple records to push into the RabbitMQ queue?

    • Is there any way to push data into RabbitMQ from within a DI job (skipping ESB)?


    Re: Return multiple records within cTalendJob call

    You are thinking about the ESB as a batch system. A big difference between DI and ESB is that DI is "batch" and ESB is "real time". You can combine the two, but you should treat ESB routes as systems that deal with a single message at a time (well, not always, but anything else gets a lot more complicated).
    An easier way for you to do this (without any extra coding or writing components) is to have a batch job (DI) which turns all of your data into individual messages to be sent to RabbitMQ. Then create a web service (using ESB) which simply sends a message to RabbitMQ. In your DI job, use a tREST or tRESTClient component (whichever service component is appropriate for your web service) to call the web service once for every message you wish to send.

    This way you have a single web service (which can be used for other systems as well) and a batch job which can be scheduled to run every 15 mins.
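As a rough sketch of the DI side of this approach (what a tRESTClient call per row amounts to), the plain Java below builds one POST request per record using the JDK's `java.net.http` API. The `CdcRange` fields, the JSON layout and the endpoint URL are all assumptions made for illustration; they are not taken from the job described above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.List;

class PerMessagePoster {
    // Hypothetical shape of one Producer row: a CDC table and its LSN range.
    static final class CdcRange {
        final String table;
        final String fromLsn;
        final String toLsn;

        CdcRange(String table, String fromLsn, String toLsn) {
            this.table = table;
            this.fromLsn = fromLsn;
            this.toLsn = toLsn;
        }
    }

    // Hypothetical URL of the ESB REST service that forwards to RabbitMQ.
    static final URI ENDPOINT = URI.create("http://localhost:8040/services/publish");

    // Render one row as a small JSON body (values are assumed not to need escaping).
    static String toJson(CdcRange r) {
        return String.format("{\"table\":\"%s\",\"fromLsn\":\"%s\",\"toLsn\":\"%s\"}",
                r.table, r.fromLsn, r.toLsn);
    }

    // Build one POST request per row, so the web service only ever sees single messages.
    static List<HttpRequest> buildRequests(List<CdcRange> rows) {
        List<HttpRequest> requests = new ArrayList<>();
        for (CdcRange row : rows) {
            requests.add(HttpRequest.newBuilder(ENDPOINT)
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(toJson(row)))
                    .build());
        }
        return requests;
    }
}
```

Each request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The batching stays in the DI job, while the service handles exactly one message per call.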

    Re: Return multiple records within cTalendJob call

    Thanks a lot.
    So, to make sure we are on the same page:

  • cTalendJob within a mediation route can operate on only a single message at a time. It is natively meant for sending control-signal messages between application RT APIs.

  • To implement the case described above, we need an additional mediator (e.g. a REST or other web service) to push micro-batches into RabbitMQ.

  • Another option is to use a different messaging service that DI supports natively: MSMQ, JBossMQ, ActiveMQ, WebSphere MQ, or any JMS-based API.


    Re: Return multiple records within cTalendJob call

    The scenario I described above would require the following:
    1) A normal Talend DI job to extract the data for the message queue and to call a web service to send each message.
    2) A Talend web service (a simple REST service using the POST method would do). This would receive the messages and send them to the message queue.

    An alternative, if you would like to bypass the web service call, is to find a Java API for RabbitMQ and implement sending the messages in the Talend job using a tJavaFlex and that API.

    If it were me, I would probably opt for using the RabbitMQ API in Java, since I'm very comfortable doing this. I suspect it would also perform a bit faster. The other way will work if you are not happy writing your own Java and just want to use the Talend components.
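To illustrate the tJavaFlex option, here is a minimal sketch of the pattern its three sections give you: open the connection once in the Start code, publish once per row in the Main code, and close once in the End code. `MessageSink` and `InMemorySink` are hypothetical names invented for this sketch so that it runs without a broker. With the RabbitMQ Java client (`com.rabbitmq:amqp-client`), a real sink would presumably wrap a `Channel` and call `basicPublish(exchange, routingKey, props, body)`; that part is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class FlexPublishSketch {
    // Hypothetical abstraction over the broker connection (not a Talend or RabbitMQ type).
    interface MessageSink extends AutoCloseable {
        void publish(String queue, byte[] body);

        @Override
        void close();
    }

    // In-memory stand-in for a broker, so the sketch runs on the JDK alone.
    static class InMemorySink implements MessageSink {
        final List<String> sent = new ArrayList<>();
        boolean closed = false;

        @Override
        public void publish(String queue, byte[] body) {
            sent.add(queue + ":" + new String(body, StandardCharsets.UTF_8));
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Mirrors the tJavaFlex sections: Start (sink already opened by the caller),
    // Main (one publish per incoming row), End (sink closed exactly once).
    static int publishAll(MessageSink sink, String queue, List<String> rows) {
        int count = 0;
        try (MessageSink s = sink) {
            for (String row : rows) {
                s.publish(queue, row.getBytes(StandardCharsets.UTF_8));
                count++;
            }
        }
        return count;
    }
}
```

Keeping the connection open for the whole flow, rather than reconnecting per row, is the main reason this route is likely to be faster than one web service call per message.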
