Order of two flows with the same input

One Star

Hey,

I'm relatively new to Talend and am using it for my master's thesis. I'm struggling with what is (IMHO) a simple task. I may have a solution, but it doesn't seem pretty.
My problem: I'm extracting data from two database tables, and I need to pass this data to two subsequent (parallel) flows, so I used the tReplicate component to duplicate the output. I need to perform the two subsequent tasks in a specific order, because otherwise the results won't be correct.
The output of the tReplicate is used to update some records in a database table AND to insert a few new records.
My solution would be to write the replicated output to a file and then read it twice: first to update the records, and then again to insert the new ones. To get the right order of execution, I'd use the "On Subjob Ok" triggers.

Is this the best solution, or did I overlook a better way?

Thanks!

Ten Stars

Re: Order of two flows with the same input

I'm not entirely sure what you want to achieve with your data, but here are some things to think about.
1) If you are updating and then inserting from an identical data set, how is the distinction between the two types of action on the database made? Are you updating where there is an existing record (matched via primary key) and then inserting the records that do not exist? If so, you do not need two passes. You can use "Insert or update" or "Update or insert" as the action on the database. Check out the Talend docs on the DB output component you are using.
2) If you are doing something that really does require two passes (so you can't use the suggestion above), maybe you can use the tHashOutput and tHashInput components. These allow you to store a data set in memory and reuse it in different subjobs, which would save you from writing to a file. You will need to take a look at the Talend docs (https://help.talend.com/search/all?query=tHashInput). This might mean that you no longer need the tReplicate. It is difficult to say without knowing your full job spec.
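
To make the two-pass idea in 2) a bit more concrete, here is a minimal plain-Java sketch of the pattern: read the data once, hold it in memory, then run the update pass and the insert pass strictly one after the other. This is not Talend-generated code and uses no Talend API. The class `TwoPassSketch`, the `Row` type, and the in-memory `Map` standing in for the database table are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustration only: none of these names are Talend APIs.
class TwoPassSketch {

    // Hypothetical row: a primary key plus one value column.
    static final class Row {
        final int id;
        final String value;

        Row(int id, String value) {
            this.id = id;
            this.value = value;
        }
    }

    // Pass 1: update only the rows whose key already exists in the table.
    static int updateExisting(Map<Integer, String> table, List<Row> cached) {
        int updated = 0;
        for (Row r : cached) {
            if (table.containsKey(r.id)) {
                table.put(r.id, r.value);
                updated++;
            }
        }
        return updated;
    }

    // Pass 2: insert only the rows whose key is still missing.
    static int insertMissing(Map<Integer, String> table, List<Row> cached) {
        int inserted = 0;
        for (Row r : cached) {
            if (!table.containsKey(r.id)) {
                table.put(r.id, r.value);
                inserted++;
            }
        }
        return inserted;
    }

    // Read once, keep the rows in memory (the role a hash component would
    // play), then run both passes in a fixed order.
    static int[] run(Map<Integer, String> table, List<Row> source) {
        List<Row> cached = new ArrayList<>(source);
        int updated = updateExisting(table, cached);   // first subjob
        int inserted = insertMissing(table, cached);   // runs only afterwards
        return new int[] { updated, inserted };
    }
}
```

In a job, the two method calls in `run` would roughly correspond to two subjobs that each read the cached rows and are chained with a subjob trigger, so the second never starts before the first has finished.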

With regard to ordering the execution, you have the right idea about the use of subjobs.

Are you also aware that Talend is a code generation tool that generates Java? If you have any Java knowledge, you can use this to your advantage to learn a bit more about the product and extend it with third-party APIs.

I have written a few tutorials that may open your eyes to some of what is possible with Talend. My signature will link you to my website.
Good luck!

One Star

Re: Order of two flows with the same input

Thanks for your answer. I think that could help, but I have another question about your first suggestion.
I have two sets. The first contains the primary key and two fields that will be updated per record. The second contains new records with a primary key. I think I could combine the two and use the "Insert or Update" mechanism, but I'm not sure it would work, because the two sets have different schemas (the insert schema has more fields than the update schema). I could expand the update schema by adding all the missing fields with their existing values, but isn't there an easier way? Maybe I could just leave them out? But I think that would overwrite those fields with default or null values.

Ten Stars

Re: Order of two flows with the same input

Go to the Advanced Settings tab of the t{database}Output component and click on "Use Field Options". You will see the fields appear in a chart with tick boxes. You can set up which fields are used for Updates, Inserts, etc. This can get a little tricky, so make sure you try a few things out and read the Talend docs on the component.
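
As a rough mental model of what the field options buy you, here is a small plain-Java sketch of an "insert or update" where each column is flagged separately for updates and for inserts. It is only an illustration of the idea, not what Talend actually generates: the class `FieldOptionsSketch`, the `upsert` method, and the nested `Map` standing in for the table are all hypothetical. The point is that a column not flagged for updates is never touched on an existing record, so it is not overwritten with null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustration only: a toy model of per-column update/insert flags.
class FieldOptionsSketch {

    // table maps primary key -> (column name -> value).
    // Returns "update" or "insert" depending on which action was taken.
    static String upsert(Map<Integer, Map<String, Object>> table,
                         int key,
                         Map<String, Object> row,
                         Set<String> updatable,
                         Set<String> insertable) {
        Map<String, Object> existing = table.get(key);
        if (existing != null) {
            // Existing record: write only the columns flagged as updatable.
            // All other columns keep their current values.
            for (String col : updatable) {
                existing.put(col, row.get(col));
            }
            return "update";
        }
        // New record: write only the columns flagged as insertable.
        Map<String, Object> fresh = new LinkedHashMap<>();
        for (String col : insertable) {
            fresh.put(col, row.get(col));
        }
        table.put(key, fresh);
        return "insert";
    }
}
```

With `updatable` containing only the two fields you want to change and `insertable` containing every field, a combined data set can carry empty values in the extra columns for the update-only rows without wiping the existing data.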
