I cannot switch to a Java project because Perl is a requirement for my project.
I understand your approach but unfortunately it doesn't work very well because:
- in your example, the data is read from CSV files, so it is simple to replicate the input.
- in my case, the data is the result of several other transformations and mappings.
To apply your approach, I would have to duplicate my transformations.
The resulting job would be very complex and hard to maintain.
OK, I was able to fix it by duplicating the tMySqlOutput component and using "update or insert" for "action on data".
The first branch is responsible for denormalizing the entries (CONCAT).
The second branch is responsible for aggregating the entries (COUNT).
At the end, the second branch updates the entries previously inserted by the first branch.
My only fear is that collisions could happen: if the two branches are executed concurrently, both could insert a new entry for the same key instead of one inserting and the other updating. To avoid that, I added a tSleep component, but I'm not sure that it works in every case.
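To make the idea concrete outside of Talend, here is a minimal sketch of the two-branch "update or insert" pattern. It is not what tMySqlOutput generates: it uses Python with an in-memory SQLite database, and the table `summary`, its columns, and the functions `upsert_names`, `upsert_counts` and `run` are all hypothetical names chosen for illustration. The assumption that matters is that the target table has a primary (or unique) key on the grouping column, so each branch writes only its own column and the final rows do not depend on which branch runs first. On MySQL the comparable statement would be `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3


def upsert_names(conn, rows):
    """Branch 1 (denormalize): one comma-separated string of names per id."""
    groups = {}
    for key, name in rows:
        groups.setdefault(key, []).append(name)
    for key, names in groups.items():
        # Insert the row if the id is new, otherwise update only `names`.
        conn.execute(
            "INSERT INTO summary (id, names) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET names = excluded.names",
            (key, ",".join(names)),
        )


def upsert_counts(conn, rows):
    """Branch 2 (aggregate): number of entries per id."""
    counts = {}
    for key, _ in rows:
        counts[key] = counts.get(key, 0) + 1
    for key, n in counts.items():
        # Insert the row if the id is new, otherwise update only `entry_count`.
        conn.execute(
            "INSERT INTO summary (id, entry_count) VALUES (?, ?) "
            "ON CONFLICT(id) DO UPDATE SET entry_count = excluded.entry_count",
            (key, n),
        )


def run(order):
    """Run the branches in the given order and return the final table rows."""
    conn = sqlite3.connect(":memory:")
    # The primary key on `id` is the assumption that prevents duplicate rows.
    conn.execute(
        "CREATE TABLE summary ("
        "id INTEGER PRIMARY KEY, names TEXT, entry_count INTEGER)"
    )
    rows = [(1, "a"), (1, "b"), (2, "c")]  # made-up sample input
    for branch in order:
        branch(conn, rows)
    result = conn.execute(
        "SELECT id, names, entry_count FROM summary ORDER BY id"
    ).fetchall()
    conn.close()
    return result
```

Calling `run([upsert_names, upsert_counts])` and `run([upsert_counts, upsert_names])` gives the same rows. The sketch runs the branches one after the other, so it only shows that the order does not matter when a key constraint is in place; it does not prove anything about how Talend schedules the two outputs.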