I've encountered a very simple issue for you guys.
I need to read data from a lookup table which contains for the moment 2milions + rows.
My SQL Workbench takes 77 secs to display it all but Talend is not working, it does the select query but takes forever to end.
So, how can I resolved this issue ?
PS : I'd prefer a solution entirely based on Talend components and a simple one, doesn't matter if it's not the most performant solution.
Thanks for reading.
Millions rows for the lookup, but how many for the main row?
Which lookup model are you using (load once or reload at each row)?
As explain in the documentation, "when the main flow has much less rows than the lookup flow (for example, with a ratio of 1000 or more) and the lookup input is a database component" you should consider the option to reload the lookup table at each row giving the opportunity to "select only the lookup data that is relevant for each record in the main flow".
You should have a look at this page.
Yes thank you, it did help me because I don't know much things about Talend. However, I don't think this is what I want because it increase the time for the job to end. And to answer your previous question my job as much more rows at the end of all tMaps combination.
@Goubi of course this design is not the way to improve performance but it did not seem like your first concern as you said "doesn't matter if it's not the most performant solution".
Note: if you want better response time, run the query on the DB Server.
How long it take to run the query associated to the lookup table?
How many rows in the main table?
Which lookup model are you using?
Did you try the solution proposed here?
And finally, it is a common situation to have different guys working with the DB solution and with Talend.
Usually, for complex or particular situations, DB experts write ad-hoc queries exposed as views or SQL functions which are used in Talend jobs.
This way, you have the opportunity to tune the queries without any change on jobs side (as soon as the selected columns do not change).
If your lookup table have primary key or indices you are able to use in your query (based on the solution you have to test), it could be the solution but you have to know that you'll run the query on the lookup table for each row in the main table.
Beware of reuse an existing connection for the lookup table (so you maybe have to open 2 connections).
Introduction to Talend Open Studio for Data Integration.
Practical steps to developing your data integration strategy.
Create systems and workflow to manage clean data ingestion and data transformation.