Employee

tMap next generation, Talend community feedback welcomed

Hello Talend community members!
tMap is a "user friendly" component, but its Graphical User Interface (GUI) is getting more and more complex as time goes by.
amaumont (tMap development master) is currently adding new features (6528), we just can't stop him!
To make it short, with all the icons and listboxes for each input/output flow, it has become a bit too complex.
Here comes a new design proposal. In this new version, please imagine:
- a tooltip on the "key grip" would give details about current properties if the property box is collapsed
- a popup opens when you click on the property box "+" icon. In this popup, you can select the option (with a complete help of course)
Thanks for your feedback
5 REPLIES
One Star

Re: tMap next generation, Talend community feedback welcomed

A few quick suggestions. I am still new, so these might already be included; in that case please correct me. I think tMap is the most flexible component, though it does take a bit of time to open it.
1. Make tMap sort aware. So if lookup is sorted on join keys and if tMap knows it, it can do the joins more efficiently as it does not have to go to the end of the table each time.
2. Make it able to handle large data sets easier. I've had errors when I had a lookup table as big as 200K rows. I got some NetBeanLookup error (I do not remember exact error). It was almost like an Out of Memory error. I was able to get around by making the table as skinny as possible. This was when I was caching it on disk (not in-memory). This is critical.
3. A scroll bar for large tables inside will be great.
Regards,
Sean
One Star

Re: tMap next generation, Talend community feedback welcomed

Another suggestion for tMap.
Ability to directly lookup DB tables. If we have a million row table (which is not even that big), why spool all the rows to disk? Based on my observation, TOS is going to fail (at least in Java). Why not add ability to directly connect to the database table and do a lookup? Do we have this functionality in some other component?
I saw your neat trick to do block by block lookups on a table. Why not encapsulate in one component and make it obvious to find and use? Also can this trick be used if I have multiple columns to look up?
What do other TOS experts think?
Regards,
Sean
One Star

Re: tMap next generation, Talend community feedback welcomed

Hi Sean,
I think this would be a great feature. But I wouldn't restrict it to simple tables.
What about the following idea:
We get a new option, "dynamic lookup". If you activate it, you get a new field to add values (like in a normal lookup, but you will not map the lookup's mappings into the rows). Additionally, there are no rows at all at this point.
After setting this value you now have a new iterate link, "dynamicLookup". You could use this link like any other. The values you defined as lookup keys are stored in the globalMap. Now you could, for example, add a tInput and define an enhanced flow. The flow of the last component will go back into the tMap as a lookup row.
So we will have a kind of iteration inside the tMap flow. I think technically it wouldn't be so complex to do.
Bye
Volker
Employee

Re: tMap next generation, Talend community feedback welcomed

Hi guys,

it does take a bit of time to open it.

Could you please tell us:
- how long it takes to open,
- how many tables,
- how many columns,
- with which OS,
- with which TOS version


1. Make tMap sort aware. So if lookup is sorted on join keys and if tMap knows it, it can do the joins more efficiently as it does not have to go to the end of the table each time.

In memory mode, data are hashed to be retrieved quickly.
In disk mode, data are sorted on the result of the keys expressions.
The next release, 3.1.0, will allow reloading the lookup at each row, so you will be able to extract only the desired data from a database table. The 'Store on disk' mode will remain useful when data from files has to be joined.
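
To illustrate the in-memory mode described above, here is a minimal Java sketch of a hashed lookup. It is not the code tMap generates; the class and method names (HashLookupSketch, buildIndex, join) are hypothetical, and rows are reduced to an integer key and a single string value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the code tMap generates.
class HashLookupSketch {

    // Build the hash index once: join key -> lookup value.
    static Map<Integer, String> buildIndex(List<Map.Entry<Integer, String>> lookupRows) {
        Map<Integer, String> index = new HashMap<>();
        for (Map.Entry<Integer, String> row : lookupRows) {
            index.put(row.getKey(), row.getValue()); // on duplicate keys the last row wins
        }
        return index;
    }

    // Inner join: keep only the main keys that have a match in the index.
    static List<String> join(List<Integer> mainKeys, Map<Integer, String> index) {
        List<String> out = new ArrayList<>();
        for (Integer key : mainKeys) {
            String value = index.get(key); // hashed access, no scan of the lookup table
            if (value != null) {
                out.add(key + ":" + value);
            }
        }
        return out;
    }
}
```

Because each main row is resolved by a hash access, the order of the lookup rows does not matter in this mode.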

2. Make it able to handle large data sets easier. I've had errors when I had a lookup table as big as 200K rows. I got some NetBeanLookup error (I do not remember exact error). It was almost like an Out of Memory error. I was able to get around by making the table as skinny as possible. This was when I was caching it on disk (not in-memory). This is critical.

I advise you to change the advanced property called "Max buffer size": set it to a value around 200 000 rows per buffer. You have to tune this value according to different parameters. It should resolve your OutOfMemory problem for good.

3. A scroll bar for large tables inside will be great.

Indeed, it seems it is not possible to change the column size under Windows (this can be fixed). Meanwhile, don't forget you can use the "Expression editor" tab at the bottom or the "Expression builder" in columns.

Ability to directly lookup DB tables. If we have a million row table (which is not even that big), why spool all the rows to disk? Based on my observation, TOS is going to fail (at least in Java). Why not add ability to directly connect to the database table and do a lookup? Do we have this functionality in some other component?

As described, it will be possible to do this in the next release, 3.1.0 M2, with more possibilities, as Volker said, by using a global variable in any field you want.
For example you will be able to reload the lookup with a different file, or change the WHERE conditions at each main (or above lookup) row.
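
As a rough Java sketch of the "reload at each row" idea: for every main row, the key is put into a map standing in for the globalMap, and the lookup is reloaded from it. All names here (ReloadPerRowSketch, run, the "lookupKey" entry) are hypothetical, and the reloadLookup function is only a stand-in for a query whose WHERE condition uses the stored value; the final UI and generated code may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: not the code tMap generates.
class ReloadPerRowSketch {

    static List<String> run(List<Integer> mainRows,
                            Function<Map<String, Object>, List<String>> reloadLookup) {
        // Stand-in for the job's globalMap; the key name "lookupKey" is an assumption.
        Map<String, Object> globalMap = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (Integer mainKey : mainRows) {
            globalMap.put("lookupKey", mainKey);
            // The lookup is reloaded for this row only, e.g. a query filtered on the key.
            for (String match : reloadLookup.apply(globalMap)) {
                out.add(mainKey + "->" + match);
            }
        }
        return out;
    }
}
```

Only the rows matching the current key are fetched on each iteration, so the full lookup table never has to be held in memory or spooled to disk.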

I uploaded a screenshot showing an example of use; the UI is not final.
One Star

Re: tMap next generation, Talend community feedback welcomed

Thank you for your detailed reply. Since I am still new to Talend, some of my suggestions come from my lack of detailed understanding of Talend. Based on your reply below, I'll try the 200K setting for the disk buffer. I think I might have that defaulting to 2 million. The load time for tMap is too short to actually time. It is just a tad more than the rest of the components, which is understandable. Given that, I still have two comments:
Comment 1:
In disk mode, data are sorted on the result of the keys expressions

My suggestion was that if the data coming in is already sorted on keys, does it need to be sorted again in this step? Won't it save time?
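
To make the suggestion concrete, here is a minimal Java sketch of a merge join over two inputs that are already sorted on the join key. This is only my illustration of the idea, not how tMap works; the names (MergeJoinSketch, matchingKeys) are made up, and I assume the lookup keys are unique.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not how tMap joins data.
class MergeJoinSketch {

    // Both arrays must already be sorted ascending; lookup keys are assumed unique.
    static List<Integer> matchingKeys(int[] mainKeys, int[] lookupKeys) {
        List<Integer> matches = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < mainKeys.length && j < lookupKeys.length) {
            if (mainKeys[i] < lookupKeys[j]) {
                i++; // this main key has no match
            } else if (mainKeys[i] > lookupKeys[j]) {
                j++; // this lookup key can no longer match anything
            } else {
                matches.add(mainKeys[i]);
                i++; // keep j: the next main row may carry the same key
            }
        }
        return matches;
    }
}
```

Each input is read once from start to end, so no extra sort step and no rescan of the lookup is needed when the inputs arrive sorted.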
Comment 2:
This new lookup functionality is nice. I also use Informatica that has this component called "Unconnected Lookup" that can be called from any expression. So I can define a look up on a table or a file, say a Customer table in Oracle. Then from any expression, I can call this lookup using a function such as LOOKUP_CUST(CUST_ID) and get one column returned to me or NULL. I find this very useful as I can call it in an IF/THEN condition and it can be cached/non-cached. For large tables where one is not expecting a bunch of hits, this is very efficient.
Is what I describe above closer to what Volker is suggesting? This might not be part of tMap, but I can surely put in a request for this component, say tDBLookup and tFileLookup?
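
For illustration, a minimal Java sketch of the cached lookup I have in mind. This is neither Informatica's nor Talend's API; CachedLookupSketch and lookupCust are hypothetical names, and the source function stands in for a query against the customer table or file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: not an existing Talend or Informatica API.
class CachedLookupSketch {
    private final Function<Integer, String> source; // stand-in for a table or file query
    private final Map<Integer, String> cache = new HashMap<>();
    int sourceCalls = 0; // counts how often the underlying source is actually hit

    CachedLookupSketch(Function<Integer, String> source) {
        this.source = source;
    }

    // Returns one column for the given customer id, or null when there is no match.
    String lookupCust(int custId) {
        if (cache.containsKey(custId)) {
            return cache.get(custId);
        }
        sourceCalls++;
        String name = source.apply(custId);
        cache.put(custId, name); // misses are cached too (HashMap accepts null values)
        return name;
    }
}
```

It can then be called from any expression, for example in a condition such as lookupCust(id) != null, and the source is queried at most once per distinct key.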
Regards,
Sean