Lookup sort order

One Star

Hi.
I have a lookup join between two files, done with a tMap. The join is 1 --> N: each row of the main stream can match N rows in the lookup file.
Here are the lookup configuration and job design:

Is the order of the stream file preserved in the output file?
Is it guaranteed that the output will be in this order:
record 1 (stream) record A (lookup)
record 1 (stream) record B (lookup) 
record 2 (stream) record C (lookup) 
record 2 (stream) record D (lookup)  
record 2 (stream) record E (lookup)  
Or can we, in some cases, get this second (unwanted) output order?
record 1 (stream) record A (lookup)
record 2 (stream) record C (lookup)  
record 1 (stream) record B (lookup) 
record 2 (stream) record D (lookup)  
record 2 (stream) record E (lookup)   

Thanks.
Andrea 
Four Stars

Re: Lookup sort order

Hi Andrea,
The file is read in the order in which it is sorted...
tMap data lookup order is as per the data flow...
But if you want a specific order, you can insert a sort component. This ensures that tMap processes the data as you expect.
Vaibhav
One Star

Re: Lookup sort order

Hi Vaibhav
I don't want to use a sort component!
I need the input file order to be preserved without further operations, because the main stream input file is quite big (45000000 records).
So I need the join output file to be "naturally" ordered like the main stream, as I wrote above:
record 1 (stream) record A (lookup)
record 1 (stream) record B (lookup) 
record 2 (stream) record C (lookup) 
record 2 (stream) record D (lookup)  
record 2 (stream) record E (lookup)   
....
record 45000000 (stream) record Z (lookup)    
record 45000000 (stream) record F (lookup)     
Thanks.
A
One Star

Re: Lookup sort order

tMap data lookup order is as per the data flow... 
Vaibhav

I can't understand the phrase "as per the data flow"...
Do you mean that the order is preserved as desired?
Andrea
Four Stars

Re: Lookup sort order

The lookup concept is this: tMap first loads all the records from the lookup connection into memory. Once they are loaded, it takes the first record from main, applies your join and delivers the output records, then moves on to the second record, and so on.
Since you have selected all matches, i.e. 1 --> N, logically the output should follow the exact order of the data in the lookup, and not otherwise...
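To make the "load the lookup, then stream main" idea concrete, here is a minimal Java sketch. It is not Talend's generated code; it only illustrates the general pattern. The class name `LookupJoinSketch`, the methods `loadLookup` and `join`, the `{key, value}` row layout and the inner-join behaviour are all assumptions made for illustration. In particular, returning the matches for one key in lookup file order is an assumption of this sketch, not a documented tMap guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LookupJoinSketch {

    // Step 1: load every lookup row into memory, grouped by join key.
    // Each lookup row is a hypothetical {key, value} pair.
    public static Map<String, List<String>> loadLookup(List<String[]> lookupRows) {
        Map<String, List<String>> byKey = new LinkedHashMap<>();
        for (String[] row : lookupRows) {
            byKey.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return byKey;
    }

    // Step 2: read the main rows one at a time, in file order, and emit
    // all matches for a row before moving on to the next row.
    // Main rows without a match are dropped (inner join, an assumption).
    public static List<String> join(List<String> mainKeys, Map<String, List<String>> lookup) {
        List<String> out = new ArrayList<>();
        for (String key : mainKeys) {
            List<String> matches = lookup.getOrDefault(key, Collections.emptyList());
            for (String match : matches) {
                out.add("record " + key + " (stream) record " + match + " (lookup)");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The lookup file is deliberately interleaved by key: A, C, B, D, E.
        List<String[]> lookupRows = Arrays.asList(
            new String[] {"1", "A"},
            new String[] {"2", "C"},
            new String[] {"1", "B"},
            new String[] {"2", "D"},
            new String[] {"2", "E"});
        for (String line : join(Arrays.asList("1", "2"), loadLookup(lookupRows))) {
            System.out.println(line);
        }
    }
}
```

In this single-threaded sketch the output is grouped by main row (1 A, 1 B, 2 C, 2 D, 2 E) even though the lookup rows are interleaved, because the outer loop is driven by the main stream alone. Whether a real job behaves the same way depends on how tMap and the rest of the job are configured.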
If there is any discrepancy in this answer, someone from Talend or another member may give the correct one.
Thanks
Vaibhav
One Star

Re: Lookup sort order

Thanks Vaibhav. That was my understanding too...
I'm also waiting for an "official" answer from the Talend team.
Andrea
Moderator

Re: Lookup sort order

Hi,
All the lookup data is loaded first (into memory or onto disk), and then each record in main is matched against all records in the lookup, row by row.
Let us know if this is clear to you.
Best regards
Sabrina
One Star

Re: Lookup sort order

Hi,
All the lookup data is loaded first (into memory or onto disk), and then each record in main is matched against all records in the lookup, row by row.
Let us know if this is clear to you.
Best regards
Sabrina

Thanks Sabrina.
I understand what is written, but my question is whether the match order described is also the write order, or whether in some (unlikely) cases this order is not preserved in the output.

Sorry if I'm being pedantic, but I have to decide whether or not to sort GBs of critical data, and I'd like to base that decision on the software architecture rather than on data/user experience, however good (many, many thanks to sanvaibhav).

Thanks a lot.
Andrea
