Lookup sort order

One Star

Hi.
I have a lookup join between two files, done with a tMap. The join type is 1 --> N: for each row of the main stream there can be N matching rows in the lookup file.
Here is the lookup configuration and job design:

Is the order of the main stream file preserved in the output file?
Is it guaranteed that the output will be in this order:
record 1 (stream) record A (lookup)
record 1 (stream) record B (lookup) 
record 2 (stream) record C (lookup) 
record 2 (stream) record D (lookup)  
record 2 (stream) record E (lookup)  
Or can we, in some cases, get this second (unwanted) output order?
record 1 (stream) record A (lookup)
record 2 (stream) record C (lookup)  
record 1 (stream) record B (lookup) 
record 2 (stream) record D (lookup)  
record 2 (stream) record E (lookup)   

Thanks.
Andrea 
Four Stars

Re: Lookup sort order

Hi Andrea,
The file is read in the order in which it is sorted.
The tMap lookup order is as per the data flow.
But if you want a specific order, you can insert a sort component. This will ensure that tMap processes the data in the order you expect.
Vaibhav
One Star

Re: Lookup sort order

Hi Vaibhav,
I don't want to use a sort component!
I need the input file order to be preserved without further operations, because the main stream input file is quite big (45,000,000 records).
So I need the join output file to be "naturally" ordered like the main stream, as I wrote above:
record 1 (stream) record A (lookup)
record 1 (stream) record B (lookup) 
record 2 (stream) record C (lookup) 
record 2 (stream) record D (lookup)  
record 2 (stream) record E (lookup)   
....
record 45000000 (stream) record Z (lookup)    
record 45000000 (stream) record F (lookup)     
Thanks.
A
One Star

Re: Lookup sort order

tMap data lookup order is as per the data flow... 
Vaibhav

I can't understand the phrase "as per the data flow".
Do you mean that the order is preserved as desired?
Andrea
Four Stars

Re: Lookup sort order

The lookup concept is this: tMap first loads all the records from the lookup connection into memory. Once they are loaded, it takes the first record from main, applies your join and delivers the output records, then moves on to the second record, and so on.
As you have chosen all matches, i.e. 1 --> N, logically the output should follow the main order, with the matches for each main record in the exact order of the data in the lookup, and not otherwise.
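To make the description above concrete, here is a minimal Java sketch of that load-then-probe behaviour. It is not Talend's generated code: the class and method names (`LookupOrderSketch`, `loadLookup`, `join`) are made up for illustration, and it assumes an inner join returning all matches, run sequentially in a single thread. Under those assumptions the output follows the main stream order, even if the lookup file itself is interleaved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupOrderSketch {

    // Hypothetical stand-in for the "load all lookup records first" step.
    // Each lookup row is {joinKey, value}; rows sharing a key keep their read order.
    static Map<String, List<String>> loadLookup(List<String[]> lookupRows) {
        Map<String, List<String>> byKey = new HashMap<>();
        for (String[] row : lookupRows) {
            byKey.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return byKey;
    }

    // Hypothetical stand-in for the main flow: one main record at a time,
    // all of its matches are emitted before the next main record is read.
    static List<String> join(List<String> mainKeys, Map<String, List<String>> lookup) {
        List<String> out = new ArrayList<>();
        for (String key : mainKeys) {
            // Assumption: inner join, so main rows without a match produce nothing.
            for (String match : lookup.getOrDefault(key, Collections.<String>emptyList())) {
                out.add("record " + key + " (stream) record " + match + " (lookup)");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Lookup rows deliberately interleaved between keys 1 and 2.
        List<String[]> lookupRows = Arrays.asList(
                new String[] {"1", "A"},
                new String[] {"2", "C"},
                new String[] {"1", "B"},
                new String[] {"2", "D"},
                new String[] {"2", "E"});
        List<String> mainKeys = Arrays.asList("1", "2");

        for (String line : join(mainKeys, loadLookup(lookupRows))) {
            System.out.println(line);
        }
    }
}
```

In this sketch the outer loop over the main records is the only thing that drives the output, so the lookup file's own order only affects the order of matches within one main record. It says nothing about jobs configured to process the main flow in parallel.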
If there is any discrepancy in this answer, someone from Talend or another member may give the correct answer.
Thanks
Vaibhav
One Star

Re: Lookup sort order

Thanks Vaibhav. That was my understanding too.
I'm also waiting for an "official" answer from the Talend team.
Andrea
Moderator

Re: Lookup sort order

Hi,
All the lookup data is loaded first (into memory or onto disk), and then each record in main is matched against all records in the lookup, row by row.
Let us know if this is clear to you.
Best regards
Sabrina
One Star

Re: Lookup sort order

Hi,
All the lookup data is loaded first (into memory or onto disk), and then each record in main is matched against all records in the lookup, row by row.
Let us know if this is clear to you.
Best regards
Sabrina

Thanks Sabrina.
I understand what is written, but my question is whether the match order described is also the write order, or whether in some (unlikely) cases this order is not preserved in the output.

Sorry if I'm being pedantic, but I have to decide whether or not to sort GBs of critical data, and I'd like to base that decision on the software architecture rather than on data/user experience, even if that experience is very good (many, many thanks to sanvaibhav).

Thanks a lot.
Andrea