I'm trying to load a table (UNIQUE) from another table (TEMP) by filtering and processing my data.
In the source table (TEMP), I have 24 million rows without any ID or INDEX.
I expect 10 million out.
But with my job I load only 4.8 million rows. (Yet in the job stats I saw that about 10 million rows were processed in total.)
I have the impression that using tReplicate and loading into the same table (UNIQUE) from both branches is causing the problem.
Do I have to split my processing into two parts? (Reading the first table takes a very long time because it has no ID or index.)
Are you sure that your 2 tFilterRow criteria are mutually exclusive?
It might be better to remove the tReplicate and one of the tFilterRow components, then feed the rejects from the remaining tFilterRow into the tMap instead.
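As a rough illustration of why a single filter with a reject flow is safer than two independent filters, here is a minimal Python sketch (not Talend code). The column name NHIBID comes from this thread; the sample values and the helper names `split_by_filter` and `overlap_count` are hypothetical. With one predicate, every row lands in exactly one branch, so the two outputs cannot overlap or drop rows between them.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, int]


def split_by_filter(
    rows: Iterable[Row], accept: Callable[[Row], bool]
) -> Tuple[List[Row], List[Row]]:
    """Route each row to 'accepted' or 'rejected', like a filter with a reject flow."""
    accepted: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        (accepted if accept(row) else rejected).append(row)
    return accepted, rejected


def overlap_count(
    rows: Iterable[Row],
    first: Callable[[Row], bool],
    second: Callable[[Row], bool],
) -> int:
    """Count rows matched by both criteria; 0 means they are mutually exclusive on this data."""
    return sum(1 for row in rows if first(row) and second(row))


# Hypothetical sample data standing in for the TEMP table.
rows = [{"NHIBID": v} for v in (-1, 0, 2, -1, 5)]


def is_special(row: Row) -> bool:
    return row["NHIBID"] == -1


def is_regular(row: Row) -> bool:
    return row["NHIBID"] != -1


special, regular = split_by_filter(rows, is_special)
overlap = overlap_count(rows, is_special, is_regular)
```

Running the same kind of overlap check against the two real tFilterRow conditions (for example as a COUNT query on TEMP) would answer the mutual-exclusivity question directly.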
Yes, in the stats I have what I expect (10 million for NHIBID !=1 and 4475 for NHIBID=-1). In my DB I have the 4475 expected rows, but of the 10 million expected rows there are only 4.8 million.
I will try your solution, using the reject flow and a buffer with OnSubjobOk to be sure.
I don't know if it's possible to insert two data flows into one table at the same time.
Given the large difference in size between the two datasets, it might be better to write the 10 million rows out directly, and use a tHashOutput/tHashInput pair for the 4475.
Make sure you commit each output separately.
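To make the "commit separately" idea concrete, here is a small Python sketch using the standard `sqlite3` module as a stand-in for the real database. The table name `unique_rows`, the batch contents, and the helper `load_batch` are assumptions for illustration only; in a Talend job the equivalent would be separate commit settings on each output component. Each batch runs in its own transaction, and the stored row count is compared with the number of rows sent, which is the same check as comparing job stats with the DB.

```python
import sqlite3
from typing import List, Tuple


def load_batch(conn: sqlite3.Connection, rows: List[Tuple[int]]) -> int:
    """Insert one batch and commit it on its own; roll back only this batch on failure."""
    try:
        conn.executemany("INSERT INTO unique_rows (nhibid) VALUES (?)", rows)
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        raise
    return len(rows)


# In-memory database standing in for the target table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unique_rows (nhibid INTEGER)")
conn.commit()

large_batch = [(v,) for v in range(1000)]  # stands in for the 10 million rows
small_batch = [(-1,)] * 5                  # stands in for the 4475 rows

# Load the large flow first, then the small one, each with its own commit.
loaded = load_batch(conn, large_batch) + load_batch(conn, small_batch)

# Compare what was sent with what is actually stored.
stored = conn.execute("SELECT COUNT(*) FROM unique_rows").fetchone()[0]
conn.close()
```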