The Definitive Guide to Data Quality
I have had a go at running this in a single subjob and managed to get it to work.
I read in the fact data and immediately write all matched facts to the fact table.
Nonmatched facts are passed to a tmap that writes the new record to the dimension table that I have set to commit every 1 and insert 1 row at a time (in advanced settings). This tmap also copies the fact data to a second output flow.
The fact data passes to a third tmap that joins to the dimension table. I have set the dimension table to reload on every row, so that it will pick up the newly inserted dimension row. This tmap writes out to the fact table with the new key added.
This did work for me on very small volumes, so I can't really say that this will be more efficient than either of the methods above. I also have a reservation regarding how the DB and Talend handle the insert then the query and will it always be in synch? i.e. is there the possibility that the query will begin and complete before the dimension insert completes, thus causing a failure?
Here you go, hope this helps....