Loading Fact Table

One Star

Loading Fact Table

Below is an example of what I am trying to accomplish in Talend and am interested in knowing how it can be done:
To those familiar with Kimball data marts, the question is ? how can I look up the surrogate key from a dimension table (that uses SCD2) for loading data into the fact table.
I have a customer dimension that is populated using SCD2 for address columns.
A customer changes her address on 10/10/2007. This would mean that the currently active record (with old address) would be 'terminated' on 10/10/2007 and 1 new row (with new address) would be inserted into the customer dimension with an effective date of '10/10/2007.
Now, If I am loading transactions (into the fact table) for the month of October 2007 into the data mart. I want the transactions before 10/10/2007 to use the customer surrogate key based on the old address and the transactions after the address change to use the ?new? customer surrogate key that corresponds to the new address.
To be able to get the right customer surrogate key based on the transaction date/time, I would need a way to look up the surrogate key where
1. The business key from stream (customer_id) matches the business key in the dimension
2. The transaction date/time (from the source) is between the dimension record's effective and expire date.
So, my question was - what would be the best way to do the above surrogate key look up to make sure that appropriate surrogate key from a dimension is loaded into the fact table.
Please let me know if there is any additional information I can provide.

Re: Loading Fact Table

If I understand your point, you can use the filter of the lookup in the mapper (see capture).
Be carefull that the end date may be null (I don't manage this point in my example).
One Star

Re: Loading Fact Table

Thank you Stephane - Your technique will let me achieve what I need to.
In terms of functionality - It would be great to have a component that does such a comparison and look up by processing on the database and eliminating the need to
1. Stream all dimension rows over the network. If mapper waits for all rows to load before starting processing - it could be a major bottleneck especially if a small percentage of dimension rows are needed for loading fact table
2. The need to store them in memory for mapper. If the dimension is big - talend might write to disk - increasing latency.
One Star

Re: Loading Fact Table

I would like to know the Full Steps and Code in the Expression for the below. I am having the same scenario, but the Filter Expression is giving an error.
Seventeen Stars

Re: Loading Fact Table

You can red the dimension row by row.
In the query of the lookup component use the globalMap entries to filter the datasets.
The lookup inout component should use a separate connection to prevent the component to establish for every lookup a new connection.

What’s New for Talend Spring ’19

Join us live for a sneak peek!

Sign up now

Definitive Guide to Data Quality

Create systems and workflow to manage clean data ingestion and data transformation.



Introduction to Talend Open Studio for Data Integration.


Downloads and Trials

Test drive Talend's enterprise products.