Loading Fact Table

One Star

Loading Fact Table

Below is an example of what I am trying to accomplish in Talend; I am interested in knowing how it can be done:
To those familiar with Kimball data marts, the question is: how can I look up the surrogate key from a dimension table (that uses SCD2) when loading data into the fact table?
I have a customer dimension that is populated using SCD2 for address columns.
A customer changes her address on 10/10/2007. This would mean that the currently active record (with the old address) would be 'terminated' on 10/10/2007 and a new row (with the new address) would be inserted into the customer dimension with an effective date of 10/10/2007.
Now, suppose I am loading transactions (into the fact table) for the month of October 2007 into the data mart. I want the transactions before 10/10/2007 to use the customer surrogate key based on the old address, and the transactions after the address change to use the 'new' customer surrogate key that corresponds to the new address.
To get the right customer surrogate key based on the transaction date/time, I need a way to look up the surrogate key where:
1. The business key from stream (customer_id) matches the business key in the dimension
2. The transaction date/time (from the source) is between the dimension record's effective and expire date.
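The two conditions above can be sketched in plain Java (outside Talend) to make the matching rule concrete. The class and column names here are illustrative assumptions; the expire date is treated as exclusive, with null meaning the currently active row:

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

// Hypothetical SCD2 dimension row: surrogate key plus validity window.
class DimRow {
    int surrogateKey;
    String customerId;       // business key
    LocalDate effectiveDate; // inclusive
    LocalDate expireDate;    // exclusive; null = currently active row

    DimRow(int sk, String id, LocalDate eff, LocalDate exp) {
        surrogateKey = sk; customerId = id; effectiveDate = eff; expireDate = exp;
    }
}

public class Scd2Lookup {
    // Returns the surrogate key valid at txnDate, or -1 if no row matches.
    static int lookup(List<DimRow> dim, String customerId, LocalDate txnDate) {
        for (DimRow r : dim) {
            boolean keyMatch = r.customerId.equals(customerId);                  // condition 1
            boolean afterStart = !txnDate.isBefore(r.effectiveDate);             // condition 2a
            boolean beforeEnd = r.expireDate == null || txnDate.isBefore(r.expireDate); // condition 2b
            if (keyMatch && afterStart && beforeEnd) return r.surrogateKey;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<DimRow> dim = Arrays.asList(
            new DimRow(100, "C42", LocalDate.of(2000, 1, 1), LocalDate.of(2007, 10, 10)),
            new DimRow(101, "C42", LocalDate.of(2007, 10, 10), null));
        System.out.println(lookup(dim, "C42", LocalDate.of(2007, 10, 5)));  // old-address row
        System.out.println(lookup(dim, "C42", LocalDate.of(2007, 10, 15))); // new-address row
    }
}
```

With the address-change example above, a 10/5 transaction resolves to the old row's key and a 10/15 transaction resolves to the new row's key.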
So, my question is: what would be the best way to do the above surrogate key lookup, to make sure that the appropriate surrogate key from the dimension is loaded into the fact table?
Please let me know if there is any additional information I can provide.

Re: Loading Fact Table

If I understand your point, you can use the filter on the lookup in the tMap (see capture).
Be careful that the end date may be null (I don't handle this case in my example).
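To make the null-handling explicit, the lookup filter logic can be expressed as a small helper; the column names (`effective_date`, `end_date`) and the use of `java.util.Date` are assumptions mirroring a typical tMap expression:

```java
import java.util.Date;

public class ScdFilter {
    // Mirrors a tMap lookup-filter expression. A null end date means the
    // row is still active; effective date inclusive, end date exclusive.
    static boolean inWindow(Date txnDate, Date effectiveDate, Date endDate) {
        return !txnDate.before(effectiveDate)
            && (endDate == null || txnDate.before(endDate));
    }

    public static void main(String[] args) {
        // Inside the tMap filter itself this would read something like:
        // !row1.txn_date.before(lookup.effective_date)
        //   && (lookup.end_date == null || row1.txn_date.before(lookup.end_date))
        System.out.println(inWindow(new Date(1500), new Date(0), null)); // active row matches
    }
}
```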
One Star

Re: Loading Fact Table

Thank you Stephane - your technique lets me achieve what I need.
In terms of functionality, it would be great to have a component that does such a comparison and lookup by processing on the database, eliminating the need to:
1. Stream all dimension rows over the network. If the tMap waits for all rows to load before it starts processing, this could be a major bottleneck, especially if only a small percentage of dimension rows are needed for loading the fact table.
2. Store them in memory for the tMap. If the dimension is big, Talend might write to disk, increasing latency.
One Star

Re: Loading Fact Table

I would like to know the full steps and the code in the expression for the above. I have the same scenario, but the filter expression is giving an error.
Seventeen Stars

Re: Loading Fact Table

You can read the dimension row by row.
In the query of the lookup component, use the globalMap entries to filter the dataset.
The lookup input component should use a separate connection, to prevent the component from establishing a new connection for every lookup.
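A lookup query in that style might be built as follows. This is a sketch, not Talend-generated code: the globalMap keys (`row1.customer_id`, `row1.txn_date`), table, and column names are assumptions, and it presumes "Reload at each row" is enabled on the lookup flow so the query fires once per main-flow row:

```java
import java.util.HashMap;
import java.util.Map;

public class LookupQuery {
    // Builds the SQL for a reload-at-each-row lookup. In a real job the
    // tMap main flow stores the current business key and transaction date
    // into globalMap before each lookup fires, so the database returns
    // only the single matching SCD2 row instead of the whole dimension.
    static String buildQuery(Map<String, Object> globalMap) {
        return "SELECT surrogate_key FROM dim_customer"
             + " WHERE customer_id = '" + globalMap.get("row1.customer_id") + "'"
             + " AND effective_date <= '" + globalMap.get("row1.txn_date") + "'"
             + " AND (end_date IS NULL OR end_date > '" + globalMap.get("row1.txn_date") + "')";
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("row1.customer_id", "C42");
        globalMap.put("row1.txn_date", "2007-10-05");
        System.out.println(buildQuery(globalMap));
    }
}
```

Pushing the filter into the query this way addresses the earlier concern about streaming and caching the whole dimension, at the cost of one round trip per fact row, which is why the lookup needs its own dedicated connection.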

