How to implement Incremental Load logic on the Fact table

Six Stars

How to implement Incremental Load logic on the Fact table

Hello All,

 

I have a fact table that is loaded in full every day, meaning it is truncated and reloaded daily (about 5 million records each day).

I have around 14 lookup dimension tables that feed the final fact table.

 

Now the client has asked me to implement incremental logic on this fact table. I was asked not to use the CDC component, as the client said they cannot make any changes to the source tables or the source database because of their policies.

 

Can someone please help me implement this logic? I have never implemented incremental logic on a fact table before.

 

Below is how my job looks; FCT_SN_FINAID_OPERATIONS is the final table being loaded.

 

[Screenshot attached: Capture.PNG]

Thank You


Accepted Solutions
Employee

Re: How to implement Incremental Load logic on the Fact table

Hi,

 

    When you are implementing incremental logic, you need to check whether the table already contains the data, based on key columns. If the data is present, pick the keys from those records; any change to them is an update to the fact table. If the data is not present, you can do a straight insert.
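The insert-vs-update decision described above can be sketched as a minimal upsert in plain Python. This is only an illustration of the rule, not the Talend job; the table and column names (op_id, amt) are assumptions.

```python
# Minimal upsert sketch: the fact table is modeled as a dict keyed on the
# business-key columns. If the key already exists, the incoming record
# overwrites the stored one (update); otherwise it is added (insert).
# Table/column names are illustrative, not from the original job.

def upsert(fact, incoming, key_fields):
    for row in incoming:
        key = tuple(row[f] for f in key_fields)
        fact[key] = row  # update if key present, insert if not
    return fact

fact = {(100,): {"op_id": 100, "amt": 4}}
incoming = [
    {"op_id": 100, "amt": 5},  # key exists  -> update
    {"op_id": 101, "amt": 7},  # key missing -> insert
]
fact = upsert(fact, incoming, ["op_id"])
```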

 

     Now your question might be how to differentiate the records to insert from the records to update based on key columns. It can be done with the following steps.

 

a) Do an inner join between your stage table (where you have the full data from the source) and the fact table, based on the key columns. This gives you the records that are already present. Now use this dataset as a lookup in a flow where you read the input stage table as the main flow again.

 

b) Do an inner join between both datasets, but in the output section select the option to pick only the inner-join reject records. This means you are picking only those records which are not present in the DB; these records can be sent directly for insert.
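Steps a) and b) together can be sketched in plain Python: collect the fact table's existing keys, then split the stage rows into updates (matches) and inserts (inner-join rejects). This is a sketch of the idea, not the Talend tMap flow itself, and the column names are assumptions.

```python
# Sketch of steps a) and b): gather the fact table's keys (step a), then
# split the stage rows -- matches become updates, inner-join rejects
# become inserts (step b). Names are illustrative, not from the job.

def incremental_split(stage_rows, fact_rows, key_fields):
    def key_of(row):
        return tuple(row[f] for f in key_fields)

    fact_keys = {key_of(r) for r in fact_rows}  # step a: keys already in DB
    to_update = [r for r in stage_rows if key_of(r) in fact_keys]
    # step b: inner-join rejects = stage rows with no match in the fact table
    to_insert = [r for r in stage_rows if key_of(r) not in fact_keys]
    return to_insert, to_update

stage = [{"op_id": 100, "amt": 5}, {"op_id": 101, "amt": 7}]
fact = [{"op_id": 100, "amt": 4}]
inserts, updates = incremental_split(stage, fact, ["op_id"])
```

In the Talend job this split would be done in a tMap with the fact-table dataset as a lookup; the code above only mirrors the match/reject partitioning.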

[Screenshot attached: image.png]

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved :-)


All Replies

Six Stars

Re: How to implement Incremental Load logic on the Fact table

Thank you very much. I was able to implement what I needed based on your inputs.
