One Star

[resolved] How to work on Accumulating Snapshot Fact table in TIS

My Fact table rely on a SCD Type 2 Dimension table. And my Fact table is in accumulating snapshot model.
Please let me know how to update the fact table in this situation. (Update existing row and insert a new row) - similar to SCD Type 2.
Is there a component for that?
I am using Jaspersoft ETL 5.4.1 Ent version
9 REPLIES
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Hello bkar81 - depending on database you're using, one clean and easy way to accomplish this would be a MERGE statement in a tDBRow (tOracleRow for Oracle etc...). Here's an example of what the statement would look like - http://psoug.org/reference/merge.html . With the MERGE, you can easily define how to insert or update rows.
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Thanks for the suggestion Willm.
My fact table has lot of foreign keys pointing to dimension tables and most of them are of SCD Type 2. In this case how can I achieve it?
Unfortunately, this is the requirement
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Let me take a step back and describe how your job could be set up to do this purely in Talend.
+ First update all your dimension tables with data from your source system (SCD1, SCD2 etc...)
+ Select data from your source system that includes all your business keys (so every row has business keys and measures (for the fact table)
+ Using tMap, look up the current row from each Dimension table; since they're SCD2, you'll have an active flag that shows what row is the current.
+ From the row for each dimension table, get the corresponding Surrogate Key; for example, CustID MAG7983 could be mapped to Surrogate Key 1123. Get that key and add to your output.
+ In that tMap, look up all your Dimensions, and make sure you use 'Left Join' to that main data flow and select 'Unique' records from the input 'table'.
+ Coming out of the tMap, you'll have all your Surrogate Keys and your measures, ready to be inserted / updated into the Fact Table.
+ Write this to a Staging Table or output file.
See screenshots... You will notice that the initial load of the Staging table uses a Process_Type field, which we set to 1. This will be the default for all new records from the source system.
Now when you have the data in the staging table, you need to determine what records in the Fact Table need to be updated. We do this by joining the Staging table to the Fact Table (identical schemas at this point) and using an Inner Join. We only join on as few keys as we expect to always have the first load. In this case, I don't expect us to have ShipDate, Receipt Date etc... So we don't join on them. Any row that matches completely based on SK matches is assigned a Process_Type of 2. The output from this step is used to update the same Staging Table. At this point, rows that should be used for update will have Process_Type 2, and those for Insert, Process_Type 1.
Finally, read the Staging table and update or insert (see screenshots for details).
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Wow... What a detailed explanation... Lemme try and get back to you
Thanks a lot
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Hi,
I have a small doubt..
Can I implement this through tMysqlSCD component instead of manually setting a flag to 1 or 2 (I know SCD is meant for Dimensions and not for fact)? I want to know which one would be faster and error free...
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

You can certainly try... The issue with that approach is that you'd be constrained to the way the SCD component works - you have to define SCD Start Date, SCD End Date, Active flag, plus a new Surrogate key. These would all be unnecessary and complicated for a Fact table - plus, you'd lose control on what fields you want to check on and how exactly you want to check...
The only twist to the approach I outlined here is having to check whether a fact had previously been inserted into the fact table. Otherwise, we normally just insert into fact tables.
Let us know how it goes...
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

And as far as faster and error free, I've used this simple design pattern (1 or 2...) many times across different projects. But don't take my word for it Smiley Happy. Put it to the test and let us know...
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Yep sure...
I need to design both and do a load test before commenting...
Thanks anyways.
One Star

Re: [resolved] How to work on Accumulating Snapshot Fact table in TIS

Is there a way to calculate the checksum inside talend (means either with tMap or any specific component), if so we can easily use the checksum field in tMap to determine the existence of a record in the Fact table