My requirement is to maintain the history of records by writing own logic in tmap.
I couldn't achieve this task and I'm trying to my best.
My workflow is as below
Oracleinput(sourcetable)---->tmap---->oracleoutput(action on table INSERT) catch lookup inner rejects false
oracleOutput(action on table(INSERT) but catch lookup inner rejects false
Source table consists of (Id,name,salary,start_date) and In target table(Id,name,salary,Start_date,End_date)
Source records looks like below
Id Name Salary Start_date
1 John 1000 24-may-2017
2 Shree 2000 24-may-2017
If there is any update in source records like salary increments from 1000 to 3000
expected output is:
Id Name Salary Start_date End_Date
1 John 1000 24-may-2017 25-may-2017
2 Shree 2000 24-may-2017 null
1 John 3000 25-may-2017 null
I have tried to maintain surrogate key in the target table to achieve this but it's failing.
Please suggest me or show me how to achieve this.
The same scenario I have implemented IN Informatica power center with Md5 concept and sequence row generator.
But In Talend, I really don't know how to implement the same
Looking forward to the possible solution!
NOTE: DON'T want to use SCD components
Thanks in advance.
First of all, you are essentially creating a slowly changing dimension, so why not use the SCD components? They will do this for you easily.
However, I can see that this might be an interesting challenge, so i can suggest the following way of achieving this. What you are essentially looking to do is find where the main row does NOT entirely match the corresponding lookup row where the ID matches. Your main row is your source data and I am assuming your lookup is also your target data. If that is the case, you need to do the following....
1) Join your main to your tMap first
2) Join your lookup to the tMap and join the main to the lookup on ID. The query for this table should have a WHERE CLAUSE that states "WHERE end_date is null" since that is the latest record. Use a left outer join (to capture main rows where the ID has never existed in the lookup table
3) Use tMap variables to compare each of the column values between the main and lookup (where a join is found). Where they all match, there is nothing to update. Where there is a change, then you need to have two outputs. 1 to a component which inserts a new record with a null end_date and 1 to a component which will update the lookup record to have an end _date of that day.
The key to use for this should be an auto generated key in the output/lookup table. When you do the join to the lookup, do not join on this key, join on the ID (which will be repeated with new versions of the record). The update of the record (where needed) will always be carried out using the unique autogenerated key field.
You will need to understand how to handle updates (this is demonstrated in the Talend documentation). This method *should* give you what you want
Here I have shown my job design ,someone please helpme to get solution.Thanks in advance
1.jobDesign: getting the max count of the surrogate key from tjavaflex and also further mapping.
2. Tmap conditions
All Matches and Inner join on the left-hand side
On right-hand side ----> toracleoutput1 -----> Insert(action on table)
----->toracleoutput2 ------>when there are changes have to insert as new record with end date as "null". Condition am using here is
"row2.ID==row3.ID && !(row2.NAME.equals(row3.NAME)) && row2.AGE != row3.AGE"
------>toracleoutput3 -----> update the end date for the existing record which has changed.
I kept oracleoutput2 (Insert as action on table)and oracleoutput3 (update as action on table)
First time when I run the job all the records from source table inserted into target table and once I update some record on source table and running the job again but there are zero rows affected.
can any one tell me What mistake I'm doing here?
Can anyone tell me where m going wrong.
i am not getting Start date and end date as per my requirement.
Why you created two pipelines for type2 logic. You can use single flow to insert new/insert changed/Update inactive records.
the flow for your requirement will be like below.
temp_dim (select max(key) from temp_dim)-- > tJavarow(to assign max value to context variable)
Depending on first flow
source(receiving latest data) --> tmap (used to lookup on temp_dim and flag records for insert,update)-- > target
@vboppudi Thanks for quick response.
Can you please show me javarow component and variable expression.
if you dont mind can you please explain once more with all component.
Watch the recorded webinar!
Create systems and workflow to manage clean data ingestion and data transformation.
Introduction to Talend Open Studio for Data Integration.
Test drive Talend's enterprise products.