Four Stars

Normalization on multiple input columns having additional filters and transformations.

It’s about Normalization. We need to perform normalization on multiple input columns. Below is the Scenario:

Suppose we have below fields:

 

Input_Field_Name

Output_Field_Name

MID

Additional_MID

Name

Additional_Name

Item_Code1

Additional_item_no

Item _Desc1

Additional_item_dsc

Item _Code2

 

Item _Desc2

 

 

We want to do the normalization on  field names like "Item_Code" with having some additional filter conditions on field values, as mentioned below:

 

If field_name like " Item_Code" and field_value!=1

then

output_field.Additional_MID = input_field.MID;

output_field.Additional_Name = Substring (input_field.Name,1,3)

output_field.Additional_item_no = field_name;

output_field.Additional_item_dsc = ("Item_Desc" + field_name.replaceAll("\\D", "")));

 

In this case, if we have one input row  then we must get 2 rows at output (considering all conditions are satisfied for normalization). Please find below example:

 

Input row:

MID|Name|Item_Code1|Item _Desc1|Item_Code2|Item _Desc2

123|ABCD|1001|Sale|1002|Perches

 

Expected Output should be:

Additional_MID|Additional_Name|Additional_item_no|Additional_item_dsc

123|ABC|Item_Code1| Item_Desc1

123|ABC|Item_Code2| Item_Desc2

 

Could you please let us know the Talend component to implement above scenario?

2 REPLIES
Four Stars

Re: Normalization on multiple input columns having additional filters and transformations.

Hi,

 

We have tNormalize component in talend,but you have to design the talend job appropriately to achieve your requirement.

 

Thanks,

Premchand

Seven Stars

Re: Normalization on multiple input columns having additional filters and transformations.

Hi,

 

i have created a flow with your data. Please check this.

Field logic in Tmap:

item_no:

row1.id.equals("1") ?  row3.Item_Code1 :  row3.Item_Code2 

 

item_dsc:

row1.id.equals("2") ?  row3.Item_Desc1 : row3.Item_Desc2 

 

Below screenshots will let you know the job flow.

 

Note: Use your source data as lookup and and dummy data(eg: tFixedflowInput2) as main flow.

 

1.JPG2.JPG3.JPG4.JPG

 

Out put:

 

Starting job test at 18:48 21/04/2017.

[statistics] connecting to socket on port 4069
[statistics] connected
.---+----+-------+--------.
| tLogRow_1 |
|=--+----+-------+-------=|
|MID|Name|item_no|item_dsc|
|=--+----+-------+-------=|
|123|ABCD|1001 |Perches |
|123|ABCD|1002 |Sale |
'---+----+-------+--------'

[statistics] disconnected
Job test ended at 18:48 21/04/2017. [exit code=0]

 

Hope you understand the solution. Post back if you have any questions.

 

Thanks,
Sid
Please like the post if it is useful
Please put to resolved if it resolves your issue.

Thanks,
Sid
Mark as solution if this resolved your issue