cumulative sum values group by column

One Star

cumulative sum values group by column

Hi All,
i am newbie to Talend
i am not able to find out Cumulative sum values. Any idea how to get output?
Example data is:
we have one input like
No|name|salary|Job
1|aaa|3000|Manager
2|bbb|4000|sales
3|sven|2500|marketing
4|joe|3250|Manager
5|jim|2340|sales
6|karl|1500|Manager
7|mac|3000|sales
we have another input as
Job|cum_salary
Manager|45000
sales|0
marketing|25000

output like
No|name|salary|Job|Cum_salary
1|aaa|3000|Manager|48000
4|joe|3250|Manager|51250
6|karl|1500|Manager|52750
2|bbb|4000|sales|4000
5|jim|2340|sales|6340
7|mac|3000|sales|9340
3|sven|2500|marketing|27500
Ex:
Job:Manager,
First row cum_salary is 45000 + 3000 = 48000
second row cum_salary is 48000 + 3250 = 51250
third row cum_salary is 51250 + 1500 = 52750
Job:sales,
First row cum_salary is 0 + 4000 = 4000
second row cum_salary is 4000 + 2340 = 6340
third row cum_salary is 6340 + 3000 = 9340
Job:Marketing
First row cum_salary is 25000 + 2500 = 27500
like that.......
we have cumulative salary based on Job
and like incremental loading
Can you please suggest on this.
cheers,
Raju
Seven Stars

Re: cumulative sum values group by column

I think simplest is to read your cum_salary table and store the cum_salary for each job in a globalMap variable:
tInput --> tJavaRow => globalMap.put(input_row.Job,input_row.cum_salary)
Then read your individuals table and add the salary:
tInput --> tJavaRow
Integer cum_salary = (Integer)globalMap.get(input_row.Job) + input_row.salary;
globalMap.put(input_row.Job,cum_salary);
output_row.No = input_row.No;
output_row.name = input_row.name;
output_row.salary = input_row.salary;
output_row.Job = input_row.Job;
output_row.Cum_salary= cum_salary;
Community Manager

Re: cumulative sum values group by column

Hi
Do a left outer join on tMap base on the Job column, here I create a demo job to show you how to achieve this request, please see my screenshots for details.
tFixedFlowInput_1:
1|aaa|3000|Manager
2|bbb|4000|sales
3|sven|2500|marketing
4|joe|3250|Manager
5|jim|2340|sales
6|karl|1500|Manager
7|mac|3000|sales
tFixedFlowInput_2:
Manager|45000
sales|0
marketing|25000

Cum_salary expression of out1 table on tMap:
(Integer)globalMap.get(""+row1.Job+"")==null?row1.salary+row2.Cum_salary:(Integer)globalMap.get(""+row1.Job+"")+row1.salary

// if (Integer)globalMap.get(""+row1.Job+"")==null, means it is the first row for one Job type.
Result:
Starting job jobA at 19:16 31/07/2013.

connecting to socket on port 3791
connected
.--+----+------+---------+----------.
| tLogRow_1 |
|=-+----+------+---------+---------=|
|No|name|salary|Job |Cum_salary|
|=-+----+------+---------+---------=|
|1 |aaa |3000 |Manager |48000 |
|2 |bbb |4000 |sales |4000 |
|3 |sven|2500 |marketing|27500 |
|4 |joe |3250 |Manager |51250 |
|5 |jim |2340 |sales |6340 |
|6 |karl|1500 |Manager |52750 |
|7 |mac |3000 |sales |9340 |
'--+----+------+---------+----------'
disconnected
Job jobA ended at 19:16 31/07/2013.

Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business

Re: cumulative sum values group by column

i have a requirement like
zoneid loss id(pk) month
z1 10 1 jan
z1 20 3 feb
z1 30 5 march
z2 100 2 jan
z2 200 4 feb
z2 300 6 march
now i need cumalative output like
zoneid loss id(pk) month cumalativeloss
z1 10 1 jan 10
z1 20 3 feb 30
z1 30 5 march 60
z2 100 2 jan 100
z2 200 4 feb 300
z2 300 6 march 600
Thanks,
raviteja
One Star

Re: cumulative sum values group by column

Hi,
Any update on the above give problem cause i m having the same issue as mentioned by raviteja.talend

Regards,
Ojas
Community Manager

Re: cumulative sum values group by column

Hi
@raviteja and Ojas, the tMemorizeRow component can fit your need, refer to the component manual and learn this component.
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Five Stars

Re: cumulative sum values group by column

@shong

Thank you for your detailed solution!

Could you please explain this part:

 

(Integer)globalMap.get(""+row1.Job+"")