[resolved] Weighted Average

One Star

[resolved] Weighted Average

Hey,
Do we have any component or any way to calculate weighted Average in Talend?
I am using tAggregate to compute sum, average and count. But i didn't find any function to compute Weighted Average.
I know we can do it in tMap by writing the expression but is their any easier way?
Thanks!

Accepted Solutions
Employee

Re: [resolved] Weighted Average

Hi,
I'm going to answer to your first use case scenario.
Please see the Screenshot JOB.
As you can see, I'm using tMap, tAggregate and tSort jsut to make the data sorted like you have in your OUtput sample. I haven't made any specific functions rather than using classical operation.
Let me describe the flow :
Step #1 : you have to start with the Metadata of your File; I use Integer for the DealID, and Double for the 2 other columns. (see MetadataForInput_File)
Step #2 : I'm using a tMap to add a new column named Cost which is the operation Quantity*Price (see tMap1)
Step #3 : I'm using the tAggregate component to group by DealId and sum each column Quanity and Cost (see tAggregate1)
Step #4 : then I'm using another tMap to Divide the sumCost by sumQuantity to get the Weighted average. (see tMap2)
Step #5 : I finally sort by Deal Id through the tSortRow. (see tSort)
Best regards;

All Replies
Four Stars

Re: [resolved] Weighted Average

One Star

Re: [resolved] Weighted Average

Yeah i did. But unfortunately i am not using MySql. I am using MongoDB database.
Also i wanted it to be dynamic.
One Star

Re: [resolved] Weighted Average

Any suggestion how to proceed?
Employee

Re: [resolved] Weighted Average

Hi,
I would definitely use a combination of the tAggregateRow and the tMap with the custom function to make your scenario happen.
If you could add more about your scenario such a Data Set as Input, and the expected Output; I'm sure people will provide more insights and solutions to your use case.
Best regards;
One Star

Re: [resolved] Weighted Average

hey,
I have the following scenario:
DealId Quantity Price
1 10 5
1 20 4
2 7 10
1 10 6
3 12 5
3 15 4
Now i want to find the weighted average based on dealId so the output should be like
DealId Weighted average
1 (10*5 + 20*4 + 10*6)/10+20+10 = 4.75
2 (7*10)/10 = 7
3 (12*5 + 15*4)/27 = 4.44
Thanks!
One Star

Re: [resolved] Weighted Average

Also i want to Rank them or provide sequence number based on quantity (in ascending order so lowest quantity no:1 and so on)
Basically trying to group them based on DealId and then Rank them among that subgroup based on Quantity. Also same quantity have same Rank.
So the scenario would be like this:
Input:
DealId Quantity Price
1 10 5
1 20 4
2 7 10
1 10 6
3 12 5
3 15 4
3 16 5
Output:
DealId Quantity Price Rank
1 10 5 1
1 20 4 2
2 7 10 1
1 10 6 1
3 12 5 1
3 15 4 2
3 16 5 3
Thanks for the support in advance!
Employee

Re: [resolved] Weighted Average

Hi,
I'm going to answer to your first use case scenario.
Please see the Screenshot JOB.
As you can see, I'm using tMap, tAggregate and tSort jsut to make the data sorted like you have in your OUtput sample. I haven't made any specific functions rather than using classical operation.
Let me describe the flow :
Step #1 : you have to start with the Metadata of your File; I use Integer for the DealID, and Double for the 2 other columns. (see MetadataForInput_File)
Step #2 : I'm using a tMap to add a new column named Cost which is the operation Quantity*Price (see tMap1)
Step #3 : I'm using the tAggregate component to group by DealId and sum each column Quanity and Cost (see tAggregate1)
Step #4 : then I'm using another tMap to Divide the sumCost by sumQuantity to get the Weighted average. (see tMap2)
Step #5 : I finally sort by Deal Id through the tSortRow. (see tSort)
Best regards;
One Star

Re: [resolved] Weighted Average

Thanks a lot for the help. It works fine in my use case.
Really Appreciate the help.
Looking forward for the 2nd answer.
One Star

Re: [resolved] Weighted Average

Hi,
I'm going to answer to your first use case scenario.
Please see the Screenshot JOB.
As you can see, I'm using tMap, tAggregate and tSort jsut to make the data sorted like you have in your OUtput sample. I haven't made any specific functions rather than using classical operation.
Best regards;

Hey, I appreciate the help provided.
I have a query.
This approach is a row level processing which we do in general programming.
As Talend is an ETL tool, I was looking for some column level processing (or Bulk processing).
Because when I will have huge data this processing will take some time.
Please let me know is thier any other or efficient way to achieve this.
Thanks a lot for all the support!
Employee

Re: [resolved] Weighted Average

We could definitely turn this Job as ELT/SQL Pushdown mode to scale if your data become huge.
Beside the Aggregate and Sort operations; the rest of the other operations will work just fine in the Design I did and you won't benefit any better performance if you want to take advantage of your Column level processing approach.
You alternative are :
#1 you bulk load your Input file in a RDBMS database and you perform your operation using our ELTMap capabilities.
#2 you adopt a best of breed approach; you load in Databse; the result of the first tMap1; then you perform the Aggregate and Sort in database using the ELT approach; and then you read/extract the Table perform the tMap2 operation in order to produce back your output file expected.
Best regards;