Six Stars

Calculation based on customer id

How to get the actual value from the cummulative value in talend ? for example On a paricular a car runs for 1000 kms next day the car will start from 1001 kms ,So what i need is how to calculate the total kms covered on a particular day
1000kms on 27-10-2017 2000kms on 28-10-2017
so the total kms covered on 28-10-2017 is 1000kms how to do that in talend for which i need to update around 1 lakh history records using the above method and this needs to be done in mongodb
And there are two customers with customer id 1&2
For 2017-11-27
1 has 1000
2 has 2000
For 2017-11-28
1 has 2000
2 has 3000
How to subtract cummlative values based on the customer id so that on
2017-11-28 I will get
1 has 1000
2 has 1000
Thanks
14 REPLIES
Ten Stars

Re: Calculation based on customer id

Select the data for both days separately. Join the result sets on customer id so you have values for both days on the same row, using a tMap. Create the desired output schema in the tMap, calculating the difference between the two values as the output.
Twelve Stars

Re: Calculation based on customer id

OK, assuming that your data looks similar to this.....

.---+-----+----------.
|     tLogRow_2      |
|=--+-----+---------=|
|day|miles|customerId|
|=--+-----+---------=|
|1  |100.0|1         |
|1  |80.0 |2         |
|2  |150.0|1         |
|3  |175.0|1         |
|2  |201.0|2         |
|3  |212.0|2         |
|4  |230.0|1         |
|5  |301.0|1         |
|4  |321.0|2         |
'---+-----+----------'

....you can solve this in the following way. Here is a screenshot of a job I have knocked up quickly.....

job_layout.png

I use a tFiexedFlowInput to hard code some values. This will be your data source. I use a tSortRow to sort by CustomerID and then Day. I do most of the work in the tMap. Below you can see the tFixedFlow and tSortRow configs....

 

fixed_flow.pngSort.png

 

Now, the tMap is where the work is carried. I use a very similar mechanism to the one I showed you last time. It revolves around the ability of the tMap variables to hold the previous row's values. A screenshot can be seen below....

tmap_example.png

 

Like the other example, the tMap variables must configured EXACTLY as they are here.

customerChanged

Used to identify if the customer ID has changed. This works because the sort order is ordered by CustomerID first.

Var.lastCustomerID==null || Var.lastCustomerID!=row2.customerId


lastCustomerID
Used to remember the last customer ID

row2.customerId


todaysDistance
Used to calculate todays distance. Very similar to last example, but factors in the change in Customer ID

Var.customerChanged ? row2.miles : row2.miles - Var.lastDistance


lastDistance
The same as the previous example. Used to remeber the previous distance

row2.miles

 

Now, this can be a bit tricky to figure out, but it does work. The output from this (assuming the data above) was this.....

.----------+---+--------------.
|          tLogRow_1          |
|=---------+---+-------------=|
|customerId|day|todaysDistance|
|=---------+---+-------------=|
|1         |1  |100.0         |
|1         |2  |50.0          |
|1         |3  |25.0          |
|1         |4  |55.0          |
|1         |5  |71.0          |
|2         |1  |80.0          |
|2         |2  |121.0         |
|2         |3  |11.0          |
|2         |4  |109.0         |
'----------+---+--------------'

 

 

Rilhia Solutions
Six Stars

Re: Calculation based on customer id

HI ,

Thanks for the solution but it is failing in certain conditions sometimes the condition is failing ,Like same cummulative value is reflected in the today distance travelled .What should be done for that case

Twelve Stars

Re: Calculation based on customer id

You will need to send me a data scenario where it fails. This process should cover the scenarios where you get results daily for every customerID, if you have followed it closely. If you have days missing then you might experience some weird results, but they can be rectified by calculating how many days are missing and then averaging the driving distance across those days OR assuming the whole mileage was for a particular day. But these are just refinements.

Rilhia Solutions
Six Stars

Re: Calculation based on customer id

Hi ,

yes i am having missing dates in between how to fill those days with zero? 

 

Twelve Stars

Re: Calculation based on customer id

If you want to fill them with zero, then it probably makes sense to do that as a separate process after you have calculated the mileage between days (groups of days). It is a different problem to this one.

Rilhia Solutions
Six Stars

Re: Calculation based on customer id

How to populate missing dates with the value of previous date say for example

25-11-2017 1000

27-11-2017 2000

the missing date is 26-11-2017 how to fill it up with 25-11-2017 value using talend

Twelve Stars

Re: Calculation based on customer id

This is a different question. Can you raise this question separately as it is a good question that deserves it's own spot. People searching for this sort of solution in the future will not find it if it is answered after another question with a title that does not relate to this.

 

If the previous question was answered, can you accept the solution.

Rilhia Solutions
Six Stars

Re: Calculation based on customer id

Hi ,

 

Please find the attachment for the tmap settings as you have said.Please let me what needs to be done

Twelve Stars

Re: Calculation based on customer id

This doesn't really help. I cannot see the complete expressions and I cannot see what is wrong with the data. I *might* be able to figure something out if I could see the source data and how it is returned, but at the moment I do not even understand what the problem is.

Rilhia Solutions
Six Stars

Re: Calculation based on customer id

1. Customer_id changed ==Var.lastCustomerID==null || Var.lastCustomerID! --->customerid_changed

 

2.row2.customer_id --> lastCustomerID

 

3.Var.customerid_changed ? row2.f1_water_pump : row2.f1_water_pump - Var.last_water_produced  --> var1

 

4.row2.f1_water_pump -->last_water_produced

 

Please find the attachment for the screenshots for your reference

Twelve Stars

Re: Calculation based on customer id

These expressions look wrong. For example,

 

Customer_id changed ==Var.lastCustomerID==null || Var.lastCustomerID! --->customerid_changed

....will not even compile. 

 

Either you have not copied exactly what I supplied or a copy an paste error has occurred here.

 

Can you double check what you have sent here please?

 

 

 

Rilhia Solutions
Six Stars

Re: Calculation based on customer id

 Var.lastCustomerID==null || Var.lastCustomerID!=row2.customerId -->customerid_changed sorry i have pasted here wrongly

Twelve Stars

Re: Calculation based on customer id

OK. They look OK. 

 

I need to see the data that fails. What goes in and what goes out?

Rilhia Solutions