Compare row value against a value from the previous row

Community Manager

Compare row value against a value from the previous row

Sometimes it is useful to be able to compare a column against the same column in a previous row within a dataset. A simple example of this requirement might be as follows. You have a collection of orders that are sorted by Order Number and Date. An Order Number is usually unique, however there can be amendments made. You are tasked with identifying original Orders and Amended Orders. Original Orders will always be the first Order record created. Amendment Orders are always created after the Original Order. The table below is an example of the dataset....


Order NumberDateValue
A12321/01/2015 10:23:45£123.40
A12321/01/2015 10:34:48£145.50
A12322/01/2015 09:21:21£151.45
B23421/01/2015 11:12:32£23.45
B23421/01/2015 11:45:54£43.65


In order to identify the Amendment Orders from the Original Orders, we need to check the Order Number of the row before. If it is the same, then the current record is an Amendment Order, if it is different it is an Original Order.


We can do this in Talend quite simply using a tMap component and tMap variables. One of the most overlooked features of the tMap component is the fact that the tMap variables can store values between rows and they are processed in order (top down). This allows us to store values between rows and compare them with current rows. The screenshot below shows the layout of the tMap component. I will explain the expressions below the screenshot.





The variable expressions are shown and explained in the table below....

Variable Expression Description
order_typeVar.last_order!=null && Var.last_order.compareToIgnoreCase(row2.OrderNumber)==0
? "Amendment Order" : "Original Order" 
This variable expression is tested first. First the "last_order" variable is checked to see if it is not null and to see if it matches the current "OrderNumber". Since the "last_order" variable has not been assigned yet for the first row, it will be null. For subsequent rows it will hold the value of the previous row. If the "OrderNumber" value matches the value assigned to "last_order" in the previous row, it means it is an "Amendment Order". Otherwise it is an "Original Order".
last_orderrow2.OrderNumber The "last_order" variable is checked first and assigned second due to the order of processing the variables. 


The result of the tMap processing the data in the table at the top is shown below.....

[statistics] connecting to socket on port 4007
[statistics] connected
A123|2015-01-21T10:23:45|123.4|Original Order
A123|2015-01-21T10:34:48|145.5|Amendment Order
A123|2015-01-22T09:21:21|151.45|Amendment Order
B234|2015-01-21T11:12:32|23.45|Original Order
B234|2015-01-21T11:45:54|43.65|Amendment Order
[statistics] disconnected
Eight Stars

Re: Compare row value against a value from the previous row

Thank You, in addition to answering my basic question of how to refer a previous variable in the next variable, it also educates me on how to compare column values between current and previous rows using "compareToIgnoreCase"


Re: Compare row value against a value from the previous row

This solution is only possible in Standard jobs.

For Spark job you would need to do something different.

Check out this video:

It explains a use case where the previous record needs to be considered in Spark.

Eight Stars

Re: Compare row value against a value from the previous row

Thanks for letting us know.


Talend named a Leader.

Get your copy


Kickstart your first data integration and ETL projects.

Download now

6 Ways to Start Utilizing Machine Learning with Amazon We Services and Talend

Look at6 ways to start utilizing Machine Learning with Amazon We Services and Talend


Downloads and Trials

Test drive Talend's enterprise products.