TOS - Help with Data transformation logic

Four Stars

Hello everyone, I have to build an ETL job and I am stuck on the transformation logic; maybe someone can help me. Simplified a lot, my file is structured as below:

 

ACTIVITY  IN  OUT
A         1   2
A         2   3
A         3   4
A         6   7

The logic is that if there are 2 or more sequential rows (a row's OUT matches the next row's IN), I have to aggregate them, keeping the minimum IN and the maximum OUT. So the result has to be:

ACTIVITY  IN  OUT
A         1   4
A         6   7

If there were just the first 3 rows, it would be easy using a tAggregateRow, but I don't know how to tell Talend that it has to aggregate only the sequential ones. (In reality, IN and OUT are dates.)

Employee

Re: TOS - Help with Data transformation logic

Basically, you should think of creating partition buckets: each run of sequential rows gets its own bucket. Once you have your partition buckets, you just do a min on IN and a max on OUT per bucket group and you will have the answer.
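The bucket idea can be sketched in plain Java, outside Talend. This is only a minimal illustration and not the code from this reply: the class name `BucketSketch` and both method names are made up, rows are plain `{IN, OUT}` integer pairs for a single activity, and the input is assumed to be sorted on IN.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Talend.
class BucketSketch {

    // Assigns a bucket number to each row. Rows are {IN, OUT} and must be
    // sorted on IN. A new bucket starts whenever a row's IN differs from
    // the previous row's OUT.
    static int[] assignBuckets(List<int[]> rows) {
        int[] buckets = new int[rows.size()];
        int bucket = 0;
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0 && rows.get(i)[0] != rows.get(i - 1)[1]) {
                bucket++; // chain is broken, open a new bucket
            }
            buckets[i] = bucket;
        }
        return buckets;
    }

    // Collapses each bucket to a single {min IN, max OUT} row.
    static List<int[]> aggregate(List<int[]> rows, int[] buckets) {
        Map<Integer, int[]> byBucket = new LinkedHashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            int[] row = rows.get(i);
            int[] agg = byBucket.get(buckets[i]);
            if (agg == null) {
                byBucket.put(buckets[i], new int[] {row[0], row[1]});
            } else {
                agg[0] = Math.min(agg[0], row[0]);
                agg[1] = Math.max(agg[1], row[1]);
            }
        }
        return new ArrayList<>(byBucket.values());
    }

    public static void main(String[] args) {
        List<int[]> rows = new ArrayList<>();
        rows.add(new int[] {1, 2});
        rows.add(new int[] {2, 3});
        rows.add(new int[] {3, 4});
        rows.add(new int[] {6, 7});
        for (int[] r : aggregate(rows, assignBuckets(rows))) {
            System.out.println("A " + r[0] + " " + r[1]);
        }
    }
}
```

With the sample rows from the question, the first three rows land in bucket 0 and the last row in bucket 1, so the two aggregated rows are `A 1 4` and `A 6 7`. In a job, the bucket number could presumably be carried as an extra column and used as the group-by key of an aggregation step.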

 

Here is an example using integers, but you can use the same logic for dates. I have done it with a tJavaFlex and some code, as that is the fastest approach if your data is sorted on IN and looks just like the example below.

[Attached screenshots: 1.png, 2.png, 3.png, 4.png]
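For readers who cannot see the attachments, here is a rough single-pass sketch of the same logic using dates. It is not the code from the screenshots: `SequentialMerge` and `Row` are hypothetical names standing in for the flow schema, the input is assumed to be sorted on ACTIVITY and then IN, and a bucket is also closed whenever the activity changes (an assumption the original question does not spell out).

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone version of the row-by-row logic.
class SequentialMerge {

    // Hypothetical row type mirroring the ACTIVITY / IN / OUT columns.
    static final class Row {
        final String activity;
        final LocalDate in;
        final LocalDate out;

        Row(String activity, LocalDate in, LocalDate out) {
            this.activity = activity;
            this.in = in;
            this.out = out;
        }
    }

    // Merges consecutive rows whose IN equals the OUT of the open bucket.
    // Input must be sorted on activity, then IN.
    static List<Row> merge(List<Row> sorted) {
        List<Row> result = new ArrayList<>();
        Row current = null; // the bucket currently being built
        for (Row row : sorted) {
            boolean continues = current != null
                    && current.activity.equals(row.activity)
                    && current.out.equals(row.in);
            if (continues) {
                // Input is sorted, so current.in is already the minimum IN;
                // only the maximum OUT needs updating.
                LocalDate maxOut = row.out.isAfter(current.out) ? row.out : current.out;
                current = new Row(current.activity, current.in, maxOut);
            } else {
                if (current != null) {
                    result.add(current); // emit the finished bucket
                }
                current = row; // start a new bucket
            }
        }
        if (current != null) {
            result.add(current); // emit the last bucket
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("A", LocalDate.of(2019, 1, 1), LocalDate.of(2019, 1, 2)));
        rows.add(new Row("A", LocalDate.of(2019, 1, 2), LocalDate.of(2019, 1, 3)));
        rows.add(new Row("A", LocalDate.of(2019, 1, 3), LocalDate.of(2019, 1, 4)));
        rows.add(new Row("A", LocalDate.of(2019, 1, 6), LocalDate.of(2019, 1, 7)));
        for (Row r : merge(rows)) {
            System.out.println(r.activity + " " + r.in + " " + r.out);
        }
    }
}
```

Because only one open bucket is kept in memory at a time, this needs a single pass over the sorted rows. In a tJavaFlex, the loop body would roughly correspond to the main part and the final emit to the end part, though the exact wiring depends on the job design.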
