TOS - Help with Data transformation logic

Four Stars

Hello everyone, I have to build an ETL job and I am stuck on a piece of transformation logic; maybe someone can help me. Simplifying a lot, my file is structured as below:

 

ACTIVITY  IN  OUT
A         1   2
A         2   3
A         3   4
A         6   7

The logic is that if there are 2 or more rows that are sequential (the OUT of one row matches the IN of the next), I have to aggregate them, keeping the minimum IN and the maximum OUT. So the result has to be:

ACTIVITY  IN  OUT
A         1   4
A         6   7

If there were just the first 3 rows, it would be easy with a tAggregateRow, but I don't know how to tell Talend that it has to aggregate only the sequential ones. (In reality, IN and OUT are dates.)

Employee

Re: TOS - Help with Data transformation logic

Basically, you should think of creating partition buckets: each run of sequential rows (one row's OUT equal to the next row's IN) goes into the same bucket. Once you have your partition buckets, you just do a min on IN and a max on OUT per bucket and you will have the answer.

 

Here is an example using integers, but you can use the same logic for dates. I have done it with a tJavaFlex and some code, as that is the fastest approach if your data is sorted on IN and looks just like the example below.
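For readers who cannot see the job screenshots, here is a minimal standalone Java sketch of the bucket idea. It is not the tJavaFlex code from the job itself; the class name `SequentialMerge`, the `Row` type, and the `merge` method are hypothetical names chosen for illustration. It assumes the rows are already sorted by ACTIVITY and then by IN, and that "sequential" means the previous OUT is exactly equal to the current IN.

```java
import java.util.ArrayList;
import java.util.List;

class SequentialMerge {

    // One row of the file: ACTIVITY, IN, OUT.
    // Integers are used here; dates could be compared the same way.
    static final class Row {
        final String activity;
        final int in;
        final int out;

        Row(String activity, int in, int out) {
            this.activity = activity;
            this.in = in;
            this.out = out;
        }

        @Override
        public String toString() {
            return activity + " " + in + " " + out;
        }
    }

    // Assumption: rows are sorted by ACTIVITY, then by IN.
    static List<Row> merge(List<Row> rows) {
        List<Row> result = new ArrayList<>();
        Row current = null; // the bucket currently being built
        for (Row r : rows) {
            boolean continuesBucket = current != null
                    && current.activity.equals(r.activity)
                    && current.out == r.in;
            if (continuesBucket) {
                // Same bucket: keep the first (minimum) IN, take the maximum OUT.
                current = new Row(current.activity, current.in, Math.max(current.out, r.out));
            } else {
                // Gap or new activity: close the previous bucket and open a new one.
                if (current != null) {
                    result.add(current);
                }
                current = r;
            }
        }
        if (current != null) {
            result.add(current); // close the last bucket
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> input = new ArrayList<>();
        input.add(new Row("A", 1, 2));
        input.add(new Row("A", 2, 3));
        input.add(new Row("A", 3, 4));
        input.add(new Row("A", 6, 7));
        for (Row r : merge(input)) {
            System.out.println(r); // prints "A 1 4" then "A 6 7"
        }
    }
}
```

In a tJavaFlex the same loop body would presumably run once per incoming row, with the "close the last bucket" step moved to the end code. That mapping is an assumption about how the job could be laid out, not something shown in the thread.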

 

 

[Screenshots: 1.png, 2.png, 3.png, 4.png]
