How to find the consecutive integers in a column?


04-19-2017
07:47 PM


Hi all,

I have a CSV file with a column called tokenID. The column is shown in pic1, and I want to transform it into what is shown in pic2: every integer in a run of consecutive integers should be replaced by the last integer of that run. Does anyone know how to do this transformation in Talend or in Java code? Thanks a lot.


1 ACCEPTED SOLUTION


04-23-2017
08:34 PM


My approach would be to add a column that contains a counter which only increments when the difference between the current value and the prior is more than 1. (You don't say how to handle adjacent identical values, if any, but I'll assume they can be grouped with the consecutive values.)

tMemorizeRows (https://www.talendbyexample.com/talend-tmemorizerows-component-reference.html) or tMap variables (https://www.rilhia.com/quicktips/quick-tip-compare-row-value-against-value-previous-row) can allow you to compare current values to prior values.

Done right, this should give each group of consecutive values its own key. A tAggregateRow component can then find the maximum value in each group, and joining that list of maximums back to the original set on the group key will pair each value with its group maximum.

I'm pecking this out on my phone, otherwise I'd do a proof of concept Job with screenshots and formula text, but the links above (all credit to their authors) should give you a good start. If you need help with any specifics, write back and I'll go into more detail when I'm back at my desk.
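For anyone who prefers plain Java, the counter / aggregate / join idea above could be sketched roughly as follows. This is only an illustration, not a Talend Job: the class name, the method names, and the sample values are made up here (the data from pic1 is not reproduced), and it assumes the IDs are already sorted in ascending order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; names are illustrative, not part of Talend.
class ConsecutiveGroups {

    // Step 1: a counter that only increments when the gap to the prior value is more than 1.
    // Adjacent identical values (gap 0) stay in the same group.
    static int[] groupKeys(int[] ids) {
        int[] keys = new int[ids.length];
        int counter = 0;
        for (int i = 0; i < ids.length; i++) {
            if (i > 0 && ids[i] - ids[i - 1] > 1) {
                counter++;
            }
            keys[i] = counter;
        }
        return keys;
    }

    // Steps 2 and 3: find the maximum per group key, then look it up for every row.
    static int[] replaceWithGroupMax(int[] ids) {
        int[] keys = groupKeys(ids);

        // Aggregate: group key -> maximum value in that group.
        Map<Integer, Integer> maxByKey = new HashMap<>();
        for (int i = 0; i < ids.length; i++) {
            maxByKey.merge(keys[i], ids[i], Math::max);
        }

        // Join: pair each row with its group maximum, keeping the original order.
        int[] result = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            result[i] = maxByKey.get(keys[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] tokenIds = {1, 2, 3, 7, 8, 10}; // made-up sample data
        System.out.println(Arrays.toString(replaceWithGroupMax(tokenIds)));
        // prints [3, 3, 3, 8, 8, 10]
    }
}
```

The three steps map loosely onto the components mentioned above: the counter corresponds to the prior-row comparison, the map to the aggregation, and the final loop to the join.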


5 REPLIES


04-21-2017
09:21 PM


Is the list of IDs always sorted or are you looking for consecutive numbers within an otherwise unordered list?

If sorted, does the original order matter?



04-22-2017
01:58 PM


Yes, the list of IDs is sorted, and I do not want to change the order. Thanks!



04-23-2017
08:37 PM


If the list order is easy to restore by sorting it again, then I'd reverse the sort. Instead of a counter to group consecutive values, I'd use tMap variables to keep track of the current maximum value, only changing it when the difference between the current and prior values is greater than 1. This should let you get the group maximum in one pass. Then you could sort the values back to their original order.
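In plain Java the same one-pass idea might look something like the sketch below. It assumes an ascending sorted list and walks the array from the last index to the first, which stands in for the reverse sort, so no sort back is needed afterwards. The class and method names are hypothetical and the sample values are made up.

```java
import java.util.Arrays;

// Hypothetical helper class; names are illustrative, not part of Talend.
class ReversePassMax {

    // Walk from the end: the first value seen in each run is that run's maximum.
    static int[] replaceWithGroupMax(int[] ids) {
        int[] result = new int[ids.length];
        int currentMax = 0; // overwritten on the first iteration
        for (int i = ids.length - 1; i >= 0; i--) {
            // The last row, or a gap of more than 1 to the row after it, starts a new group.
            boolean startsNewGroup = (i == ids.length - 1) || (ids[i + 1] - ids[i] > 1);
            if (startsNewGroup) {
                currentMax = ids[i];
            }
            result[i] = currentMax;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] tokenIds = {1, 2, 3, 7, 8, 10}; // made-up sample data
        System.out.println(Arrays.toString(replaceWithGroupMax(tokenIds)));
        // prints [3, 3, 3, 8, 8, 10]
    }
}
```

Compared with the counter approach, this keeps only one running variable and needs no aggregation or join step.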


04-24-2017
07:04 AM


Thank you for your great idea. I will try those components.