Converting Rows to columns when Number of records not constant for one ID and have millions of ID's

Highlighted
Five Stars

Converting Rows to columns when Number of records not constant for one ID and have millions of ID's

Hi,

I need to convert Rows to Columns, Number of records per ID is not constant and there are millions of ID's. I have tried TpivottoColumn delimited it will give the desired output, but if there are records more than 1 million how to do in different way. 

 

Convert Category column rows to columns and the data of type should be under the columns.

Input Data 

ID Category Type Market
1 a 100 India
1 b 200 India
1 c 300 India
2 a 400 India
2 b 500 India
2 c 600 India
2 d 700 India
2 e 800 India
3 a 900 India
3 b 1000 India
3 c 1100 India
3 d 1200 India

 

Output : 

ID a b c d e f g Market
1 100 200 300         India
2 400 500 600 700 800     India
3 900 1000 1100 1200       India

Accepted Solutions
Nine Stars

Re: Converting Rows to columns when Number of records not constant for one ID and have millions of ID's

Hi,

 

As the maximum number of columns is a manageable number, you can add in a tMap and use the centre Vars section and add in something like:

 

"a".equals(row1.Category)?row1.Type:null

 

Assign this to a Var, and then assign column A to Var.{a var name}.

 

Then sort the resulting dataset by the ID, and pass into tSortAggregatedRow, with the aggregation rule fort each a-g column as "First" and enable "ignore null values".

 

 

Regards David
Dont forget to give Kudos when an answer is helpful or mark the answer as the solution.

All Replies
Community Manager

Re: Converting Rows to columns when Number of records not constant for one ID and have millions of ID's

I may have a solution to this, but first I have a few questions.

 

1) How many columns is the max number of columns? If you do not know, you cannot do this unless you are happy to simply have columns with comma separated values.

2) Can there be more than 1 type of Market per ID and what happens if that is the case?

Five Stars

Re: Converting Rows to columns when Number of records not constant for one ID and have millions of ID's

Hi @rhall_2_0 ,

 

1. Max number of columns is 12.

2. There will be only one market type for one ID.

 

Thanks,

Viswa

Nine Stars

Re: Converting Rows to columns when Number of records not constant for one ID and have millions of ID's

Hi,

 

As the maximum number of columns is a manageable number, you can add in a tMap and use the centre Vars section and add in something like:

 

"a".equals(row1.Category)?row1.Type:null

 

Assign this to a Var, and then assign column A to Var.{a var name}.

 

Then sort the resulting dataset by the ID, and pass into tSortAggregatedRow, with the aggregation rule fort each a-g column as "First" and enable "ignore null values".

 

 

Regards David
Dont forget to give Kudos when an answer is helpful or mark the answer as the solution.

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 2

Part 2 of a series on Context Variables

Blog

Best Practices for Using Context Variables with Talend – Part 1

Learn how to do cool things with Context Variables

Blog

Migrate Data from one Database to another with one Job using the Dynamic Schema

Find out how to migrate from one database to another using the Dynamic schema

Blog