One Star

How do I 'unpivot' a table?

I'm trying to figure out how to implement this. I'm new to Talend and evaluating it for an upcoming project. The project has a bunch of legacy tables that will need to be unpivoted.
So, for example, I'd like to start with a table like this:
TrxId Name Amt1 Amt2 Amt3
1 Car 10 45 50
2 Truck 30 60 15
And end up with a table like this:
TrxId Name AmountType Amount
1 Car Amt1 10
1 Car Amt2 45
1 Car Amt3 50
2 Truck Amt1 30
2 Truck Amt2 60
2 Truck Amt3 15
Note that TrxId and Name are 'passed through' while the Amt1,2,3 columns are pivoted on a new column AmountType.
I'm not sure if I'm using the exact terminology here. Any help would be greatly appreciated.
Thanks.

  • Data Integration
16 REPLIES
One Star

Re: How do I 'unpivot' a table?

If this doesn't exist I'd help make it a component.
The pseudo looks like:
Let the user select the columns to be unpivoted.
Let the user specify a name for the . (AmountType in above example).
Let the user sepcify a name for the . (Amount in the above example).
For Each inputRow
{
For Each ColumnToUnpivot
{
Create a new OutputRow (schema: input - ColumnsToUnpivot + 2 new columns )
Pass values for columns not in ColumnsToUnpivot from inputRow through to outputRow.
Set the value of 'TypeColumn' to ColumnToUnpivot's name.
Set the value of 'ValueColumn' to ColumnToUnpivot's value.
}
}
Can someone help me get from A -> B on this one? Like I said I'm new to Talend but I don't imagine this would be that hard.
Community Manager

Re: How do I 'unpivot' a table?

Hello
Here comes a Java scenario, please see the screenshots.
Input file:

TrxId;Name;Amt1;Amt2;Amt3
1;Car;10;45;50
2;Truck;30;60;15
3;Shong;10;20;30

Result:
Starting job topic_4500 at 11:14 24/10/2008.
1|Car|Amt1|10
1|Car|Amt2|45
1|Car|Amt3|50
2|Truck|Amt1|30
2|Truck|Amt2|60
2|Truck|Amt3|15
3|Shong|Amt1|10
3|Shong|Amt2|20
3|Shong|Amt3|30
Job topic_4500 ended at 11:14 24/10/2008.

Feel free to post your question!
Best regards
shong
----------------------------------------------------------
Talend | Data Agility for Modern Business

Re: How do I 'unpivot' a table?

thanks shong,
i was looking for this.
i have a post here: 4492
in my case, i have multiple keys,; so i will try your technique and post back.
however, would it not be more future prove to have a component to handle this type of scenario.
i for one, have plenty of flat-files with similar structures.
One Star

Re: How do I 'unpivot' a table?

Thank you for your answer shong. Sorry it took me a few days to get back.
I have a few more questions:
-Can you explain the purpose of file_inputdelimited2?
-Is there anyway to add multiple rows to an output without using an iterate flow? I really want to make this a component and I believe this would be required. Is this just a limitation of TOS?
Thanks again.
Community Manager

Re: How do I 'unpivot' a table?

Hello
Can you explain the purpose of file_inputdelimited2?

Generally, if there is a output component in a job, it must has a input component, so inputdelimited2 is a input component here and it should only contain one row, because we just want to output the assigned value one time for each value of tForEach. You will see the difference and know why if you write two rows or more than in inputdelimited2.
Is there anyway to add multiple rows to an output without using an iterate flow? I really want to make this a component and I believe this would be required. Is this just a limitation of TOS?

No, there isn't a component can do that.
Best regards

shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: How do I 'unpivot' a table?

I think this will be very useful as a component. I use Informatica quite a bit and this will be almost like the "normalizer" component the Power Center has. Please do consider it. I can put the request in if no one has done it so far.
Thanks.
Sean
One Star

Re: How do I 'unpivot' a table?

Just to throw this out there: SSIS also has a similar transform (called 'unpivot'). Not being able to do this cleaning is making it a bit difficult to migrate to TOS.
TOS, in general, is great though.
Keep up the good work everyone,
Dane
One Star

Re: How do I 'unpivot' a table?

I think TOS is one of the best ETL tools. Obviously, Informatica has been around for a very long time and enterprises pay $100K just for connectors. I use both though I am more familiar with Informatica based on number of years I have been using it.
TOS has a lot of flexibility built in and wealth of connectors. But there are still few things that I feel we need to bring over from Informatica. One is the Normalizer (unpivot). Another is Router that can send rows to multiple targets with very little coding.
Sean
Employee

Re: How do I 'unpivot' a table?

1) We DO have a tNormalize. It's available in the Processing family.
BTW This component is actually generating many rows from one row.
Here is the result that can be easily generated with your input and this component
1|Car|10
1|Car|45
1|Car|50
2|Truck|30
2|Truck|60
2|Truck|15
So this component is probably missing a little option to process your data exactly like you want.
That's why, shong gave you a more complicate example.
2) When you have specific needs, you candevelop your own components.
This is noty very easy but this can be done if you take the time to think about it.
The main requirement to write a component, is to really understand Java. This is not the case of mose ETL users.
3) That's also why we have a bugtracker where you can ask for new features.
In this place you can directly interfer on Talend's roadmap with your requests.
Maybe in some cases, we don't do it immediately, but in most of the cases, we try to do our best and you see your new feature less than 4 month after in the main product.
If you have urgent need we also have solutions (but I'm sorry this will cost a few money)
Sean, I definetely believe that you will never have such a public place with any proprietary editor. Even when you pay $100K just for the connectors...
Regards,
One Star

Re: How do I 'unpivot' a table?

mhirt,
I hope you did not misunderstand my note as a criticism. I use Informatica because the company I consult with (a Fortune 500 shop), that is the only official tool. For my other clients, I use ToS which I find is very flexible. I think Informatica is a white elephant. In fact asking for $100K for a connector is a joke (I think it was Lotus Dominos connector). Here we can easily write such a connector. So you guys rock.
I did miss the tNormalize component as I was not dealing with such an issue at this time. That is my oversight and given shong's explanation, I thought this was not possible to do. I'll definitely try it out.
I have also requested features using bug tracker and actually gotten them added very fast (thanks to plegall).
I am trying to read up and get a better understanding of Java as that will give me more flexibility with this tool and enable me to write new connectors/transforms. My limited expertise with Java is something of a limiting factor.
Regards,
Sean
One Star

Re: How do I 'unpivot' a table?

mhirt,
As with psm2000, by no means is my suggestion a criticism. I think TOS is great and the public exchange is fantastic. I agree, I won't be able to get this flexibility and openness with a proprietary system. However, for this project a proprietary system, even with its limitations, might be the best bet.
tNormalize is missing a lot more than a little option to format the data. It doesn't bring the column names with it, which is a key part of an 'unpiviot' component. From where I sit, I'm not sure if this is possible with the current core. The work around provided, is just that, a work around. This project requires a lot of these transformations and that's certainly not an easy/clean way to do it.
My original posts were asking 2 things: is there an existing way to do this? If not, how do I go about building one? I'm a developer and I'd be interested in contributing. I'm really just looking for an 'in theory you can do this' go-ahead before I spend time trying to learn the inner workings of the framework and build this myself.
So, with your understanding, which is far greater than mine, do you think I can use tNormalize as a template for an 'unpivot' component? Or is something going to stop me from finishing the job? Is it possible to bring the column names with it?
From what I understand now, this feature can't be built using the 'component' extensibility point. If this is not the case, could someone let me know, in pseudo, what steps I need to take?
Thanks,
Dane
Employee

Re: How do I 'unpivot' a table?

Hello Sean and Dane
Don't misunderstand me . I like when the product is criticized. I strongly believe that this is is the goal of an Open Source product.
tNormalize is missing a lot more than a little option to format the data. It doesn't bring the column names with it, which is a key part of an 'unpiviot' component. From where I sit, I'm not sure if this is possible with the current core.

This can be done with the existing core and this is not really difficult to implement.
So, with your understanding, which is far greater than mine, do you think I can use tNormalize as a template for an 'unpivot' component? Or is something going to stop me from finishing the job? Is it possible to bring the column names with it?

Don't believe that I don't want you to have the pleasure to develop a component but I have just put a tUnpivot base on the tNormalize in the ecosystem. It's not very difficult but sincerely, it's maybe a little difficult as a first component.
I have only added an option to tNormalize in Advanced Settings.
Please tell me if it works like you want.
Regards,
One Star

Re: How do I 'unpivot' a table?

Thank you for this post. I was able to bring across my Key Column labels using this example, and avoided the tUnPivot and other pain I had there.
I modified this example as follows:
1. added a duplicate Input Delimited that skipped zero header rows, and has a limit of just 1. In other words, it only reads the header row (to get the column names)
2. added a tJavaRow from #1, doing a global PUT of the column headers. Just like the global PUT of the the Amt values in this example.
3. linked #1 ITERATE to the given example, hooking it into the tFileInputDelimited starting point.
Now I have the column headers flowing through alongside the "amount" values from the cross tab.
4. modify the tJavaRow to PUT a string I will parse later in tMap. For me, this is: globalMap.put("Amt1",globalMap.get("ColHeader1") + ";" + row1.amt1); etc through to as many as you need.
The ForEach is unchanged, essentially now providing a parse-able string (semi-colon delimited for me). Now modify the tMap to parse out the combo-value.
5. modify tMap. For ease of readability, I added a variable to get the value from the ForEach: Var.theValue = (String) globalMap.get(((String) globalMap.get("tForeach_1_CURRENT_VALUE"))). Now modify the mapping to parse out the "left" and "right" pieces of the string using the semicolon delimiter. Left side = "StringHandling.LEFT(Var.theValue,StringHandling.INDEX(Var.theValue,";")) ", right side = "StringHandling.RIGHT(Var.theValue,StringHandling.LEN(Var.theValue) - 1 - StringHandling.INDEX(Var.theValue,";")) ". I stored the "left" side into one of the output columns, and stored the "right" side into the other output column.
There you go. The column header labels are 'repeated' into a column. You can modify this as you need, to have as many extra columns headers you like. Just pull them out as per steps 1-3, and then concat them into a parse-able string as per step 4. Pull them back out again as per step 5.
I hope this helps.
--Eric
One Star

Re: How do I 'unpivot' a table?

Hi,
I have a similar requirement but with a little twist. I have all my metrics coming in one excel sheet in 3 different tables, see the first half of the image attached.
I wish to bring the Months and Years down, and move my 3 Metrics like DSO, etc in the columns, see second half of the image attached.
Kindly suggest how could i achieve the same.
Also the columns in the source i.e months and years would change once in a year when a new year's data would be added, and the old year's data would be removed keeping only 5 years of data at a time.
Your help would be really appreciated.

Thanks in advance,
Diwakar
One Star

Re: How do I 'unpivot' a table?

Your screenprint seems a bit dodgy: totals don't add up; the numbers don't align between the input and output; you refer to 5 years' data but there are 5 months...
I've assumed that's just because it's not a true example. The following is designed for 5 columns of values and 3 metrics but can be extended for more. There are other assumptions about sorting etc, which can be tweaked as needed. I've avoided tUnpivotRow as I prefer to only use "official" components, even though that might make the job design simpler.
tJavaRow_3
output_row.Metric = input_row.Metric;
output_row.Country = input_row.Country;
output_row.Value1 = input_row.Value1;
output_row.Value2 = input_row.Value2;
output_row.Value3 = input_row.Value3;
output_row.Value4 = input_row.Value4;
output_row.Value5 = input_row.Value5;
if (tos_count_tFileInputExcel_1<3)
globalMap.put(tos_count_tFileInputExcel_1==1?"Years":"Months",new String[] {input_row.Value1,input_row.Value2,input_row.Value3,input_row.Value4,input_row.Value5});

tFilterRow: Country Not Equal To "Country" And Country Not Equal To "Total"
tJavaRow_4
String[] Years  = (String[])globalMap.get("Years");
String[] Months = (String[])globalMap.get("Months");
output_row.Metric = input_row.Metric;
output_row.State = input_row.Country;
output_row.Values = Years+","+Months+","+input_row.Value1+";"
+Years+","+Months+","+input_row.Value2+";"
+Years+","+Months+","+input_row.Value3+";"
+Years+","+Months+","+input_row.Value4+";"
+Years+","+Months+","+input_row.Value5+";";

tNormalize_1: Values separated by ";"
tExtractDelimitedFields_1: Values separated by ","
One Star

Re: How do I 'unpivot' a table?

Hello
Here comes a Java scenario, please see the screenshots.
Input file:

TrxId;Name;Amt1;Amt2;Amt3
1;Car;10;45;50
2;Truck;30;60;15
3;Shong;10;20;30

Result:
Starting job topic_4500 at 11:14 24/10/2008.
1|Car|Amt1|10
1|Car|Amt2|45
1|Car|Amt3|50
2|Truck|Amt1|30
2|Truck|Amt2|60
2|Truck|Amt3|15
3|Shong|Amt1|10
3|Shong|Amt2|20
3|Shong|Amt3|30
Job topic_4500 ended at 11:14 24/10/2008.

Feel free to post your question!
Best regards
        shong

Hi shong,
can you explain your job?
Got the same problem:
ID;Column1,Column2,Column3;...;Column99
1;25;28;54;...;89
2;85;65;89;...;78


expected output:
ID;Column1
1;25
1;28
1;54
...
1;89
2;85
2;65
2;89
...
2;78