Duplicate column names - xlsx

One Star

Duplicate column names - xlsx

Hello,

I'm using the excellent tFileExcelSheetInput component but I'm having an issue. It is not with the component itself but with the incoming data.

tFileExcelSheetInput allows me to 'search' for column names across the top row of each sheet, and then return the data under that column heading into my flow (which is great).

My issue is that the incoming spreadsheet can contain duplicate column names. tFileExcelSheetInput appears to be taking the right hand most column with that heading and returning that data. As I said, it isn't really the component I have an issue with it is the duplicate column names.. The duplicated column names have different data below.

So I can have column headings something like:

Name
Tel
Mobile
House Number
Name
Address1
Address2
etc..


The column positions can change from sheet to sheet - hence why I am using tFileExcelSheetInput.
As you can see above, the Column name 'Name' is repeated.
tFileExcelSheetInput will return the second 'Name' data (applies to a house name here and is hardly ever populated) rather than the persons 'Name' data.

I am getting around this by reading the top row, pivoting it, adding a numeric counter (to maintain its position) then doing a tUniqRow to identify duplicates, then renaming the duplicate with a 1,2,3etc re-forming the column, order it on the numeric counter to get the column positions correct, and then unpivot back to a spreadsheet, where I then append the data from the original worksheet.

In this example I end up with column names:

Name
Tel
Mobile
House Number
Name1
Address1
Address2
etc..

This is a very roundabout process.

Does anyone know of a better way of dealing with the duplicate column names?

many thanks

Ian

Community Manager

Re: Duplicate column names - xlsx

Hi Ian
I think you have to identify the uniqueness of column whatever way you use with community version. You can take a look at the dynamic schema feature in Talend Subscription product.

Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 1

Learn how to do cool things with Context Variables

Blog

Migrate Data from one Database to another with one Job using the Dynamic Schema

Find out how to migrate from one database to another using the Dynamic schema

Blog

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog