tFileInputExcel dynamic schema

One Star

tFileInputExcel dynamic schema

I am presented with a bit of a challenge and I am unsure if Talend has the ability to do what I want.
My problem is that I need to accept an excel file, whose columns will change, and import that data into a database.
The excel file will have all of the column names listed in the header row.
The format of the excel file will be as follows:
key_id, company_name, company_type, contact1_name, contact1_address, contact1_phone, contact2_name, contact2_address, contact2_phone
Each row will relate to two database tables. "companies" and "contacts" The "key_id" will be used as the surrogate key on both tables. The column names in the database match the column names in the excel file, except for the prefix. So, "company_name" would map to the "name" column on the "companies" table. Similarly, the "contact1_name" excel column will map to the "name" column on the "contacts" table.
One of my challenges is being able to accept 1:N contacts. The above example would have 2 contacts, but if they have 3 contacts, there would be another set of the same columns with a different prefix.
e.g.
key_id, company_name, company_type, contact1_name, contact1_address, contact1_phone, contact2_name, contact2_address, contact2_phone, contact3_name, contact3_address, contact3_phone

This would be the same pattern for 4,5,6...N number of contacts. Each set of "contact#_*" excel columns would result in their own row in the "contacts" table in the database.
If this would be possible in Talend, that would be a huge hurdle for me.
Unfortunately, that's not my only problem. Along with that uncertainty, another feature I need to work out is how to account for our customers entering in more or less columns for the company information.
e.g.
key_id, company_name, company_type, company_phone, company_email, company_fax, contact1_name, contact1_address, contact1_phone, contact2_name, contact2_address, contact2_phone

The "companies" table would have all of the possible columns that may be entered in the excel spreadsheet. Only the columns that are present will get inserted into the database. So, the first example I showed would create a row in the "companies" table with "name" and "type" populated, but nulls for the "phone", "email", and "fax" columns, while this example would have all of the fields populated.

I'm not sure how to accomplish either of these tasks using the tFileExcelInput component. When creating a "File Excel" metadata component, on "Step 3 of 4" there is a checkbox to "Set heading row as column names." I am able to use this feature to automatically set the schema for my tFileInputExcel component, but this only allows me to save one possible set of columns. It would be awesome if there was a way to use this function to set the schema for the excel file being inputted on the fly. Ideally, I could set up the possible list of columns to be accepted beforehand, and when the job is run, have Talend determine which columns are present out of the available set, and insert those columns into the related database tables.
Any advice would be much appreciated!
Moderator

Re: tFileInputExcel dynamic schema

Hi,
I think you requirement refers to dynamics column, see the related forum Forum 28276.
As we known that dynamic schema feature is only available in Talend Enterprise Subscription Version, it allows you to design schema with an unknown column structure (unknown name and number of columns), however,only tFileInputDelimited and tFileInputPositional component support for it.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Seventeen Stars

Re: tFileInputExcel dynamic schema

Hi, I plan this feature for my component tFileExcelSheetInput next week like i did that already for the component tFileInputTextFlat. The schema columns will reorganized by the position of the columns in the header. In my current project I need this feature also for Excel files. Both components are available in Talend Exchange.
One Star

Re: tFileInputExcel dynamic schema

Sabrina: I have the Talend Enterprise Data Integration Professional edition, so I do have access to the dynamic schema features. However, like you said, I couldn't find a way to use it with tFileInputExcel, only with only tFileInputDelimited and tFileInputPositional. I'll give the post you linked a look over.
jlolling: Your custom components look very promising for what I am trying to do. I look forward to seeing the next version of your tFileExcelSheetInput component that contains the column reorganizing feature. I'll give your current versions a test drive to gain familiarity with them while you finish up your additions.
Thank you both for the responses!
Moderator

Re: tFileInputExcel dynamic schema

Hi,
Talend don't support the dynamic schema for tFileInputExcel component yet, someone may ask which components supports the dynamic schema? You can find the answer in this file: <Talend Studio install dir>/plugins/org.talend.core.tis_x.x.x.rxxxxx.jar.
unzip this jar to a directory, and you will see a file call supportDynamic.txt under resources folder, the file lists all the components support the dynamic schema in the current version.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Seventeen Stars

Re: tFileInputExcel dynamic schema

The version of tFileExcelSheetInput (v 3.0) with capabilities to adjust the column position by a header row is available in Talend Exchange.
One Star

Re: tFileInputExcel dynamic schema

Thank you for the update jlolling, I'll check out your updated component Smiley Happy
One Star

Re: tFileInputExcel dynamic schema

i have two excel input data and have to implement scd(slowly changing dimension),that if i update one of the column value say loc from london to neyyork in my original sheet,it must get reflected to my output excel sheet.i dont want to use predefined scd components such tmysqlscd and so on.
will u please help me how to resolve this issue??
Seventeen Stars

Re: tFileInputExcel dynamic schema

Actually there is no way with build-in components to do this job. If I had to solve this I would indeed use a database table to implement the SCD logic and create the content of the output Excel file from scratch new with the content of the database table.
The problem without a database is to implement the find and update method for a data set in excel. For sure this is possible but needs individual java programming. At the moment I cannot see enough other use cases (other then your request) for it to start a project solving this task.
One Star

Re: tFileInputExcel dynamic schema

Actually there is no way with build-in components to do this job. If I had to solve this I would indeed use a database table to implement the SCD logic and create the content of the output Excel file from scratch new with the content of the database table.
The problem without a database is to implement the find and update method for a data set in excel. For sure this is possible but needs individual java programming. At the moment I cannot see enough other use cases (other then your request) for it to start a project solving this task.

Hi Jlolling,
Thanks for the dynamic schema update.
In an excel file comparison to find mismatches i use tmap component.If i use your excel component with dynamic schema Please tell whether i will able to use the same tmap componenet for any excel comparison.
Eg:- I create a job for excel comparison with tmap and i have done all config.
So whenever i change the excel to be compared will the tmap componenet automatically configures based on the columns from excel.?
Thanks,
Raakesh R
Seventeen Stars

Re: tFileInputExcel dynamic schema

Hi Raakesh,
it is a bit hard to understand what do you mean?
What do you mean with "... any excel comparison" ?
One Star

Re: tFileInputExcel dynamic schema

Hi jlolling,
could you add the regexp option to the tFileExcelSheetInput like you did for tFileInputTextFlat component?
Those components are great.
Thanks
Sebastien
Community Manager

Re: tFileInputExcel dynamic schema

Sebastien, it is better to add a comment to the component page on Talend Exchange.
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
Seventeen Stars

Re: tFileInputExcel dynamic schema

I am currently developing that because of the request of a couple of developers. It is not a big deal because the functionality is already developed for another component.
I will test it a while and this week (probably tomorrow) I will publish a new release with this feature.
One Star

Re: tFileInputExcel dynamic schema

I want to know if talend can do this and if it can how to do it?
I have an exel sheet having columns First name, last name, street, city, country, state and phone no. I want to insert this data, (based on id in a database table related to the name fields in exel sheet) into another database table. I want to know if we can write a query on a database table to find the ids related to the names in exel sheet and then inserting the resulting records from exel sheet to database table.
Moderator

Re: tFileInputExcel dynamic schema

Hi,
I want to insert this data, (based on id in a database table related to the name fields in exel sheet) into another database table. I want to know if we can write a query on a database table to find the ids related to the names in exel sheet and then inserting the resulting records from exel sheet to database table

Could you please set an example for your requirement?
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: tFileInputExcel dynamic schema

Example
Suppose I already have a table in my database which contains all columns as there are in the exel sheet except one more column named id. Now when I import this sheet to a table in another database, I would like to import only those records that have names related to the ids in first table. 
That is, I want to check at run time whether the names in exel sheet are present in a predefined database table, and if it is, then I want the related Id of that name to be inserted into the id field of the output database table.