Convert Excel file to CSV file whose metadata varies each month

Four Stars

Convert Excel file to CSV file whose metadata varies each month

Hello Team,

 

I have a requirement like every month we get excel file and we need convert this into csv file for further process. Source and Target file structure like below.

Month 1:
SourceFile.xlsx
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE1,CODE2,CODE3,CODE4

TargetFile.csv
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE1,CODE2,CODE3,CODE4

Month 2:
SourceFile.xlsx
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE2,CODE4,CODE5,CODE6,CODE7,CODE10

TargetFile.csv
CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE2,CODE4,CODE5,CODE6,CODE7,CODE10

Note: We would get every time more than 100 codes in the excel file. Also actual header starts from B8 cell in the excel.

But we shouldn't change the schema metadata  for every monthly run.

Once we loaded to csv file, then it should load into DB table and It is like below 

CUSTID,CUSTNAME,DOB,GENDER,StepFlag,PreviousStepFlag,CODE

For each code 1 record to be inserted, for 4 codes 4 records to be inserted, 6 codes 6 records to be inserted, n codes n records to be inserted.

Please help us to handle this scenario.

 

Regards,

NarsimhaReddy

 

Six Stars

Re: Convert Excel file to CSV file whose metadata varies each month

You can use Dynamic schema feature of the Talend Enterprise Edition
Eleven Stars

Re: Convert Excel file to CSV file whose metadata varies each month

@MNREDDY 

 

There is no single component to do so  but you could program it , thanks to tFileexcelInput , Start and end column Properties .

 

1) Read 1st 7 key column. Add sequence number

SEQNO CUSTID CUSTNAME DOB GENDER StepFlag PreviousStepFlag

 

2) You could get number of Column in excel using below command

new org.apache.poi.xssf.usermodel.XSSFWorkbook((String)globalMap.get("filename")).getSheet("SheetName").getRow(rownumber).getLastCellNum()

3) Now as you know column count , you could read columns in loop with set of 50 (thanks to tFileexcelInput , Start and last column Properties ).

e.g. if number of code columns is 60 , read 1st 50 code columns( startcolumn 8 ,endColumn 58) and next iteration 10 ( startcolumn 58 , endColumn 108)

4)

4.1) while reading , again add SEQNO

SEQNO CODE1 CODE2 CODE3 CODE4 ...

4.2) normalize the result  ( tSplitRow)

SEQNO CODE1

SEQNO CODE2

SEQNO CODE3

4.3) you can use tMap to do inner join(all match) between step1 and step 4.2 result on SEQNO

SEQNO CUSTID CUSTNAME DOB GENDER StepFlag PreviousStepFlag CODE1value

SEQNO CUSTID CUSTNAME DOB GENDER StepFlag PreviousStepFlag CODE2value

 

Above result you can update on table.

Hope this will be helpful to start.Let me know if you face problem.

 

 

 

Regards
Abhishek KUMAR
Four Stars

Re: Convert Excel file to CSV file whose metadata varies each month

@dipanjan93

We have tried this option, but  it is unable to recognize all the columns as we have 90 columns in the sheet. It is reading 12 columns only.

 

Regards,

NarsimhaReddy

2019 GARTNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

Best Practices for Using Context Variables with Talend – Part 1

Learn how to do cool things with Context Variables

Blog

Migrate Data from one Database to another with one Job using the Dynamic Schema

Find out how to migrate from one database to another using the Dynamic schema

Blog

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog