Load Multiple CSV files with different Structure

One Star

Load Multiple CSV files with different Structure

Hi all,
We have a project requirement where we are required to load 100+ CSV files into the oracle database. All 100 CSV files have different structure. Before loading the files into the database we need to perform basic validations like, 
- for Date column check if date is in correct format
- trim the white spaces for strings
- check if numeric data types columns actually consists of numbers
Now I have 2 queries:
- I am thinking of storing the name and schema of the CSV files in database. Read that meta data information of the file and then depending upon the meta data, load the data into the oracle tables. This way tomorrow is 100+ new CSV files are added for loading, I do not have to create a new job everytime.
Please suggest how it can be achieved. 
Moderator

Re: Load Multiple CSV files with different Structure

Hi,
Have you already checke talend dynamic schema feature which allows you to design schema with an unknown column structure (unknown name and number of columns).
Here is the article about:TalendHelpCenter:How to process changing data structure?
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
One Star

Re: Load Multiple CSV files with different Structure

I am aware of it but this does not allow me to perform any data quality checks on the Data. This will be a simple load from file to the database or vice versa.
Thanks,
Sankalp
Six Stars

Load CSV file with different Structure to multiple Table

Hello,

         How can I import data from excel or csv file and dump data in two tables  State Name, State Code, City Name, City Code in State_City table and Sr No, Name, Address, Department, Designation, Date of Joining in Emp table? Here is the attached file.

Employee

Re: Load CSV file with different Structure to multiple Table

Hi Dhara,

  

        You will require two logical operations in this context.

 

      After reading the file (while reading, specify limit value as 5 so that you will read only header part), you need to push the data to a tjavarow or tmap. Since State Name is in column A and value is in column B, you can do a comparison operation. ie, if input_row.columnA is equal to "State Name:" then output_row.statname=input_row.columnB. Similar operation can help you to parse other header details.

 

     For reading the detailed content, read the data using texcelinput component again but this time specify the header as 5. In this way, you will be skipping the header components. 

 

     Hope this can resolve your issue. If it is working, please mark as accepted solution so that it will help Talend community.

 

Warm regards,

Nikhil Thampi

    


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Tags (1)
Six Stars

Re: Load CSV file with different Structure to multiple Table

How can I write if else condition in tJavaRow component?

if (input_row.A=='State Name:')
{
output_row.statename = input_row.B;
System.out.println(output_row.statename);
}
else if (input_row.A=='City Name:')
{
output_row.cityname = input_row.B;
System.out.println(output_row.cityname);
}

I wrote this code but getting error 'Invalid Character constant'.

See the screenshot

2.PNG

Employee

Re: Load CSV file with different Structure to multiple Table

Hi,

 

You have assigned the value as constant and that is the reason for error. You will have to try like below in tjavarow.

 

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}

 

If you think the suggestion has worked, please mark the topic as acceptable solution.

 

Warm Regards,

 

Nikhil Thampi


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load CSV file with different Structure to multiple Table

I am getting duplicate rows when I dump data into Oracle table and I am not getting state code value and city code value.

This is my code.

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}

Six Stars

Load Excel file with different Structure

I am getting duplicate rows when I dump data into Oracle table and I am not getting state code value and city code value.

This is my code.

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}

Can you help me to resolve this?

 

 

 


E1.PNG

Output:

StateName  CityName StateCode CityCode

Gujarat        null             123           null

Gujarat        abc            123           234

Gujarat        abc            123           234

 

Thanks,

Employee

Re: Load Excel file with different Structure

Hi Dhara3010,

 

        You will have to denormalize the dataset to get a single row of data. I have used the same excel sheet provided by you in your previous chat and I got the result set in single line. I have attached the sample job for your reference as zip file. Please make necessary changes according to your need.

 

         image.png

And the result I got is as shown below (you may have to reformat the date column according to your choice).

 

 image.png

The code used in tjavarow is as shown below.

 

//Code generated according to input schema and output schema
output_row.merge_data="Yes";
output_row.statename = "";
output_row.cityname = "";
output_row.statecode = "";
output_row.citycode = "";
output_row.fromdate = "";
output_row.todate = "";

 

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("From Date:"))
{
output_row.fromdate = input_row.B;
}

if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("To Date:"))
{
output_row.todate = input_row.D;
}

 

 

Warm Regards,

 

Nikhil Thampi

 


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Hi Nikhil Thampi,

Thanks a lot for this solution. It really solved my problem. It is the best solution.

 

Thanks,

Dhara

Employee

Re: Load Excel file with different Structure

Hi,

 

     Happy to help :-)

 

Warm Regards,

 

Nikhil Thampi


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Hi,

    What if State Name is State Name1(Column name mismatch) in file or there is no data in file or value is not found or format is mismatch or full file is blank or record not found or improper file. How can I handle such file and data? i.e. Error handling

 

Thanks,

Dhara

Employee

Re: Load Excel file with different Structure

Hi,

 

     You will have to invoke the subsequent error flows based on the different business rules you are planning to create.

 

For example, you can check the number of records from incoming excel data file by checking the value of pseudo variable ((Integer)globalMap.get("tFileInputExcel_1_NB_LINE")). If the value of this variable is zero, then you can go for the error flow. Similarly you will have to define each error flow in your mind and catch the errors using Talend error handling components.

 

Warm Regards,

 

Nikhil Thampi


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Hi,

My file template is given below

1.PNG

Now, If I get some other file format like this below

2.PNG

then what condition should I check and which palette should I use to move this file in Error Folder.

And if my file is half run i.e from 5 rows, only 2 are inserted then rows should be rollback and file should be moved to Error Folder. I am dumping data from excel file to sql server.

Can anyone suggest me the solution as soon as possible?

 

Thanks,

Dhara

Six Stars

Re: Load Excel file with different Structure

Hi,

What condition should I mention to move file to error folder if I don't get following all columns mentioned in the condition,

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("From Date:"))
{
output_row.fromdate = input_row.B;
}

if (!Relational.ISNULL(input_row.C) && input_row.C.equals("State Code:"))
{
output_row.statecode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("City Code:"))
{
output_row.citycode = input_row.D;
}
else if (!Relational.ISNULL(input_row.C) && input_row.C.equals("To Date:"))
{
output_row.todate = input_row.D;
}

Six Stars

Re: Load Excel file with different Structure

Can anyone help me for this? I need the solution by today as I need to complete the requirement.

Employee

Re: Load Excel file with different Structure

Hi Dhara,

 

   You can add a flag column to verify whether the file is good or bad.

 

context.file_flag="Y" (at starting point for each file read in a tjava)

 

if (!Relational.ISNULL(input_row.A) && input_row.A.equals("State Name:"))
{
output_row.statename = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("City Name:"))
{
output_row.cityname = input_row.B;
}
else if (!Relational.ISNULL(input_row.A) && input_row.A.equals("From Date:"))
{
output_row.fromdate = input_row.B;
}

 

if ( Relational.ISNULL(output_row.statename ) && Relational.ISNULL(output_row.statename ) ...... add other columns according to your need )
{
context.file_flag="N";
}

 

Add a run if condition and verify whether the file_flag value is Y or N. Based on that value, you can either mark it as good file or bad file.

 

Warm Regards,

 

Nikhil Thampi

 


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Thanks Nikhil Thampi for the solution

Six Stars

Re: Load Excel file with different Structure

Hi Nikhil,

4.PNG

I am not able to dump this data  in my table. I am getting error as shown below

5.PNG

Six Stars

Re: Load Excel file with different Structure

Hi Nikhil,

My requirement is to dump data from excel file to two table(the file which I sent you before again attaching the same file). The upper part

State Name:GujaratState Code:123From Date:7/19/2018
City Name:hbhjbjhCity Code:567To Date:20-07-2018

in one table 

Sr NoNameAddressDepartmentDesignationDate of Joining

and this in other table. When my Job is executed the file should be moved to InProcess folder dump data in both table and if successful then move file in Process folder else in Error folder.
I even need to handle the datatype as in excel it is by default string. I even have an amount column. Even I need to handle all of the error conditions which may occur like out of 5 only 2 inserted or if the data is inserted in one table and not in other table. I do need to check all the validations part even for date column, Amount only number can be allowed otherwise move file in error folder. Even I need to maintain the log in third table like filename, status(processed or error) and reason for error like cause for error.

Can you suggest me the solution for this? as I really struggling for this task from 2 weeks maybe. Actually I am struggling with the error part to handle.

Employee

Re: Load Excel file with different Structure

Hi Dhara,

 

     Apologies for the delay as I am currently traveling and didnt get chance to look my community messages. The first step we have to do is to parse the file and split the records (which I have already specified in my previous post). You can use a tjavarow to make this happen.

 

      You should do all the verification for null check in second tjavarow or tmap (after denomalization component) so that it will not become one big java custom code. You can use standard java if else conditions and data type check in this second tjavarow/tmap. Based on the condition validation, you can use run if to move the data to your target/reject area.

 

Warm Regards,

 

Nikhil Thampi


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Hi,

5.PNG

Can you give me solution for this date error?

Employee

Re: Load Excel file with different Structure

Hi dhara,

 

      You will have to change/parse the date value from String to date using Talend/Java date conversion functions. I would suggest you to take Talend DI Basics and Advanced Training program as it will help you to overcome these hurdles.

 

Warm Regards,

 

Nikhil Thampi


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Thanks I solved my problem of date parse

Six Stars

Re: Load Excel file with different Structure

Hi,

How to get rejected records count of tconverttype?

 

For eg: There is reject count for FilterRow component "((Integer)globalMap.get("tFilterRow_1_NB_LINE_REJECT")).equals(0)".

Is there any such code for tconverttype reject rows count?

 

Thanks,

Dhara

Employee

Re: Load Excel file with different Structure

Hi Dhara,

 

       Please start a new thread for each issue as it will help in understanding and replying to that specific context. Else the current thread will be so huge that it will become difficult to manage.

 

Warm Regards,

 

Nikhil Thampi


Warm Regards,
Nikhil Thampi
Please appreciate our members by giving Kudos for spending their time for your query. If your query is answered, please mark the topic as resolved :-)
Six Stars

Re: Load Excel file with different Structure

Hi,

File format error is not handled by the solution you gave.

I tried this as shown in screenshot, but didn't work well

F1.PNG

Can you suggest some other way to handle this file and print message "Invalid File Format"?

 

Thanks,

Dhara

Moderator

Re: Load Excel file with different Structure

Hello,

tConvertType provides a variable like this: ((Integer)globalMap.get("tConvertType_1_NB_LINE")).

It should be the number of input rows.

Best regards

Sabrina

 

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.

Tutorial

Introduction to Talend Open Studio for Data Integration.

Definitive Guide to Data Integration

Practical steps to developing your data integration strategy.

Definitive Guide to Data Quality

Create systems and workflow to manage clean data ingestion and data transformation.