Check number of fields for each row in CSV file

Four Stars

Check number of fields for each row in CSV file

Hello all,

 

 I want to perform some quality control on my CSV files. One of the controls is to check the number of fields for each row.

 

Here is one example of a CSV file :

Adresse_1,Adresse_2,CP,Ville
23 rue de la pierre,,75001,Paris
25 rue de la pierre,,75001,Paris
27, rue de la pierre,,75001,Paris
29 rue de la pierre,,75001,Paris
31 rue de la pierre,,75001,Paris

Field delimiter is the Comma. And there is supposed to be 4 fields.

The line with 27 has an extra comma which creates a mess.

 

In the TOS, I defined that the Headers must be used as column names.

But the result I have is (for the resulting columns) :

Adresse_1    Adresse_2     CP     Ville    Column 4

This is not what I want.

 

What I want to have is a process that would remove the lines the the incorrect number of separators from the input file, and create 2 output files :

 - One with all the correct lines

 - One with the wrong lines (to correct the errors)

 

Is it possible ?

 

Thanks for your help.

Four Stars

Re: Check number of fields for each row in CSV file

Hi.

 

What do you need to filter the lines that have more than 4 separators? that is, in this case, do you want to have two files with the lines.

 

23 rue de la pierre,,75001,Paris
25 rue de la pierre,,75001,Paris
29 rue de la pierre,,75001,Paris
31 rue de la pierre,,75001,Paris

 On the other file:

27, rue de la pierre,,75001,Paris

Because you have the control of separator count's and you can filter by count.

 

This is what you need?

 

Regards!

Six Stars

Re: Check number of fields for each row in CSV file

Hello..

Apply filter on tmap 

corrected lines : StringHandling.COUNT(row1.ADRESSE_1,",") !=1

Wrong lines : StringHandling.COUNT(row1.ADRESSE_1,",")==1

 

1.PNG2.PNG

 

Hope this solves your purpose...

Regards