One Star

How to count the columns of a csv file?

Hi,
I have a problem: I need to count the columns of a CSV file because, when I check it with the tSchemaCompliance component, I get no error.
I found the origin of the problem: some "\t" characters were inserted in some fields before the data were exported out of the DB. Now I must re-import the data (after an update made by another company).
tSchemaCompliance does not check whether some columns are missing in some rows. Is there an option to check that?
I tried with a Metadata file and with an integrated file: same result, no error, because column A ends up in column B's place and A and B have the same type (long), or because both columns are empty (or one of the two is).
How can I find the rows where column B does not exist (or where I get 35 columns instead of 36)?
The problem can also be the opposite: I get too many columns, because of the \t inserted in some fields. I cannot check whether people added a \t in a field (by copy/paste from another web application, for example); I just receive the CSV file.
Sorry for my bad English,
Thanks for your help,
6 REPLIES
Seven Stars

Re: How to count the columns of a csv file?

The tFileInputDelimited component has an Advanced setting "Check each row structure against schema" that you can combine with a rejects flow to identify records with extra or missing fields.
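If you also want to see the field counts before the file reaches the job, a small standalone script can do it. The following is only a minimal sketch in Python, run outside Talend; the function names and sample data are made up for illustration, and it assumes fields are never quoted, so every tab in a line counts as a separator (which is exactly how a stray "\t" shifts the columns).

```python
from collections import Counter


def find_bad_rows(lines, expected, delimiter="\t"):
    """Return (line_number, field_count) for each line whose field count
    differs from `expected`. Line numbers start at 1."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Naive split: assumes no quoted fields containing the delimiter.
        count = len(line.rstrip("\r\n").split(delimiter))
        if count != expected:
            bad.append((number, count))
    return bad


def count_histogram(lines, delimiter="\t"):
    """Return how many lines have each field count."""
    return Counter(len(line.rstrip("\r\n").split(delimiter)) for line in lines)


# Hypothetical sample: 3 fields expected; line 2 is short, line 3 has a stray tab.
sample = ["a\tb\tc\n", "a\tb\n", "a\tx\ty\tc\n"]
bad_rows = find_bad_rows(sample, expected=3)
histogram = count_histogram(sample)
```

For a real file you would pass an open file object instead of the sample list, with `expected=36` in the case described above. The histogram is a quick way to see whether most rows agree on one count or whether the schema itself is off.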
One Star

Re: How to count the columns of a csv file?

Thanks. I will try.
One Star

Re: How to count the columns of a csv file?

Thanks, it seems OK.
A little bit hard, because 60000 rows are rejected and only 3 are OK ;-) after creating the schema with the wizard...
I will investigate.
One Star

Re: How to count the columns of a csv file?

Hi Alevy,
I am new to Talend and I have faced a similar issue to yours. I have a table with the columns name,addr1,addr2,company and I am passing the data in a CSV file as
name,addr1,addr2,company
q3,sdkjh,ad2
q5,sdkjh,ad2,c1,c2,c5
q8,sdkjh,ad2,c1
and the third row is inserted correctly after I checked "Check each row structure against schema". But when I connect the reject flow to a tFileOutputDelimited, I get:
q3,sdkjh,ad2,,,Column(s) missing - Line: 0
q5,sdkjh,ad2,c1,,Too many columns - Line: 1
It truncates the extra columns and fills the missing ones with blank values. I want the rejected data file to look like this:
q3,sdkjh,ad2Column(s) missing - Line: 0
q5,sdkjh,ad2,c1,c2,c5Too many columns - Line: 1
Is there any way to get the output in the above form? If not with "Check each row structure against schema", is there another approach I can use to write the rows with correct data to one CSV file and the wrong ones to another?
Thanking you
Nishanth
Seven Stars

Re: How to count the columns of a csv file?

I don't believe so.
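One possible workaround is to split the file yourself before Talend reads it, so the rejected lines keep their original content. This is just a sketch in Python under some assumptions: the function name is hypothetical, fields are not quoted, the header line has already been skipped, and the message wording and 0-based line index simply imitate the reject output quoted above. It puts a delimiter before the message, which the desired output in the question does not have.

```python
def split_by_column_count(lines, expected, delimiter=","):
    """Split data lines into (accepted, rejected).

    Accepted lines are returned unchanged. Rejected lines keep their raw
    content and get a message appended as one extra field.
    """
    accepted, rejected = [], []
    for index, line in enumerate(lines):
        raw = line.rstrip("\r\n")
        # Naive split: assumes no quoted fields containing the delimiter.
        count = len(raw.split(delimiter))
        if count == expected:
            accepted.append(raw)
        elif count < expected:
            rejected.append(f"{raw}{delimiter}Column(s) missing - Line: {index}")
        else:
            rejected.append(f"{raw}{delimiter}Too many columns - Line: {index}")
    return accepted, rejected


# Data rows from the question (header name,addr1,addr2,company skipped).
data = ["q3,sdkjh,ad2", "q5,sdkjh,ad2,c1,c2,c5", "q8,sdkjh,ad2,c1"]
accepted, rejected = split_by_column_count(data, expected=4)
```

The accepted lines can then be written to one file and fed to the job as usual, and the rejected lines to another file for review.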
One Star

Re: How to count the columns of a csv file?

Hi,
I have a similar question: I need to check how many columns my CSV file has and raise a reject when the count is wrong. Checking "Check each row structure against schema" in tFileInputDelimited does not work for me: the rows do go to the reject file, but without a proper error message such as "columns missing".
Is there any other solution for this?
Thanks