One Star

[resolved] Unparseable Date format in csv

Hello,
I am trying to read a csv file using talend open studio. All the string and date fields in the file are quoted. Sample Data:
1763,"abce","dfg","xyz","mjo","abcddrer","abc.com","Sunday","Overflow","11/29/2012","01/07/2013","80/443","4093151","5344"
The date fields come in quotes, when I try to read I get an error:
java.lang.RuntimeException: java.text.ParseException: Unparseable date: ""11/29/2012""
        at routines.TalendDate.parseDate(TalendDate.java:864)
        at routines.TalendDate.parseDate(TalendDate.java:808)
How to read this kind of file, I understand the error is coming due to quotes 2 times around the date.

Thanks in advance.
1 ACCEPTED SOLUTION

Accepted Solutions
One Star

Re: [resolved] Unparseable Date format in csv

Hi,
I found a solution. In date column of output file i mentioned
row1.M_MODIFIED_DT!=null?
row1.M_MODIFIED_DT.split("\\."):null        
and we will be having desired output
Thanks
shivaraaj
9 REPLIES
Community Manager

Re: [resolved] Unparseable Date format in csv

Hi
It is a CSV file, you need to check the csv options on tFileInputDelimited, and set the Éscape char as """, Text closure as """, read the date column with Date type, and set the pattern as "MM/dd/yyyy"
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: [resolved] Unparseable Date format in csv

My input is HDFS Input File and not tInputdelimited file, any work around this.
Mp
Community Manager

Re: [resolved] Unparseable Date format in csv

Read all data with string type, then remove the double quote on other component such as tMap, for example:
tHDFSInput--main--tMap_1--main--tMap_2--->tLogrow
on tMap_1:
row1.columnName.contains("\"")?row1.columnName.replaceAll("\"","")
afterwards,  parse the date column to a Date if needed on tMap_2, 
for example:
TalendDate.parseDate("MM/dd/yyyy",row1.columnName)
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: [resolved] Unparseable Date format in csv

Hi,
This does not work, now it throws new error:
Exception in component tMap_36
java.lang.RuntimeException: java.text.ParseException: Unparseable date: ""
        at routines.TalendDate.parseDate(TalendDate.java:864)
        at routines.TalendDate.parseDate(TalendDate.java:808)
This is what I have done to test it. In the tMap component, declared a Variable -- lv_my_date (datatype as String) in Var
say lv_my_date and gave value as row40.my_date.contains("\"") ? row40.my_date.replaceAll("\"","") : row40.my_date
Finally used TalendDate.parseDate("MM/dd/yyyy", Var.lv_my_date) and assigned it to my_date (datatype as Date)
I think it is replacing both the quotes while parsing the string, hence this error, but really not sure what is happening.

Mp
Community Manager

Re: [resolved] Unparseable Date format in csv

Hi 
There should be some empty value in this column, to resolve it,  you need to check if the value is empty or not before parsing it to a Date, for example:
Var.lv_my_date.equals("")?null:TalendDate.parseDate("MM/dd/yyyy", Var.lv_my_date)

//Set it to null if it is empty or you can also set it to a default value.

Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: [resolved] Unparseable Date format in csv

Hi,
This resolves the date but cannot read the file for the same reason: The input file is in hdfs and it is csv format.
One sample record is shown here:
1001,"Appl","Max","abc","bbc","110.100.120.34","abc.com, bbc.com","Self","Test Region","11/08/2012","05/05/2014","80/443",,"5678"
All the string fields are in quotes. Please notice value "abc.com, bbc.com"
job fails with reason--
Exception in component tMap_36
java.lang.RuntimeException: java.text.ParseException: Unparseable date: "Test Region"
As we understand due to comma, the field has moved one place. 
Is there any solution for this?
-mp
Community Manager

Re: [resolved] Unparseable Date format in csv

Hi
As we understand due to comma, the field has moved one place.

Read all data with string type from HDFS Input file and write the data to a file first, and then read the data back again with CSV format, for example:
tHDFSInput--main--tFileOutputDelimited
   |
onsubjobok
   |
tFileInputDelimited--main--tMap--main-->other business processing
on tFileInputDeimited: read the data back again with CSV option.
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: [resolved] Unparseable Date format in csv

Hi shong,
I'm facing the same kind of scenario in which my input csv(text file) contains C_Customer_id M_Modified_date. In M_Modified_date column the date and time is in 2015-10-07 15:33:24.653. I need to remove the millisecond from the time and some date field contains null values too. So in output csv i need the same null value. I tried 
null==row1.M_MODIFIED_DT?
null:TalendDate.parseDate("yyyy-MM-dd hh:mm:ss",row1.M_MODIFIED_DT)     
but i got error stating "Type mismatch cannot convert from date to string".
Input csv
C_Customer_id|M_Modified_date
1001|2015-10-07 15:33:24.653
1002|NULL
My output should be
C_Customer_id|M_Modified_date
1001|2015-10-07 15:33:24
1002|NULL
is there any possibility to convert in this format.
Thanks
shivaraaj
One Star

Re: [resolved] Unparseable Date format in csv

Hi,
I found a solution. In date column of output file i mentioned
row1.M_MODIFIED_DT!=null?
row1.M_MODIFIED_DT.split("\\."):null        
and we will be having desired output
Thanks
shivaraaj