tSchemaComplianceCheck with Excel Date

Hello,

I have to check the format of multiple source files (xls and xlsx). I built a job that reads all the files from a tFileList and puts their content in a buffer with every column defined as a String.

Then I read the buffer, and the flow enters the tSchemaComplianceCheck component, in which the date columns have to match the pattern "EEE MMM d HH:mm:ss zzz yyyy". I chose this pattern because, when I put a tLogRow before the tSchemaComplianceCheck, I noticed that the date columns from the Excel files come through like this: "Fri Mar 31 00:00:00 CEST 2017".

The job returns "wrong DATE pattern or wrong DATE data". Why? Is it because the pattern expects a date in French format?

 

If I put a tMap after reading my buffer, with an output column defined as Date and this expression to fill it: TalendDate.parseDateLocale("EEE MMM d HH:mm:ss zzz yyyy", row9.PF0006, "en"), it works! The only difference I can see is that parseDateLocale takes the "en" option, which says that the date is in English format.
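The locale difference described above can be reproduced in plain Java. The logged value looks like the output of java.util.Date.toString(), which always uses English day and month abbreviations. The sketch below assumes that the compliance check parses with the JVM's default locale (French here); that is a guess about the component, not something confirmed in this thread. The class DateLocaleCheck and its matches helper are hypothetical names used only for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

class DateLocaleCheck {
    // Pattern taken from the post.
    static final String PATTERN = "EEE MMM d HH:mm:ss zzz yyyy";

    // Hypothetical helper: true when `value` can be parsed with PATTERN
    // using the day and month names of the given locale.
    static boolean matches(String value, Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, locale);
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            // "Fri" and "Mar" are English abbreviations; a locale with
            // other day or month names cannot match them.
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "Fri Mar 31 00:00:00 CEST 2017";
        System.out.println("en: " + matches(value, Locale.ENGLISH));
        System.out.println("fr: " + matches(value, Locale.FRENCH));
    }
}
```

If the two results differ on your machine, the locale is the likely cause of the "wrong DATE pattern" message, and parsing with an explicit English locale (as parseDateLocale does) is the more robust option.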

 

Thanks to the community for any help!

1 ACCEPTED SOLUTION

Re: tSchemaComplianceCheck with Excel Date

Since Excel stores dates internally as serial numbers, I wonder if reading the date and converting it to a String is giving you Excel's internal representation rather than the actual date. That would explain why it worked after you explicitly parsed the field as a Date in Talend. Just a thought, YMMV.

David
2 REPLIES
Re: tSchemaComplianceCheck with Excel Date

Yes, but if I don't read the dates as strings, I can't check the date format with tSchemaComplianceCheck.

Anyway, it's the same problem for numeric values that come with a comma rather than a point, or with spaces between digits, depending on how the producer of the Excel file built it. I'm going to write Java routines to handle all of these issues.
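A routine of that kind could look roughly like the sketch below. It assumes that a comma is always a decimal separator (never a thousands separator) and that any whitespace, including non-breaking spaces, is only digit grouping. The class and method names are hypothetical, not an existing Talend routine.

```java
import java.math.BigDecimal;

class NumericCleanup {
    // Hypothetical routine: turns "1 234,56" into the BigDecimal 1234.56.
    // Returns null when the input is null, blank, or still not a number
    // after cleanup, so the caller can route the row to a reject flow.
    static BigDecimal parseLooseDecimal(String raw) {
        if (raw == null) {
            return null;
        }
        // Drop ordinary whitespace plus non-breaking and narrow non-breaking spaces.
        String cleaned = raw.replaceAll("[\\s\\u00A0\\u202F]", "");
        if (cleaned.isEmpty()) {
            return null;
        }
        // Assumption: a comma is the decimal separator.
        cleaned = cleaned.replace(',', '.');
        try {
            return new BigDecimal(cleaned);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLooseDecimal("1 234,56"));
        System.out.println(parseLooseDecimal("12.5"));
        System.out.println(parseLooseDecimal("abc"));
    }
}
```

In a tMap, such a routine could be called on the String column before the value is checked or converted. Files that use a comma as a thousands separator would need a different rule.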

As you are the only one to answer, it seems my problem did not raise much interest in the community, so I'll close this topic!