tSchemaComplianceCheck with Excel Date

Five Stars

Hello,

I have to check the format of multiple source files (xls and xlsx). So I built a job that reads all the files from a tFileList and puts their content in a buffer with every column defined as String.

Then I read the buffer and the flow enters a tSchemaComplianceCheck component, in which the date columns have to match the pattern "EEE MMM d HH:mm:ss zzz yyyy". I chose this pattern because, when I put a tLogRow before the tSchemaComplianceCheck, I noticed that date columns from the Excel files come through like this: "Fri Mar 31 00:00:00 CEST 2017".

The job returns "wrong DATE pattern or wrong DATE data". Why? Is it because the pattern expects the date in French format?

If I put a tMap after reading my buffer, with an output column defined as Date and this expression to fill it: TalendDate.parseDateLocale("EEE MMM d HH:mm:ss zzz yyyy", row9.PF0006, "en"), it works! The only difference I see is that parseDateLocale takes the "en" option, which says that the date is in English format.
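The locale effect described above can be reproduced in plain Java. The sketch below is only an illustration: the class name DateLocaleCheck and the helper matches are made up for this example, and it assumes (the thread does not confirm this) that tSchemaComplianceCheck parses dates with the JVM default locale, French on the poster's machine. It shows that the same pattern behaves differently depending on which locale supplies the day and month names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateLocaleCheck {

    // Hypothetical helper: returns true if 'value' can be parsed with
    // 'pattern' using the day and month names of the given locale.
    public static boolean matches(String value, String pattern, Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, locale);
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String pattern = "EEE MMM d HH:mm:ss zzz yyyy";
        String value = "Fri Mar 31 00:00:00 CEST 2017";

        // "Fri" and "Mar" are English abbreviations, so the locale matters.
        System.out.println("en: " + matches(value, pattern, Locale.ENGLISH));
        System.out.println("fr: " + matches(value, pattern, Locale.FRENCH));
    }
}
```

With a French locale the parse stops at "Fri", which is not a French day abbreviation, and that would be consistent with the "wrong DATE pattern" message if the component does rely on the default locale.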

 

Thanks to the community for any help!


Accepted Solutions
Eight Stars

Re: tSchemaComplianceCheck with Excel Date

Since Excel stores dates internally as numeric serial values, I wonder if reading the date and converting it to a String is giving you Excel's internal representation rather than the actual date. That would explain why it worked after you explicitly parsed the field as a Date in Talend. Just a thought, YMMV.

David

All Replies
Five Stars

Re: tSchemaComplianceCheck with Excel Date

Yes, but if I don't read the date as a string, I can't check the date format with tSchemaComplianceCheck.

Anyway, it's the same problem for numeric values, which come with a comma rather than a point, or with spaces between the digits, depending on how the producer of the Excel file built it. I'm going to write Java routines to handle all of these issues.

As you are the only one to answer, it seems that my problem did not raise much interest in the community, so I'll close this subject!
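A routine of the kind mentioned above might look like the following minimal sketch. The class name NumberCleaner and the method parseLoose are hypothetical, and the sketch assumes that a comma is always the decimal separator (never a thousands separator) and that spaces are only used for grouping.

```java
import java.math.BigDecimal;

public class NumberCleaner {

    // Hypothetical routine: turns strings such as "1 234,5" into a BigDecimal.
    // Assumption: a comma is a decimal separator, spaces are grouping only.
    public static BigDecimal parseLoose(String raw) {
        if (raw == null) {
            return null;
        }
        // Remove ordinary whitespace plus the non-breaking spaces
        // that spreadsheets often use as a thousands separator.
        String cleaned = raw.replaceAll("[\\s\\u00A0\\u202F]", "");
        if (cleaned.isEmpty()) {
            return null;
        }
        // Normalise the decimal comma to a point before parsing.
        cleaned = cleaned.replace(',', '.');
        // Throws NumberFormatException if the text is still not numeric.
        return new BigDecimal(cleaned);
    }
}
```

In a Talend job this could be called from a tMap expression on the String column; values that still fail to parse raise a NumberFormatException, which can be routed to a reject flow.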
