I have to check multiple source files in Excel format (xls and xlsx). So I built a job that reads all the files from a tFileList and puts their content in a buffer with all columns defined as String.
Then I read the buffer and the flow enters the tSchemaComplianceCheck component, in which date columns have to match the pattern "EEE MMM d HH:mm:ss zzz yyyy". I chose this pattern because, when I put a tLogRow before the tSchemaComplianceCheck, I noticed that date columns from the Excel files come through like this: "Fri Mar 31 00:00:00 CEST 2017".
The job returns "wrong DATE pattern or wrong DATE data". Why? Is it because the pattern expects a date in French format?
If I put a tMap after reading my buffer, with an output column defined as Date and this expression to fill it: TalendDate.parseDateLocale("EEE MMM d HH:mm:ss zzz yyyy", row9.PF0006,"en"), it works! The only difference I see is that parseDateLocale has the "en" option to indicate that the date is in English format.
Thanks to the community for any help!
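The locale hypothesis can be tested outside Talend with plain Java. This is only a sketch: it assumes the compliance check parses the pattern with the JVM's default locale, which is not confirmed in this thread, and the class and method names (LocaleDateCheck, matches) are made up for illustration. It parses the same value with the same pattern under an English and a French locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

class LocaleDateCheck {
    // Pattern taken from the job described above.
    static final String PATTERN = "EEE MMM d HH:mm:ss zzz yyyy";

    // Hypothetical helper: true when the value parses with PATTERN
    // using the day and month names of the given locale.
    static boolean matches(String value, Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, locale);
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Value as shown by the tLogRow.
        String value = "Fri Mar 31 00:00:00 CEST 2017";
        System.out.println("en: " + matches(value, Locale.ENGLISH));
        System.out.println("fr: " + matches(value, Locale.FRENCH));
    }
}
```

"Fri" and "Mar" are English abbreviations, so a parser set up with French names has nothing to match them against. If the job runs on a machine with a French default locale, that would be consistent with the tMap expression working only because of its "en" argument.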
Yes, but if I don't read the date as a string, I can't check the date format with tSchemaComplianceCheck.
Anyway, it's the same problem for numeric values that come with a comma rather than a point, or with spaces between digits, depending on how the producer of the Excel file built it. I'm going to write Java routines to handle all of these issues.
As you are the only one to answer, it seems that my problem did not raise much interest in the community, so I'll close this subject!
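For the numeric case, a routine along these lines might be a starting point. It is a rough sketch, not a tested Talend routine: the names NumberCleaner and parse are hypothetical, and it assumes that a comma is always the decimal separator and that spaces (including non-breaking ones) are only digit grouping.

```java
import java.math.BigDecimal;

class NumberCleaner {
    // Hypothetical helper: removes whitespace (plain, non-breaking and
    // narrow non-breaking spaces), then turns a decimal comma into a point.
    // Assumption: the comma is a decimal separator, never a thousands separator.
    static BigDecimal parse(String raw) {
        String cleaned = raw.replaceAll("[\\s\\u00A0\\u202F]", "")
                            .replace(',', '.');
        // Throws NumberFormatException if the cleaned text is still not numeric.
        return new BigDecimal(cleaned);
    }

    public static void main(String[] args) {
        System.out.println(parse("1 234,56"));
        System.out.println(parse("12.5"));
    }
}
```

A value such as "1,234.56", where the comma groups thousands, would fail under this assumption, so files that mix both conventions would need a rule per producer.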