Excel Dates - is there an auto-magic way of loading Excel files

Hi,
having some problems with Excel and its dates format on Talend 4 using java jobs.
how should i do to import Excel files with java jobs.
thaks
Nicolas
20 REPLIES

Re: Excel Dates - is there an auto-magic way of loading Excel files

bumping !!
any suggestions at all?
really stuck here
this is a sample of the data i get when reading various Excel files that need to be processed into DB:
this is by setting the schema as String and loading into Talend only - not trying to convert as Dates

|excelDates |counting|
|12-Jul-2010 |376 |
|23/05/2010 12:03:00|3 |
|15/04/2010 10:30:00|3 |
|28/04/2010 11:35:00|3 |
|29/05/2010 10:55:00|3 |
|25/05/2010 04:38:00|3 |
|6/15/10 17:05 |3 |
|21/05/2010 09:30:00|3 |
Smiley Sad

Re: Excel Dates - is there an auto-magic way of loading Excel files

with a Perl job, you would just need to include the Date::Manip module and then in a tmap output table put this code in to "normalize" the dates:
Date::Manip:Smiley Very HappyateCalc( $row1 , "+ 0 Day");
Date::Manip is smart enough to interpret the different formats and return the date and time in a consistent format.

I'm not sure if there's a Java equivalent to this solution, maybe one of the Java guru's here can chime in.
One Star

Re: Excel Dates - is there an auto-magic way of loading Excel files

Hi Nicolas,
Try this one http://www.talendforge.org/forum/viewtopic.php?pid=42809#p42809
Hope you could get tips on that thread.

Re: Excel Dates - is there an auto-magic way of loading Excel files

thanks guys,
but the problem is rather more suttle
if i open the Excel file and 'format' the column with a custom format 'dd/mm/yyyy hh:mm:ss'
Talend will read this columns correctly and output:
19/07/2010 16:07:00
19/07/2010 17:31:00
19/07/2010 12:46:00
19/07/2010 18:38:00
19/07/2010 14:34:00
19/07/2010 17:55:00
19/07/2010 17:58:00
19/07/2010 21:09:00
19/07/2010 23:06:00
19/07/2010 19:01:00
19/07/2010 06:49:00
19/07/2010 04:48:00
19/07/2010 09:46:00

HOWEVER, if i format this with 'dd/MM/yyyy' in Excel, Talend now reads
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
19/07/2010
any chance of having REAL datetime value of this column?
thanks
Community Manager

Re: Excel Dates - is there an auto-magic way of loading Excel files

Hi
If you don't know the format, read it as string on tFileInputExcel component, and then convert it to date if needed on tMap based on different date pattern.
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business

Re: Excel Dates - is there an auto-magic way of loading Excel files

thanks shong
i am reading it as a string without any convertion/casting in Talend.
the formatting that i mentioned above is done in Excel.
this is why i am puzzled with this problem.
Talend seems to be reading the dates FORMATTED instead of the values held in the cells.
am i right in my assumption?
thanks

Re: Excel Dates - is there an auto-magic way of loading Excel files

viewing Excel sheets with Excel can be misleading. To simplify the problem, try exporting the file to csv and examining the date format in the resulting file.

Re: Excel Dates - is there an auto-magic way of loading Excel files

thanks John,
but i have these files uploaded to our system, and i am tying to find a way of processing it.
however, i can not open large number of files daily.
that is why i need to find a solution for converting dates.
Community Manager

Re: Excel Dates - is there an auto-magic way of loading Excel files

hi
Go to advanced setting and check box 'Don't validate the cells'.
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business

Re: Excel Dates - is there an auto-magic way of loading Excel files

thanks for taking the time shong,
at first i though you were talking about a newer version of Talend then the one i have installed.
but maybe i have come across a bit of a bug on the UI
when i add my Excel connection, it does not show the 'Don't validate the cells' on the Advanced Tab.
it seems that when create an Excel input from the repository something does not work correctly.
*BUT* if i drag an excel input from the pallete it shows option on the advanced tab.
after i clicked every possible thing on this component - it seems that if you click on the 'use excel 2007' the un-click it.
it resets the Advanced Tab and it then shows all options as expenceted.
please see screenshot..
i will test the Excel conversion and report back
thanks,
Nicolas
Community Manager

Re: Excel Dates - is there an auto-magic way of loading Excel files

Hi
I am using TIS/TOS402. You can try the latest version.
Best regards
Shong
----------------------------------------------------------
Talend | Data Agility for Modern Business

Re: Excel Dates - is there an auto-magic way of loading Excel files

hi Shong,
i have used:
TOS 4.0.2 r43696
TOS 4.1.0.M2 r45078
on windows 7 x64
>java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
in both installations the problem appears.
however, if you close the installation, this problem disapears on the existing Excel connection - BUT any new connection wil have the same problem.
with regards,
Nicolas

Re: Excel Dates - is there an auto-magic way of loading Excel files

problem persists

i have tried to go advanced setting and check box 'Don't validate the cells'
but it still gives me wrong dates.
is there anything that i can do?
run a VBA in VB.NET - it is either that or ditching the whole Talend project and restarting with SSIS.
sorry for keeping asking this but i am really stuck with this problem.
many thanks,
Nicolas


.--------------------------------------------+---------------------+---------.
| output |
|=-------------------------------------------+---------------------+--------=|
|filename |Arrival_Date_and_Time|myArrival|
|=-------------------------------------------+---------------------+--------=|
|Daily_Report_30_July_-_1_August_2010.xls|8/1/10 9:45 |null |
|Daily_Report_30_July_-_1_August_2010.xls|8/1/10 9:49 |null |
|Daily_Report_30_July_-_1_August_2010.xls|8/1/10 9:57 |null |
|Daily_Report_3_August_2010.xls |null |null |
|Daily_Report_3_August_2010.xls |03-Aug-2010 |null |
|Daily_Report_4_August_2010.xls |null |null |
|Daily_Report_4_August_2010.xls |04-Aug-2010 |null |

Re: Excel Dates - is there an auto-magic way of loading Excel files

anyone??
Six Stars

Re: Excel Dates - is there an auto-magic way of loading Excel files

Well, it must be said that Excel is not designed for consistency like a database, so if possible must be avoided as a data source for data processing because it usually a mess...
Anyway you can try accepting whatever Talend is giving to you and then parse it to a pure Date object with some custom code that guess the date string arrangement
....and.... if you want more help you should post the source excel file that you are using.
Four Stars

Re: Excel Dates - is there an auto-magic way of loading Excel files

I'm having the same issue. It seems that when the excel file is saved in Open Office (Linux Ubuntu) it saves the date format as
dd MM yy hh:mm:ss
however, if the excel file is saved in Windows, it saves it as
M/dd/yy
or
MM/dd/yy
not being consistent at all.
Any suggestions on fixing this?

Re: Excel Dates - is there an auto-magic way of loading Excel files

dont use excel to store or transport data. If you do, expect some pain. If possible, get your data in CSV format-- usually you just have to beg the accountants not to save in xls format Smiley Wink

if you have no choice, start by poking yourself in the eye. it will make the pain of dealing with excel seem less by comparison.
That said, The best approach is probably to read the date fields as strings, make them more consistent with a tJavaRow and then convert to Date.
you'll probably have to write some ugly logic to make guesses about what the format should be-- such as if( theDate.indexOf("/") ) to see if you have the "linux format" or the windows one. wont be pretty-- but working with excel rarely is Smiley Sad
Four Stars

Re: Excel Dates - is there an auto-magic way of loading Excel files

Thank you for your reply, and everything you said pretty much hit the nail in the head in my situation, and how I decided to resolve it. Writing the logic in code as oppose to reading it in the schema.
Unfortunately, in the finance world they sure love their excel...
Cheers!
One Star

Re: Excel Dates - is there an auto-magic way of loading Excel files

thanks for taking the time shong,
at first i though you were talking about a newer version of Talend then the one i have installed.
but maybe i have come across a bit of a bug on the UI
when i add my Excel connection, it does not show the 'Don't validate the cells' on the Advanced Tab.
it seems that when create an Excel input from the repository something does not work correctly.
*BUT* if i drag an excel input from the pallete it shows option on the advanced tab.
after i clicked every possible thing on this component - it seems that if you click on the 'use excel 2007' the un-click it.
it resets the Advanced Tab and it then shows all options as expenceted.
please see screenshot..
i will test the Excel conversion and report back
thanks,
Nicolas

check this vba samples
http://vb.net-informations.com/excel-2007/vb.net_excel_2007_tutorials.htm
Hope it will help you.
wills.
One Star

Re: Excel Dates - is there an auto-magic way of loading Excel files

Hello guys,
Regarding importing date from excel files, i just consulted an experience user which gave really useful hints as
follows:
Just add single inverted comma ' to make it text while it's in date format.
It then will have the date converted to date string instead of 4XXXX numbers.
And then u can import the excel / csv in whatever dd-mm-yy format it is.
Clement