One Star

[resolved] Cannot read from file in bulk load windows version

Hello,
I am fairly comfortable with Talend after using it for a wide variety of things over the past months, but now I am stumped on reading from a file. I have looked at various forums online, both Java and Talend based, and have been unable to solve this issue.
So basically I am transferring data from one database to another via an unload into a file first. For this unload AND load I use the tGreenplumOutputBulkExec component, which I have used many times before without problems on a Linux-based system.
One of the benefits of this component is that it both writes the data into the file and then immediately reads it back for inserting into the database. This means that the file MUST exist, assuming that the data input doesn't fail. Yet it tells me the file does not exist and therefore cannot be read (to clarify, the file is created successfully).
The file expressions I have tried without success are as follows:
"X:/path/to/file.csv" <-- returns error - file does not exist.
"X:\\path\\to\\file.csv" <-- returns incorrect path (X: pathtofile.csv - no idea where backslashes are going...).
"X:\path\to\file.csv" <-- returns error - invalid escape sequence (but I know why at least).
Any help on this matter would be highly appreciated. It's a rather aggravating bug!
Edit: It isn't an access problem: I checked this by logging onto my Linux machine and reading the same file via a mounted shared folder, and it read okay, so now I'm more lost.
Edit 2: I just tried the path "X:\\\\path\\\\to\\\\file.csv" and this reads back as "X:\path\to\file.csv" (finally found how to get a backslash to work!!) yet this still returns a 'file not found' error.
Edit 3: This is becoming more nightmarish, haha! Not sure why I even tried this, but I used a tFileArchive component to see if I could archive any of the files I've been creating as tests. It grabbed them all! I used the same file path, copy-pasted from the greenplum component, which itself still complains that the file doesn't exist. Anyone know what I could possibly be doing wrong?
Edit 4: Okay, now I think I may have stumbled across a bug. Maybe I will raise an issue for this if someone from Talend can confirm there's no solution? If I output to a tFileOutputDelimited, I can then read that file with other components, such as tFileInputDelimited and tFileArchive, and I can send that data through a normal output to my database (but this is slow, so I want to use bulk outputs). I copy-pasted the file path FROM the greenplum component and the other components read the file; I copy it back: 'file does not exist'. What is going on?
Edit 5: Just tried using the file path "/path/to/file.csv" for a test file (to clarify, the file is under C:/file/to/path of course) and that too returned a 'file does not exist' error (for tGreenplumOutputBulkExec and tGreenplumBulkExec). I tried the same thing with a tFileInputDelimited and tFileArchive as a control and they worked correctly as before. I'm now becoming more certain these components do not work under a Windows environment, as I've tested them every way I can think of so far.
Edit 6: I've just tried a test using PSQL components as replacements (since Greenplum is PSQL under the covers) and the initial tPsqlOutput did work - So I was right on that bit. But when I tried using bulk components to read from files the 2 components returned 'file does not exist' errors again...
Edit 7: I have just very quickly reproduced this problem on a different Windows PC, having also needed to do a full install of Talend. Any bulk component reading from a file gives a 'file does not exist' error, no matter what type of file path syntax I use ( / , \ , \\ , \\\\).
Edit 8: Solved. It was my own absolutely stupid, and rather impressively accidental, coincidence. Please see my last post below if you wish to know what happened. Read it all for the full effect of the story that unfolds within a day and a half of impressive stress-testing work.
Just in case it is important: the first PC runs Windows Server 2008, the second Windows 7 Professional, and the Mac on which these components worked previously (and still do) runs Mac OS 10.X.
I confirm that the errors are occurring for tGreenplumBulkExec and tGreenplumOutputBulkExec (and, as tested above, the similar Psql components). Also, I am still unable to use tGpload at all, but that's a whole other topic.
Regards,
V Pem.
18 REPLIES
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Actually, the first one looked good to me.
That's the syntax I use for all my outputs!
So if you're sure of your path, the problem must be somewhere else.
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Well, that's what I thought. It should work, and I found other posts where people say the same thing, but for me any component I use says the file doesn't exist, including the component that created the file initially.
It isn't an access problem: I checked this by logging onto my Linux machine and reading the same file, and it read okay, so now I'm more lost.
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

You can always use / instead of \ . The JVM translates it into the necessary characters.
If you use a UNIX-style path under Windows (without a drive letter), it is resolved against the current drive, which is usually C:.
One Star

Re: [resolved] Cannot read from file in bulk load windows version

I've already tried '/' as standard, and that returned a 'file doesn't exist' error, unfortunately. Also, I am writing files to an ETL server via a shared NTS folder which, because of Windows, is mounted as its own drive (X in my case). Having said that, I will try a simple test on a file under C:, leave out the drive letter, and see if that has any effect. Maybe the components are Linux-only (for whatever reason) and so cannot read the 'C:' part of the file path... I might as well test the entire durability of the component for Talend whilst I'm at it, haha!
Thanks for your input jlolling.
Regards,
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

Talend does not have any Linux-only components at all.
I use this method for a number of customers without any problems. Only when you hand a path over to a native Windows program can you not use / instead of \, because native Windows programs generally do not translate it.
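As a rough illustration of the point above, a path written with forward slashes can be converted to backslashes before it is handed to a native Windows program. This is only a sketch: the class and method names are made up for this example and are not part of Talend.

```java
final class WindowsPaths {
    // Replace every forward slash with a backslash, e.g. for a path that
    // will be passed as an argument to a native Windows executable.
    // (Hypothetical helper, not a Talend API.)
    static String toWindowsSeparators(String path) {
        return path.replace('/', '\\');
    }

    public static void main(String[] args) {
        // Prints X:\path\to\file.csv
        System.out.println(toWindowsSeparators("X:/path/to/file.csv"));
    }
}
```

Note that this only changes the notation; it does not make a path valid on a machine where the file is not visible.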
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Just tried using the file path "/path/to/file.csv" for a test file (to clarify, the file is under C:/file/to/path of course) and that too returned a 'file does not exist' error (for tGreenplumOutputBulkExec and tGreenplumBulkExec). I tried the same thing with a tFileInputDelimited and tFileArchive as a control and they worked correctly as before. I'm now becoming more certain these components do not work under a Windows environment, as I've tested them every way I can think of so far.
I've just tried a test using PSQL components as replacements (since Greenplum is PSQL under the covers) and the initial tPsqlOutput did work, so I was right on that bit. But when I tried using bulk components to read from files, the 2 components returned 'file does not exist' errors again... Is it possible there is some sort of Java control file that is not installed correctly with Talend or something?
Regards,
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

That's what I told you. It does not work if you hand such a path over to a native Windows program. Only Java programs can deal with it, which means you can use it in all components written in pure Java.
One Star

Re: [resolved] Cannot read from file in bulk load windows version

I have just very quickly reproduced this problem on a different Windows PC, having also needed to do a full install of Talend. Any bulk component reading from a file gives a 'file does not exist' error, no matter what type of file path syntax I use ( / , \ , \\ , \\\\).
Just in case it is important: the first PC runs Windows Server 2008, the second Windows 7 Professional, and the Mac on which these components worked previously (and still do) runs Mac OS 10.X.

Regards,
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

Sure. If you send parameters to a native Windows program, as the bulk components under Windows usually do, you have to use the original Windows path notation.
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Which is what, then? This is obviously where I am going wrong, but I have tried everything, including backslashes (Windows notation '\') and forward slashes (UNIX notation '/'). Neither works. Am I misunderstanding? I apologise if I am.
As a question to all: if I am trying to read from a file with a bulk component in the following shared folder, how would you type that file path (exactly!) in the Talend bulk component?
X:\path\to\file\data.csv
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Please read the post above this one, if you haven't, to see if you can answer my previous question.
I have now tried swapping out the directory path entirely for a directory context variable, so that I can avoid all responsibility for using bad notation. Still doesn't work.
If anyone could please, please, please help, this would be amazing, as this is now a critical issue for me!!
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

Do you set this path as a String directly in the component?
Anyway: you have to set it this way:
"X:\\path\\to\\file\\data.csv"
The reason is: in Java (and a lot of other languages too), a \ followed by certain letters is an escape sequence for a non-printable or whitespace character, for example \t for a tab or \n for a newline.
If you write \\, you escape the \ itself, so it loses its meaning as the start of an escape sequence and stands for a single literal backslash.
For an UNC path you have to do this:
"\\\\myhost\\my_share"
Please show me your job design if you have further questions and it would be great to know the exact error message.
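To make the escaping rule above concrete, here is a small sketch in plain Java. The class name is made up for illustration; the two paths are the ones quoted in this post.

```java
final class BackslashEscapes {
    // In Java source, "\\" stands for ONE backslash character at runtime.
    static final String DRIVE_PATH = "X:\\path\\to\\file\\data.csv";

    // A UNC path starts with two backslashes, so the source needs four.
    static final String UNC_PATH = "\\\\myhost\\my_share";

    public static void main(String[] args) {
        System.out.println(DRIVE_PATH); // X:\path\to\file\data.csv
        System.out.println(UNC_PATH);   // \\myhost\my_share
    }
}
```

The doubled backslashes exist only in the source code; the String that reaches the file system or the database contains single backslashes.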
One Star

Re: [resolved] Cannot read from file in bulk load windows version

For the context, I hit Ctrl+Space in the file name field to select it. My path was then:
context.Directory+"file.csv"
If I do \\ both backslashes disappear. If I do \\\\ it gets replaced with \ (which is what I want), but it still fails to find the file.
It is an NFS shared folder. From Computer, the drive name looks like this:
outputs\ (\\ip_address\directory1\directory2) (X: )
I tried to attach 2 files (only 1 went through; for some odd reason the other upload fails).




I hope you can read stuff from these pictures. But generally you can see the filename on the left, and on the right you can see that the data was transferred correctly (initially, which means the file was created, but then it claims it does not exist after creating it).
I'll copy paste the error here directly for easier readability:
Exception in component tGreenplumOutputBulkExec_1_tPBE
org.postgresql.util.PSQLException: ERROR: could not open file "X:\directory1\directory2\file.csv" for reading: No such file or directory
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:328)
    at test.test_0_1.Test.tMSSqlInput_1Process(Test.java:2309)
    at test.test_0_1.Test.runJobInTOS(Test.java:2614)
    at test.test_0_1.Test.main(Test.java:2471)
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

I do not think it is an issue with the file path notation, because the error message shows a valid file path.
You call the Greenplum bulk process from your own computer, but the file path must be valid for the process you call, which means on the database server! The bulk importer almost certainly runs on a different machine, and the file must be readable on that database machine.
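The PSQLException in the trace above ("could not open file ... for reading") is the kind of error a PostgreSQL-style COPY ... FROM 'file' statement produces, and in PostgreSQL that form makes the server open the file itself, whereas COPY ... FROM STDIN reads rows sent over the client connection. The sketch below only builds the two statement texts to show the difference. It assumes the bulk component ends up issuing such a COPY; the class, helper names and table name are hypothetical, and this is not the code Talend generates.

```java
final class CopyStatements {
    // Quote a value as a SQL string literal by doubling single quotes.
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Server-side form: the database server opens this path itself,
    // so the path must exist on the database host, not on the client.
    static String copyFromServerFile(String table, String serverPath) {
        return "COPY " + table + " FROM " + literal(serverPath) + " WITH CSV";
    }

    // Client-side form: the server reads rows that the client streams
    // over the existing connection, so no server-visible file is needed.
    static String copyFromStdin(String table) {
        return "COPY " + table + " FROM STDIN WITH CSV";
    }

    public static void main(String[] args) {
        // "my_table" is a placeholder table name for this example.
        System.out.println(copyFromServerFile("my_table", "/etldata/file.csv"));
        System.out.println(copyFromStdin("my_table"));
    }
}
```

With the first form, a path such as X:\directory1\directory2\file.csv only means something to the client machine, which would explain the "No such file or directory" reported by the server.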
 
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Oh?? I've used it without problems on my Mac for over a month, and the server is in a different country from me. Is there any particular reason it does this only on Windows? And if it really isn't meant to work the way I've been using it, then what is the purpose of bulk components for servers, since the file is created locally?
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

I am not familiar with the Greenplum bulk importer; I work with the MySQL and Oracle equivalents. But usually such jobs only work on the database server itself, because the importer needs access to the file on its own file system.
In your previous test scenario, was the database perhaps on your own machine?
One Star

Re: [resolved] Cannot read from file in bulk load windows version

Oh my god!! I am so sorry, jlolling! I have been accidentally 'stupid' this whole time!
So you said a bulk load doesn't work when run from a local machine against a server. True, but I didn't realise it for a fun reason: by PURE accident, I had mounted the shared folder of the ETL server on my Mac under the exact same directory name that the database server uses for its own mount of that shared folder. So the local file path was the same as the path on the server, hence it worked...
But of course, I cannot mount the share on a Windows PC as the /etldata folder. It has to be a drive, so it will not work the same way. I am so sorry for all of this! I overlooked something so simple and caused mayhem. Thank you for all your input nevertheless; this has been a great learning curve for me!
Very best regards,
V Pem.
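The mix-up described above comes down to the same shared folder being visible under different paths on different machines. The following sketch just illustrates that mapping. It assumes X:\ on the Windows client and /etldata/ on the database server point at the same share; the class and method names are hypothetical, and it is not something the Talend bulk components do for you.

```java
final class MountMapping {
    // Translate a path under the client's mount point into the path the
    // database server would use for the same file on the shared folder.
    static String toServerPath(String clientPath, String clientRoot, String serverRoot) {
        if (!clientPath.startsWith(clientRoot)) {
            throw new IllegalArgumentException("Not under " + clientRoot + ": " + clientPath);
        }
        // Keep the part below the mount point and switch to forward slashes.
        String relative = clientPath.substring(clientRoot.length()).replace('\\', '/');
        return serverRoot + relative;
    }

    public static void main(String[] args) {
        // Assumption: X:\ (client) and /etldata/ (server) are the same share.
        System.out.println(toServerPath("X:\\file.csv", "X:\\", "/etldata/")); // /etldata/file.csv
    }
}
```

On the Mac the two paths happened to be identical, so no translation was needed and the problem stayed hidden.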
Seventeen Stars

Re: [resolved] Cannot read from file in bulk load windows version

You are welcome.