One Star

TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi everybody,
I've created a data integration job to connect to a MapR distribution.
I have two components, a tHDFSList and a tJava (to print the filename); you can see them in the attached picture.
I configured the tHDFSList and ran the job, but here is my problem:
I think there is a memory leak. The java.exe process grows to 3 GB and then my computer crashes.
I tried leaving the different parameters empty, and the behavior is the same.
I changed the distribution and tried with another one: no problem. I get an error like "server timeout", but that is expected.
I don't understand the problem with the MapR distribution.
You can find a screenshot of the component in an attached picture.
Please help.
For information, I had a first problem with MapR, an UnsatisfiedLinkError from the MapR client, but I resolved it with a solution from another topic (I had to install a MapR client and declare the DLL in the PATH environment variable).
Regards,
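Editor's note: the UnsatisfiedLinkError fix described above depends on the MapR client DLL being found through PATH. A minimal sketch for checking this from Java is shown below. The class name `NativeLibLocator` and the method `findOnPath` are hypothetical helpers written for this illustration, and the default file name `MaprClient.dll` is taken from the spelling used in this thread; adjust it if your client ships the file under a different name.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Talend or MapR): lists every copy of a
// file that can be found in the directories of a PATH-style string.
public class NativeLibLocator {

    /** Returns each existing file named fileName found in the directories of pathList. */
    public static List<File> findOnPath(String pathList, String fileName) {
        List<File> hits = new ArrayList<File>();
        if (pathList == null || pathList.isEmpty()) {
            return hits;
        }
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: the DLL name as written in this thread.
        String dll = args.length > 0 ? args[0] : "MaprClient.dll";
        List<File> hits = findOnPath(System.getenv("PATH"), dll);
        if (hits.isEmpty()) {
            System.out.println(dll + " was not found in any PATH directory");
        }
        for (File hit : hits) {
            System.out.println("Found: " + hit.getAbsolutePath());
        }
    }
}
```

If more than one copy is listed, the first PATH entry normally wins on Windows, which may matter when several client versions are installed side by side.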
6 REPLIES
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi
What OS and JVM are you using? I'd be surprised if this were a general issue, as it would most likely have been reported before.
The documentation highlights the need to have MapR available on the PATH:
https://help.talend.com/search/all?query=tHiveClose&content-lang=en
Ciaran
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi,
Thanks Ciaran for your reply.
I'm using Windows 7 and Java 1.7.
I tried a Hive scenario (connection + input + close) and it works fine.
But when I try with tHDFS (or tPigLoad) I have the same problem: the allocated memory keeps increasing until the crash.
Is it a problem with the MaprClient.dll? I ask because I agree it is surprising that nobody has reported the problem.
Thank you for your help
Regards,
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi,
I just tried changing the MaprClient.dll.
I have the memory leak with the MaprClient.dll from mapr-client-2.0.0.
I downloaded the MaprClient.dll from mapr-client-2.1.2 and I have a new error:
Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.UnsatisfiedLinkError: com.mapr.fs.MapRClient.initSpoofedUser(Ljava/lang/String;ILjava/lang/String;I)I

Is this a new problem related to the version of the MaprClient.dll?
Thank you

Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello,
The second error is a mismatch between your mapr client and the maprfs library. Talend doesn't support MapR 2.1.2 yet.
The first one comes from the dll as far as I know. Did you try to get help from MapR forums?
Regards,
Rémy.
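Editor's note: one possible way to test the idea that the growth comes from the DLL rather than from Java code is to print the JVM heap figures while the job runs and compare them with the size of java.exe in the Task Manager. This is only a sketch: `HeapProbe` and its methods are hypothetical names, and the interpretation is an assumption, namely that a small, stable heap next to a growing process points to memory allocated outside the Java heap.

```java
// Hypothetical helper: reports Java heap usage using only java.lang.Runtime.
public class HeapProbe {

    /** Bytes of heap currently in use (committed heap minus free heap). */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary of used, committed, and maximum heap in megabytes. */
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return String.format("heap used=%d MB, committed=%d MB, max=%d MB",
                usedHeapBytes() / mb, rt.totalMemory() / mb, rt.maxMemory() / mb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The body of `describe()` uses only standard JDK calls, so the same lines could be pasted into a tJava component to log the heap on each iteration.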
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello Rémy,
Thanks for your reply.
I understand the problem with version 2.1.2.
So the memory leak with MaprClient 2.0.0 is a MaprClient problem.
I downloaded the 1.2.0 MaprClient (compatible with Talend).
I no longer have the memory leak.
But it's still not OK...
Now I have this error:
"Some error on socket 940
2013-03-20 14:36:55,3887 ERROR Cidcache fs/client/fileclient/cc/cidcache.cc:1047 Thread: -2 Lookup of volume mapr.cluster.root failed, error Unknown error(108), CLDB: *********:8020 trying another CLDB
Some error on socket 940
2013-03-20 14:36:56,4768 ERROR Client fs/client/fileclient/cc/client.cc:226 Thread: -2 Failed to initialize client for cluster ****************:8020, error Unknown error(108)
Exception in component tHDFSList_1
java.io.IOException: Could not create FileClient
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:192)
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:202)
at com.mapr.fs.MapRFileSystem.listMapRStatus(MapRFileSystem.java:514)
at com.mapr.fs.MapRFileSystem.listStatus(MapRFileSystem.java:558)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:834)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:859)
at p_dtp_input.test_0_1.test.tHDFSList_1Process(test.java:443)
at p_dtp_input.test_0_1.test.tHDFSConnection_1Process(test.java:362)
at p_dtp_input.test_0_1.test.runJobInTOS(test.java:781)
at p_dtp_input.test_0_1.test.main(test.java:649) "

Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello,
This error means that your CLDB is not started because of a volume issue. Is your CLDB started on the cluster side? Are you able to browse the following web page: http://CLDB_HOSTNAME:8443 ?
Rémy
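Editor's note: the check suggested above can also be done from the client machine with a plain TCP connection test. The sketch below is an illustration only: `PortCheck` and `canConnect` are hypothetical names, `CLDB_HOSTNAME` is the placeholder from the reply, and the default ports are the two that appear in this thread (8020 in the error log, 8443 for the web page). A successful connection only shows that something is listening on the port, not that the CLDB itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: tests whether a TCP port accepts connections.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: replace the placeholder with the real CLDB host name.
        String host = args.length > 0 ? args[0] : "CLDB_HOSTNAME";
        int[] ports = {8020, 8443}; // ports mentioned in this thread
        for (int port : ports) {
            String state = canConnect(host, port, 3000) ? "reachable" : "NOT reachable";
            System.out.println(host + ":" + port + " is " + state);
        }
    }
}
```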