TOS 5.2.1 Big Data Distribution Mapr THDFS

One Star

TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi everybody,
I've created a data integration job to connect to a MapR distribution.
It has two components, a tHDFSList and a tJava (to print the file name); you can see them in the attached picture.
I configured the tHDFSList and ran the job, but here is my problem:
I think there is a memory leak: the java.exe process grows to about 3 GB and then my computer crashes.
I tried leaving the various parameters empty, and the behavior is the same.
I switched to another distribution and the problem disappears: I only get an error like "server timeout", which is expected.
I don't understand what goes wrong with the MapR distribution.
A screenshot of the component settings is in the attached picture.
Please help!
For information, I first hit another problem with MapR, an UnsatisfiedLinkError for the MapR client, but I resolved it with the solution from another topic (I had to install a MapR client and declare its DLL in the PATH environment variable).
Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi
What OS and JVM are you using? I'd be surprised if this were a general issue, as it would most likely have been reported before.
The documentation highlights the need to have the MapR client available on the PATH:
https://help.talend.com/search/all?query=tHiveClose&content-lang=en
Ciaran
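
To make the PATH check concrete, here is a minimal sketch that prints the OS and JVM details asked for above and lists every copy of the client DLL found in the PATH directories. The class name NativeLibCheck and the method findOnPath are hypothetical, the default file name MaprClient.dll is simply the spelling used in this thread, and the sketch is not part of Talend or MapR. It only inspects PATH and does not try to load the library.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Talend or MapR.
public class NativeLibCheck {

    /** Returns every file named fileName found in the directories of pathValue. */
    public static List<File> findOnPath(String pathValue, String fileName) {
        List<File> hits = new ArrayList<File>();
        if (pathValue == null || pathValue.isEmpty()) {
            return hits;
        }
        // File.pathSeparator is ";" on Windows and ":" on Unix-like systems.
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: the file name is the one mentioned in this thread.
        String lib = args.length > 0 ? args[0] : "MaprClient.dll";

        System.out.println("OS:   " + System.getProperty("os.name")
                + " (" + System.getProperty("os.arch") + ")");
        System.out.println("Java: " + System.getProperty("java.version"));

        List<File> hits = findOnPath(System.getenv("PATH"), lib);
        if (hits.isEmpty()) {
            System.out.println(lib + " was not found in any PATH directory.");
        }
        for (File hit : hits) {
            System.out.println("Found: " + hit.getAbsolutePath());
        }
    }
}
```

If more than one copy is listed, it may be worth checking which client version each one belongs to, since the first directory on the PATH is normally searched first.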
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi,
Thanks Ciaran for your reply.
I'm using Windows 7 and Java 1.7.
I tried a Hive scenario (connection + input + close) and it works fine.
But when I try with tHDFS (or tPigLoad) I get the same problem: the allocated memory keeps increasing until the crash.
Is it a problem with the MaprClient.dll? I ask because I agree it is surprising that nobody else has reported the problem.
Thank you for your help
Regards,
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi,
I just tried changing the MaprClient.dll.
I get the memory leak with the MaprClient.dll from mapr-client-2.0.0.
I downloaded the MaprClient.dll from mapr-client-2.1.2 and now I get a new error:
Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.UnsatisfiedLinkError: com.mapr.fs.MapRClient.initSpoofedUser(Ljava/lang/String;ILjava/lang/String;I)I

Is this a new problem related to the version of the MaprClient.dll?
Thank you

Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello,
The second error is a mismatch between your mapr client and the maprfs library. Talend doesn't support MapR 2.1.2 yet.
The first one comes from the dll as far as I know. Did you try to get help from MapR forums?
Regards,
Rémy.
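
One way to gather evidence for the "comes from the dll" hypothesis is to compare the JVM heap figures with the size of java.exe shown in Task Manager. The sketch below only reports the Java heap; the class name HeapSnapshot is hypothetical and it is not part of Talend or MapR. If the heap figures stay small while the process keeps growing, the growth is happening outside the Java heap, which would be consistent with a native library, though it does not prove it.

```java
// Hypothetical helper, not part of Talend or MapR.
public class HeapSnapshot {

    /** Bytes of Java heap currently in use (committed minus free). */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line summary of heap usage in megabytes. */
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        return "heap used=" + (usedHeapBytes() / mb) + " MB"
                + ", committed=" + (rt.totalMemory() / mb) + " MB"
                + ", max=" + (rt.maxMemory() / mb) + " MB";
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a snapshot once per second for five seconds.
        for (int i = 0; i < 5; i++) {
            System.out.println(describe());
            Thread.sleep(1000);
        }
    }
}
```

The same two Runtime calls could be printed from the job itself while it runs, so the numbers can be read next to the process size reported by the operating system.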
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello Rémy,
Thanks for your reply.
I understand the problem with version 2.1.2.
The memory leak with MaprClient 2.0.0 is a MaprClient problem.
I downloaded the 1.2.0 MaprClient (compatible with Talend).
I no longer have the memory leak.
But it's still not OK...
Now I get this error:
"Some error on socket 940
2013-03-20 14:36:55,3887 ERROR Cidcache fs/client/fileclient/cc/cidcache.cc:1047 Thread: -2 Lookup of volume mapr.cluster.root failed, error Unknown error(108), CLDB: *********:8020 trying another CLDB
Some error on socket 940
2013-03-20 14:36:56,4768 ERROR Client fs/client/fileclient/cc/client.cc:226 Thread: -2 Failed to initialize client for cluster ****************:8020, error Unknown error(108)
Exception in component tHDFSList_1
java.io.IOException: Could not create FileClient
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:192)
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:202)
at com.mapr.fs.MapRFileSystem.listMapRStatus(MapRFileSystem.java:514)
at com.mapr.fs.MapRFileSystem.listStatus(MapRFileSystem.java:558)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:834)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:859)
at p_dtp_input.test_0_1.test.tHDFSList_1Process(test.java:443)
at p_dtp_input.test_0_1.test.tHDFSConnection_1Process(test.java:362)
at p_dtp_input.test_0_1.test.runJobInTOS(test.java:781)
at p_dtp_input.test_0_1.test.main(test.java:649) "

Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello,
This error means that your CLDB is not started because of a volume issue. Is your CLDB started on the cluster side? Are you able to browse the following web page: http://CLDB_HOSTNAME:8443 ?
Rémy
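
Before digging further, it can help to confirm from the client machine that the ports mentioned in this thread accept TCP connections at all. The following is a minimal sketch under stated assumptions: the class name CldbPortCheck is hypothetical, CLDB_HOSTNAME is a placeholder to replace, and the ports 8020 and 8443 are simply the ones that appear in the log and in the URL above, not verified defaults. A successful connection only shows that something is listening, not that the CLDB is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Talend or MapR.
public class CldbPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unknown hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: replace the placeholder with the real CLDB host name.
        String host = args.length > 0 ? args[0] : "CLDB_HOSTNAME";
        // Ports taken from this thread: 8020 (job log) and 8443 (web page).
        int[] ports = {8020, 8443};

        for (int port : ports) {
            boolean ok = canConnect(host, port, 3000);
            System.out.println(host + ":" + port + " -> "
                    + (ok ? "reachable" : "NOT reachable"));
        }
    }
}
```

If neither port is reachable, the cause may be on the network or cluster side (host name resolution, firewall, service not running) rather than in the Talend job.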
