TOS 5.2.1 Big Data Distribution Mapr THDFS

One Star


Hi everybody,
I've created a data integration job to connect to a MapR distribution.
It has two components, a tHDFSList and a tJava (to print the file name); you can see them in the attached picture.
I configured the tHDFSList and ran the job, but here is my problem:
I think there is a memory leak. The java.exe process grows to about 3 GB and then my computer crashes.
I tried leaving the various parameters empty, and the behavior is the same.
I switched to another distribution and had no such problem: I get an error like "server timeout", which is expected.
I don't understand what is wrong with the MapR distribution.
The component settings are shown in an attached screenshot.
Please help.
For information, I first had another problem with MapR, an UnsatisfiedLinkError from the MapR client, but I resolved it with a solution from another topic (I had to install a MapR client and add the DLL's directory to the PATH environment variable).
Regards,
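
Editor's note: since the fix mentioned above depends on the MapR client DLL being found through PATH, it can help to see which copy (or copies) the JVM is able to find. The following is only a minimal sketch under stated assumptions: the class name NativeLibLocator and the helper findOnPath are hypothetical, and the default file name MaprClient.dll is simply the spelling used in this thread. It scans the PATH environment variable and the java.library.path system property for the file. Several hits from different client versions could be one source of confusion, but this check alone does not diagnose the memory growth.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class NativeLibLocator {

    /**
     * Returns every file named fileName found directly inside one of the
     * directories listed in pathValue (entries separated by File.pathSeparator).
     */
    static List<File> findOnPath(String pathValue, String fileName) {
        List<File> hits = new ArrayList<File>();
        if (pathValue == null) {
            return hits; // variable or property not set
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as ";;"
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: the DLL name as written in this thread; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "MaprClient.dll";

        for (File f : findOnPath(System.getenv("PATH"), name)) {
            System.out.println("PATH              : " + f.getAbsolutePath());
        }
        for (File f : findOnPath(System.getProperty("java.library.path"), name)) {
            System.out.println("java.library.path : " + f.getAbsolutePath());
        }
    }
}
```

Run it with the same JVM and the same environment as the Talend job, so that the PATH it reports is the one the job actually sees.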
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi
What OS and JVM are you using? I'd be surprised if this were a general issue, as it would most likely have been reported before.
The documentation highlights the need to have MapR available on the PATH:
https://help.talend.com/search/all?query=tHiveClose&content-lang=en
Ciaran
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi,
Thanks Ciaran for your reply.
I'm using Windows 7 and Java 1.7.
I tried a Hive scenario (connection + input + close) and it works fine.
But when I try with tHDFS (or tPigLoad), I get the same problem: the allocated memory keeps growing until the crash.
Is it a problem with MaprClient.dll? I wonder, because I agree it is surprising that nobody else has reported the problem.
Thank you for your help.
Regards,
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hi,
I just tried changing the MaprClient.dll.
I get the memory leak with the MaprClient.dll from mapr-client-2.0.0.
I downloaded the MaprClient.dll from mapr-client-2.1.2 and now I get a new error:
Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.UnsatisfiedLinkError: com.mapr.fs.MapRClient.initSpoofedUser(Ljava/lang/String;ILjava/lang/String;I)I

Is this a new problem related to the version of MaprClient.dll?
Thank you

Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello,
The second error is a mismatch between your MapR client and the maprfs library. Talend doesn't support MapR 2.1.2 yet.
The first one comes from the DLL, as far as I know. Did you try to get help from the MapR forums?
Regards,
Rémy.
One Star

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello Rémy,
Thanks for your reply.
I understand the problem with version 2.1.2.
The memory leak with MaprClient 2.0.0 is a MaprClient problem.
I downloaded the 1.2.0 MaprClient (compatible with Talend),
and I no longer have the memory leak.
But it's still not OK...
Now I get this error:
"Some error on socket 940
2013-03-20 14:36:55,3887 ERROR Cidcache fs/client/fileclient/cc/cidcache.cc:1047 Thread: -2 Lookup of volume mapr.cluster.root failed, error Unknown error(108), CLDB: *********:8020 trying another CLDB
Some error on socket 940
2013-03-20 14:36:56,4768 ERROR Client fs/client/fileclient/cc/client.cc:226 Thread: -2 Failed to initialize client for cluster ****************:8020, error Unknown error(108)
Exception in component tHDFSList_1
java.io.IOException: Could not create FileClient
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:192)
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:202)
at com.mapr.fs.MapRFileSystem.listMapRStatus(MapRFileSystem.java:514)
at com.mapr.fs.MapRFileSystem.listStatus(MapRFileSystem.java:558)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:834)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:859)
at p_dtp_input.test_0_1.test.tHDFSList_1Process(test.java:443)
at p_dtp_input.test_0_1.test.tHDFSConnection_1Process(test.java:362)
at p_dtp_input.test_0_1.test.runJobInTOS(test.java:781)
at p_dtp_input.test_0_1.test.main(test.java:649) "

Regards,
Employee

Re: TOS 5.2.1 Big Data Distribution Mapr THDFS

Hello,
This error means that your CLDB is not started because of a volume issue. Is your CLDB started on the cluster side? Are you able to browse the following web page: http://CLDB_HOSTNAME:8443 ?
Rémy
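
Editor's note: before opening the web page, a plain TCP connection test from the client machine can show whether the CLDB host and port are reachable at all. This is only a sketch, not a MapR or Talend API: the class name PortCheck and the helper canConnect are hypothetical, the host must be supplied by you, and the ports tried by default are just the two that appear in this thread (8020 from the log, 8443 from the URL above). MapR's documented default CLDB port is 7222, so it may also be worth confirming which port the cluster really uses.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs milliseconds. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: java PortCheck <cldb-host> [port ...]");
            return;
        }
        String host = args[0];
        // Assumption: default to the ports mentioned in this thread when none are given.
        int[] ports = {8020, 8443};
        if (args.length > 1) {
            ports = new int[args.length - 1];
            for (int i = 1; i < args.length; i++) {
                ports[i - 1] = Integer.parseInt(args[i]);
            }
        }
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable = " + canConnect(host, port, 3000));
        }
    }
}
```

A successful connection only shows that something is listening on that port; it does not prove that the CLDB itself is healthy. A failure, however, points to a network, firewall, or configuration problem rather than to the Talend job.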
