Seven Stars

Data Integration Talend studio ElasticSearch ?

Hello everyone

 

I am completely new to Talend and trying to learn the product. This is my first post here.

 

I have installed Data Integration Talend Open Studio version 6.4.1 on Windows 10 64-bit.

 

I need to get (i.e. read) data from ElasticSearch.

 

The Talend DI documentation PDF files that I downloaded do not mention ElasticSearch, and they contain no reference to the tElasticSearch component that I saw mentioned when I googled for a solution.

 

Can I use Talend DI (the "free" version) to read data from ElasticSearch?

If so, could you give me some pointers to documentation or to other posts?

 

Thank you in advance

 

YuriB

Melbourne

Australia

 

 

10 REPLIES
Six Stars

Re: Data Integration Talend studio ElasticSearch ?

Hey YuriB, hope you're well.

 

I found some information on Google saying that this component is only available in Talend Open Studio for Big Data. Have you tried that edition?

 

Thanks

Douglas

Employee

Re: Data Integration Talend studio ElasticSearch ?

Hi,

 

We have the tElasticSearchInput and tElasticSearchOutput components in Big Data Spark Batch and Spark Streaming. You need a paid Big Data or Real-Time Big Data subscription to get these components. We do not have ElasticSearch components in the traditional DI product.

 

 

Seven Stars

Re: Data Integration Talend studio ElasticSearch ?

many thanks for your replies!

 

I will investigate Talend Big Data and the costs.

 

As an alternative, would the following work? I have only one requirement, which is to read data from ElasticSearch, so buying Big Data may be too expensive for that alone.

 

Possible Solution using "free" Talend DI:

(1) Write a set of Java classes and methods that read data from ElasticSearch and expose that data to the caller (which may be any Java program).

(2) Call those custom Java classes and methods from a "free" Talend DI job, so that the data they read is passed into the job for further processing.

 

Please let me know if this is a realistic solution.

If so, would this be complex/time consuming to write?

Would this perform well enough for high volume (5-10 million rows per hour)?

 

many thanks again!

 

Employee

Re: Data Integration Talend studio ElasticSearch ?

Hi,

Yes, you could write this code yourself: connect to ElasticSearch through its Java API and wrap that API in a Code Routine in Talend Studio.

However, I cannot say how complex it will be or how much time and effort it will take to write. Remember that you may also need to handle Avro schemas and similar details.
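
To give a rough idea of what such a wrapper could look like, here is a minimal sketch. It is not a Talend-supplied routine. It assumes ElasticSearch is reachable over its HTTP REST interface and uses only the JDK, rather than the ElasticSearch Java client. The class name EsSearchRoutine, its method names, and the host, port, and index values in the usage note are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical routine. In Talend Studio it would be a public class in the
// "routines" package; both are left out so the class compiles on its own.
class EsSearchRoutine {

    // Builds the _search URL for one index.
    public static String buildSearchUrl(String host, int port, String index) {
        return "http://" + host + ":" + port + "/" + index + "/_search";
    }

    // Builds a match_all query body that asks for at most `size` hits.
    public static String buildMatchAllBody(int size) {
        return "{\"size\":" + size + ",\"query\":{\"match_all\":{}}}";
    }

    // Sends the query with HTTP POST and returns the raw JSON response.
    public static String search(String url, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
            int status = conn.getResponseCode();
            if (status != 200) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A job could then call EsSearchRoutine.search(EsSearchRoutine.buildSearchUrl("localhost", 9200, "myindex"), EsSearchRoutine.buildMatchAllBody(100)) and parse the returned JSON in a later step. Parsing, authentication, and paging are left out of this sketch.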

 

If you need to process 5 to 10 million rows per hour, then your best approach is to do this in Spark Batch or Spark Streaming. That way you can use the capacity of a Hadoop/Spark cluster to meet this requirement.
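
To put the 5 to 10 million rows per hour figure into per-second terms, the arithmetic can be sketched as below. The page size of 1000 rows per request used in the note is only an assumption for illustration, and the class and method names are hypothetical.

```java
// Hypothetical helper for back-of-the-envelope throughput numbers.
class EsThroughput {

    // Average rows per second needed to move `rowsPerHour` rows in one hour.
    public static double rowsPerSecond(long rowsPerHour) {
        return rowsPerHour / 3600.0;
    }

    // Number of page requests needed per hour, rounded up,
    // when each request returns at most `pageSize` rows.
    public static long requestsPerHour(long rowsPerHour, int pageSize) {
        return (rowsPerHour + pageSize - 1) / pageSize;
    }
}
```

For 10 million rows per hour this works out to roughly 2,778 rows per second. With an assumed 1000 rows per request, that is 10,000 requests per hour, or about 2.8 requests per second. Whether a single DI job can sustain that depends on the cluster, the document size, and the network, so it would need to be measured.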

 

Employee

Re: Data Integration Talend studio ElasticSearch ?

Hi,

 

It is available in the paid versions only, not in the open-source ones.

Seven Stars

Re: Data Integration Talend studio ElasticSearch ?

many thanks!

Are you aware of any blogs, videos, or white papers that show how to write my own Java code and integrate it into a Talend DI job?
Employee

Re: Data Integration Talend studio ElasticSearch ?

There are many examples. Look for examples of the tLibraryLoad, tJava, tJavaRow, and tJavaFlex components. You can also use a code routine.
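
As a small illustration of the routine plus tJavaRow combination, the sketch below shows a routine-style class with static methods and, in the trailing comment, how a tJavaRow component might call it. The class EsJsonText, its methods, and the column names customerId and queryBody are hypothetical. Only input_row and output_row are names that Talend generates in tJavaRow.

```java
// Hypothetical routine. In Talend Studio it would be a public class in the
// "routines" package, created under Code > Routines in the Repository.
class EsJsonText {

    // Escapes a value so it can be embedded inside a JSON string literal.
    public static String escape(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become 4-digit hex escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a term query body for one field/value pair.
    public static String termQuery(String field, String value) {
        return "{\"query\":{\"term\":{\"" + escape(field) + "\":\""
                + escape(value) + "\"}}}";
    }
}

// Inside a tJavaRow component the routine is called like any static method.
// "customerId" and "queryBody" are hypothetical schema columns:
//
//   output_row.queryBody = EsJsonText.termQuery("customer_id", input_row.customerId);
```

Keeping the logic in a routine rather than inline in tJavaRow means it can be reused across jobs and tested outside Talend Studio.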

Seven Stars

Re: Data Integration Talend studio ElasticSearch ?

thanks !!
Seven Stars

Re: Data Integration Talend studio ElasticSearch ?

What about using Talend Open Studio for ESB, which is also "free" (?), and its cMessagingEndpoint component, which has an elasticsearch option available (through Camel?)?

Would this do the job, and is this ESB software "free"?

 

thanks

 

Seven Stars

Re: Data Integration Talend studio ElasticSearch ?

In "free" Talend DI, the Palette has a tRESTClient component under ESB, REST, and it appears to work.

 

I don't know whether this solution will perform, but as a test I am getting data returned from ElasticSearch via tRESTClient into Talend DI and sending it all to tLogRow.
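
For larger result sets over REST, one option is ElasticSearch's scroll API, which returns the hits page by page. The sketch below only builds the URLs and request bodies that a REST client such as tRESTClient would send. It assumes an ElasticSearch version that accepts the scroll id in a JSON body, and the class name, method names, base URL, and index name are hypothetical.

```java
// Hypothetical helper that builds scroll requests; it does not send them.
class EsScrollRequests {

    // URL of the first request; keepAlive is a duration such as "1m".
    public static String firstPageUrl(String baseUrl, String index, String keepAlive) {
        return baseUrl + "/" + index + "/_search?scroll=" + keepAlive;
    }

    // Body of the first request: match everything, `pageSize` hits per page.
    public static String firstPageBody(int pageSize) {
        return "{\"size\":" + pageSize + ",\"query\":{\"match_all\":{}}}";
    }

    // URL of every follow-up request.
    public static String nextPageUrl(String baseUrl) {
        return baseUrl + "/_search/scroll";
    }

    // Body of a follow-up request; scrollId is the _scroll_id value
    // returned by the previous response.
    public static String nextPageBody(String keepAlive, String scrollId) {
        return "{\"scroll\":\"" + keepAlive + "\",\"scroll_id\":\"" + scrollId + "\"}";
    }
}
```

The job would POST the first request, read _scroll_id from the response, and keep POSTing follow-up requests until a page comes back with no hits.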