tHBaseInput - Incremental data read from Hbase

One Star

Problem Statement 1: We are working on a project where we need to read only the incremental data from an HBase table (Cloudera 4.6). A background process runs 24x7 and loads data into the HBase table.
Approach: We are using a tHBaseInput component to read the data from the HBase table, but we cannot find any filter where we can provide a timestamp value so that it reads only the data loaded since the last run.
I am not sure whether I am missing something on the component or whether this is a limitation of Talend's tHBaseInput. I am using Talend Big Data 5.4.2.
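To make the "only data loaded after the last run" requirement concrete, here is a minimal sketch of the bookkeeping in plain Java. The class and method names (`IncrementalWindow`, `next`, `contains`) are hypothetical and not part of Talend or HBase. In the HBase Java client, the two bounds would be passed to `Scan.setTimeRange(minStamp, maxStamp)`, which selects cells whose timestamp falls in [minStamp, maxStamp). Note that this works on the cell's write timestamp, not on a column that happens to hold a time value, and whether tHBaseInput exposes it is not established in this thread.

```java
// Hypothetical helper: tracks the half-open time window [minStamp, maxStamp)
// that one incremental run should read.
final class IncrementalWindow {
    final long minStamp; // inclusive lower bound, in epoch milliseconds
    final long maxStamp; // exclusive upper bound, in epoch milliseconds

    IncrementalWindow(long minStamp, long maxStamp) {
        if (minStamp > maxStamp) {
            throw new IllegalArgumentException("minStamp must not exceed maxStamp");
        }
        this.minStamp = minStamp;
        this.maxStamp = maxStamp;
    }

    // The next window starts exactly where the previous run stopped,
    // so no cell is read twice and none is skipped.
    static IncrementalWindow next(long lastRunMaxStamp, long nowMillis) {
        return new IncrementalWindow(lastRunMaxStamp, nowMillis);
    }

    // Mirrors the [min, max) semantics of a time-range scan.
    boolean contains(long cellTimestamp) {
        return cellTimestamp >= minStamp && cellTimestamp < maxStamp;
    }

    public static void main(String[] args) {
        long lastRunMaxStamp = 0L; // assumption: 0 on the very first run, i.e. read everything
        IncrementalWindow w = next(lastRunMaxStamp, System.currentTimeMillis());
        System.out.println("scan range: [" + w.minStamp + ", " + w.maxStamp + ")");
        // After a successful run, persist w.maxStamp and use it as lastRunMaxStamp next time.
    }
}
```

Fixing the upper bound at the start of the run (instead of leaving it open) keeps rows written by the background loader during the run from being split between two runs.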
Problem Statement 2: By default, tHBaseInput uses a scan to fetch the data from the HBase table, and the cache size of this scan object is set to 1, which means the map task calls back to the region server for every record processed. As a result, tHBaseInput takes a long time to read from the HBase table (30 minutes for 1 lakh, i.e. 100,000, records). We tried doing it in Java by creating a new Scan object and setting the cache size to 1000, and we were able to read 1 lakh records in just 2 minutes.
Is there any property in tHBaseInput that lets us increase the default cache size for the scan?
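The effect of the cache size can be estimated with simple arithmetic: the number of scanner fetches is roughly the row count divided by the caching value, rounded up. The sketch below is plain Java with a hypothetical helper name (`ScanRoundTrips.fetches`); it ignores region boundaries and the final empty fetch, so treat the figures as an approximation. In the HBase Java client the value is set with `Scan.setCaching(int)`, and the client-side property is `hbase.client.scanner.caching`; whether tHBaseInput passes that property through has not been verified here.

```java
// Hypothetical helper: estimates how many scanner fetches are needed
// to stream a given number of rows at a given caching value.
final class ScanRoundTrips {
    static long fetches(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 100_000L; // 1 lakh records, as in the post
        System.out.println("caching=1    -> " + fetches(rows, 1) + " fetches");
        System.out.println("caching=1000 -> " + fetches(rows, 1000) + " fetches");
    }
}
```

With 100,000 rows this gives 100,000 fetches at a caching value of 1 and 100 fetches at 1000, which is consistent in direction with the 30 minutes versus 2 minutes reported above. Larger caching values also use more client memory per fetch, so the value is a trade-off rather than "bigger is always better".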
Community Manager

Re: tHBaseInput - Incremental data read from Hbase

1) Is there a timestamp field in your table?
2) Try adding the related property in the Advanced settings panel of the tHBaseInput component.
Best regards
Talend | Data Agility for Modern Business
One Star

Re: tHBaseInput - Incremental data read from Hbase

hemant056, have you found a solution for this?
I am facing the same problem. My input data is in the form of epoch time. Urgent help needed.
One Star

Re: tHBaseInput - Incremental data read from Hbase

Hi Shong,
Can you please help me with this? I have a timestamp field in my input database, which is an epoch time stored as byte[].
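For an epoch value stored as byte[], one possible way to compare it against the last run time is to decode it to a long first. The sketch below assumes the value was written as an 8-byte big-endian long, which is the layout HBase's `Bytes.toBytes(long)` produces; if the column actually holds the digits as a string, this decoding does not apply. The class name `EpochBytes` and its methods are hypothetical, and the unit (seconds or milliseconds) has to match whatever the writer used.

```java
// Hypothetical helper: converts between an 8-byte big-endian epoch value and a long.
final class EpochBytes {
    // Decodes an 8-byte big-endian value (java.nio.ByteBuffer is big-endian by default).
    static long toLong(byte[] raw) {
        if (raw == null || raw.length != Long.BYTES) {
            throw new IllegalArgumentException("expected exactly 8 bytes");
        }
        return java.nio.ByteBuffer.wrap(raw).getLong();
    }

    // Encodes a long in the same layout, e.g. to build a comparison value.
    static byte[] fromLong(long value) {
        return java.nio.ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // True when the stored epoch value is strictly later than the last run.
    static boolean isNewerThan(byte[] rawEpoch, long lastRunEpoch) {
        return toLong(rawEpoch) > lastRunEpoch;
    }

    public static void main(String[] args) {
        long lastRunEpoch = 1_400_000_000_000L;               // illustrative value only
        byte[] stored = fromLong(lastRunEpoch + 1);            // stands in for a cell value
        System.out.println(isNewerThan(stored, lastRunEpoch)); // row belongs to the next increment
    }
}
```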
Four Stars

Re: tHBaseInput - Incremental data read from Hbase

I have a requirement similar to the one above. Can I read the data from an HBase table based on regions?
