Empty and Repopulate HashTable

One Star

I'm developing a REST web service that builds JSON output from a few sources. As part of this process, I read two database tables into separate hashtables for use throughout the job, so that each request to the web service queries data in RAM rather than consuming database resources.
I wish to refresh these hashtables with 'live' information from the database every few hours. I'm currently doing this by creating a context variable that is timestamped when the service is first loaded and then checked on each REST request. If the check finds that the timestamp is older than the preconfigured number of hours, I want to rebuild the hashtables.
This process works fine. However, when the hashtables are repopulated from the database query, the new records are appended to the existing ones, so I end up with duplicate data. The 'append' box is *not* checked on the HashOutput used at the start of the job. However, the HashOutput used for the rebuild is a different component that is 'linked' to the former HashOutput, and the Talend documentation suggests that all linked outputs are forced to append.
I've tried using a HashInput to read the table prior to rebuilding it, with 'Clear cache after reading' enabled. This clears the hashtable fine, but because I then try to write data to a linked HashOutput, I receive a 'hashmap not initialized' error.
Has anyone used hashmaps in this context before, and do you have any pointers?

Community Manager

Re: Empty and Repopulate HashTable

After checking that your tHash data is out of date, read it with a tHashInput that has "Clear cache after reading" set. Then you can load fresh data.
One Star

Re: Empty and Repopulate HashTable

I've tried reading both hashtables with tHashInput components, with the 'Clear cache after reading' box checked. This successfully clears the hashtables, but when the sub-jobs run to repopulate them I receive the following:
Exception in component tHashOutput_6
I believe this could be because I'm trying to update a linked HashOutput rather than the 'master' HashOutput. However, I can't see a way to update the 'master', as this would create a loop in my job.
Community Manager

Re: Empty and Repopulate HashTable

Ah, stupid me, I didn't read the full post. Looking at your service, I am not entirely sure that this will work with the tHash components. You would have to re-initialise the tHash component, but that can only happen once in a job.
An alternative is to use a bit of Java. Use a tJavaFlex and create your HashMap/ArrayList/whatever Java type is suitable. Create it at the beginning using the same sort of service design as you have here, read it from your Java object using a tJavaFlex, and when you want to update it, manually reinitialise the object.
Obviously it requires a bit of Java, but would get you round the limitations of the tHash components.
I have written a tutorial on something completely unrelated, but it contains an example of how to store data in your own Java data structure and read it back using a tJavaFlex. You can see it here (http://rilhia.com/tutorials/talend-connect-example).
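
To give a rough idea of the kind of Java object I mean, here is a minimal sketch. It is not taken from the tutorial and it is not a Talend API: the class name `RefreshableCache`, the `loader` callback (standing in for whatever runs your database query) and the `maxAgeMillis` setting are all made up for illustration. The key point is that a stale map is replaced with a freshly loaded one rather than appended to, so no duplicates build up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical helper, not part of Talend: keeps lookup data in memory and
 * reloads it once it is older than a configured age.
 */
final class RefreshableCache<K, V> {
    private final Supplier<Map<K, V>> loader; // assumed: wraps the database read
    private final long maxAgeMillis;          // assumed: the "few hours" setting
    private Map<K, V> data = new HashMap<>();
    private long loadedAtMillis;
    private boolean loaded = false;

    RefreshableCache(Supplier<Map<K, V>> loader, long maxAgeMillis) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Looks up a key, using the current system time for the staleness check. */
    V get(K key) {
        return get(key, System.currentTimeMillis());
    }

    /** Same lookup with an explicit clock value, which makes the check easy to test. */
    synchronized V get(K key, long nowMillis) {
        if (!loaded || nowMillis - loadedAtMillis >= maxAgeMillis) {
            // Swap in a brand-new map instead of adding to the old one.
            data = new HashMap<>(loader.get());
            loadedAtMillis = nowMillis;
            loaded = true;
        }
        return data.get(key);
    }

    /** Number of entries currently held in memory. */
    synchronized int size() {
        return data.size();
    }
}
```

One way to use something like this, assuming it suits your job design, would be to put the class in a routine, hold one instance per database table somewhere that every request can reach, and call `get` from the tJavaFlex that serves each REST request. The timestamp check then lives inside the object instead of in a context variable.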

