We are in the process of migrating our DataStage 8.7 jobs to Talend.
Are there any components in Talend similar to DataStage 'datasets' and 'hashfiles'?
You really need to give us a little more info on what "datasets" and "hashfiles" are and what they do in DataStage. Chances are Talend will be able to handle the functionality they provide out of the box. If not, a major advantage of Talend is that you can write your own functionality (or include that of others) using Java.
Datasets are an internal file format in DataStage. They can be used as intermediate files for lookups and other operations, and to manage data within a job. Since dataset files are stored in a binary format, reading from and writing to them is comparatively fast.
For example, when you need to do a lookup against a large table, you can write the data to a dataset file once and reuse it in other jobs, rather than selecting the data from the DB again.
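To make the pattern concrete, here is a rough plain-Java sketch of "extract once, stage in a binary file, reuse later". It is not a DataStage or Talend API, and it does not reproduce the real dataset format; the class name `BinaryStagingSketch`, the two-column id/name layout, and the file name are all made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public class BinaryStagingSketch {

    // Write id/name rows to a compact binary file: a row count, then each row.
    public static void write(Path file, Map<Integer, String> rows) {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeInt(rows.size());
            for (Map.Entry<Integer, String> row : rows.entrySet()) {
                out.writeInt(row.getKey());
                out.writeUTF(row.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the rows back in the order they were written.
    public static Map<Integer, String> read(Path file) {
        Map<Integer, String> rows = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                int id = in.readInt();
                String name = in.readUTF();
                rows.put(id, name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data standing in for rows selected from the DB.
        Map<Integer, String> customers = new LinkedHashMap<>();
        customers.put(1001, "Alice");
        customers.put(1002, "Bob");

        Path staged = Files.createTempFile("customers", ".bin");
        write(staged, customers);              // first job: stage the extract
        System.out.println(read(staged));      // later job: reuse without the DB
        Files.delete(staged);
    }
}
```

The point is only that a later job reads the staged file instead of querying the database again; a real job would use whatever file components and schema the project already has.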
You can use the tHashInput and tHashOutput components for this sort of thing in Talend. They hold the data in memory, so if you want to store gigabytes of data you will need that much memory available on your machine. In return, lookups are very quick. There are other ways to improve performance by removing the latency of DB lookups, but the tHash components are the first that come to mind. You also have to consider that with Talend you have every Java API available to you, so finding alternatives is easy if necessary.
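As a rough illustration of the idea (load the reference data into memory once, then look rows up by key instead of hitting the DB per row), here is a minimal plain-Java sketch. It is not how tHashOutput/tHashInput are implemented; the class name `HashLookupSketch`, the id/name columns, and the "UNKNOWN" default are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashLookupSketch {

    // Load the reference rows once; column 0 is the key, column 1 the value.
    public static Map<Integer, String> buildLookup(List<String[]> referenceRows) {
        Map<Integer, String> lookup = new HashMap<>();
        for (String[] row : referenceRows) {
            lookup.put(Integer.parseInt(row[0]), row[1]);
        }
        return lookup;
    }

    // Resolve a key against the in-memory map; no database round trip.
    public static String enrich(Map<Integer, String> lookup, int customerId) {
        return lookup.getOrDefault(customerId, "UNKNOWN");
    }

    public static void main(String[] args) {
        // Hypothetical reference data standing in for the large lookup table.
        List<String[]> reference = Arrays.asList(
                new String[] {"1001", "Alice"},
                new String[] {"1002", "Bob"});
        Map<Integer, String> lookup = buildLookup(reference);

        // Main flow: each incoming id is matched in memory.
        for (int id : new int[] {1001, 1002, 1003}) {
            System.out.println(id + " -> " + enrich(lookup, id));
        }
    }
}
```

The trade-off is the one mentioned above: the whole map lives on the heap, so the lookup table has to fit in the memory available to the job.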