It seems that Talend is not capable of performing a general self join.
In my opinion this is a standard ETL procedure, and I'm wondering why it isn't supported.
I'm aware that there is a workaround: selecting the data twice. Performance-wise, this is not optimal.
Other ETL tools support a simple self join.
Is there another workaround, or is a change planned for this?
If your data is coming out of a database, why not do the self join in the database? If there's a good reason not to, and you really don't want to run the query twice, you can write your data into a hash with tHashOutput and use two tHashInput components to read from that same in-memory source.
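To illustrate the idea behind the hash approach, here is a minimal sketch in plain Python (not Talend-generated code). It assumes a hypothetical employee table in which `manager_id` refers to the `id` of another row in the same table; the function name `self_join` and the sample data are made up for the example. The rows are read once, indexed in memory, and then joined against that same cached copy, so the source is never queried a second time.

```python
from collections import defaultdict


def self_join(rows, left_key, right_key):
    """Pair each row with every row whose right_key equals its left_key."""
    # Build the in-memory lookup once; this plays the role of the hash
    # that both tHashInput components would read from.
    index = defaultdict(list)
    for row in rows:
        index[row[right_key]].append(row)

    # Probe the cached rows against the index instead of re-reading the source.
    return [
        (left, right)
        for left in rows
        for right in index.get(left[left_key], [])
    ]


# Hypothetical sample data: each employee points at a manager in the same table.
employees = [
    {"id": 1, "name": "Ada", "manager_id": None},
    {"id": 2, "name": "Bob", "manager_id": 1},
    {"id": 3, "name": "Cy", "manager_id": 1},
]

pairs = self_join(employees, "manager_id", "id")
result = [(emp["name"], mgr["name"]) for emp, mgr in pairs]
print(result)
```

This behaves like an inner join, so rows without a matching manager (Ada here) drop out of the result. In a Talend job the same roles would be played by one tHashOutput, two tHashInput components, and a tMap doing the lookup.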
True. Using a hash is a valid workaround.
The reason for not running the query on the DB is our internal project guidelines, which demand that Talend be used for all transformations.
Still, I don't understand why we need a workaround for this "simple" task.