One Star

Talend Performance vs Other Open Source ETL Tools

How does Talend compare to other open source ETL tools (performance, scalability, etc.)? Are there any performance test results or best-practice guidelines available?
Thanks!
5 REPLIES
One Star

Re: Talend Performance vs Other Open Source ETL Tools

There is a report comparing Talend, Pentaho and CloverETL on a TPC-H-like test. See http://www.cloveretl.org/_upload/clover-etl/Comparison%20CloverETL%20vs%20Talend%20and%20Pentaho.pdf
Employee

Re: Talend Performance vs Other Open Source ETL Tools

Once again, a great example of a benchmark comparing ETL tools without following best practices.
As far as I can see, the people who wrote this document do not know how to use Talend or Kettle.
Let me explain my opinion:
- The use case is based only on file data sources. To compare ETL tools properly, it is better to compare several kinds of data integration process: files, databases, and both kinds of data source mixed across many data integration processes.
- In Talend, depending on data volumes, you have several ways to avoid Java out-of-memory errors in your process. In this case, you can have your large "lookup" stored on the file system (the lookup is sorted and temporarily stored on your disk). This checkbox option is in the tMap and is well documented in our User Guide and Reference Guide.
- We don't have any Job screenshot or tMap definition, so we cannot be sure of the configuration the users chose.
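To illustrate the general idea behind the second point (keeping a large lookup on disk instead of in memory), here is a minimal sketch in Python. It is not how tMap implements the option; SQLite is used only as a stand-in for a sorted, disk-resident lookup, and the function name, table name and file path are all hypothetical.

```python
import os
import sqlite3
import tempfile


def join_with_disk_lookup(main_rows, lookup_rows, db_path):
    """Join streamed main rows against a lookup kept on disk, not in memory.

    main_rows:   iterable of (key, value) tuples, consumed one at a time
    lookup_rows: iterable of (key, label) tuples, written to disk first
    db_path:     hypothetical scratch file that holds the on-disk lookup
    """
    conn = sqlite3.connect(db_path)
    try:
        # The PRIMARY KEY gives a sorted on-disk index over the lookup keys.
        conn.execute("CREATE TABLE lookup (key TEXT PRIMARY KEY, label TEXT)")
        conn.executemany("INSERT INTO lookup VALUES (?, ?)", lookup_rows)
        conn.commit()
        for key, value in main_rows:
            row = conn.execute(
                "SELECT label FROM lookup WHERE key = ?", (key,)
            ).fetchone()
            # Inner-join semantics: main rows without a match are skipped.
            if row is not None:
                yield key, value, row[0]
    finally:
        conn.close()


def demo():
    """Run the join on a tiny made-up data set and return the joined rows."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lookup.db")
        main = [("a", 1), ("b", 2), ("z", 3)]
        lookup = [("a", "alpha"), ("b", "beta"), ("c", "gamma")]
        return list(join_with_disk_lookup(main, lookup, path))
```

The trade-off is the usual one: memory use stays roughly flat however large the lookup is, at the cost of disk I/O on every probe, so whether it helps depends on the data volumes involved.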
We can't evaluate a solution when the benchmark is not fully clear or relevant.
If somebody reading this post is able to reproduce the use case, feel free to read the documentation on performance and modify the Job design to apply Talend best practices and get better results.
Best regards
Seventeen Stars

Re: Talend Performance vs Other Open Source ETL Tools

Hi,
There is a report comparing Talend, Pentaho and CloverETL

Note the "further capabilities" of each tool, and that the comparison was made by the CloverETL "team"...
++
One Star

Re: Talend Performance vs Other Open Source ETL Tools

I have never used the CloverETL tools. They seem interesting, but Talend has a wider offering in data integration, quality and management than any other open source vendor.
An ETL comparison Pentaho Kettle vs Talend Open Studio: http://www.robertomarchetto.com/www/talend_studio_vs_kettle_pentao_pdi_comparison

Re: Talend Performance vs Other Open Source ETL Tools

In some comparisons I've seen between Talend and Informatica, we found a 10-90% decrease in run time with the Talend jobs we built versus the Informatica jobs that were migrated. This was on the same hardware, sourcing the same data. Most of the jobs were database to database, with a few files here and there. The Talend jobs were exported as autonomous jobs and executed by an external scheduler; the Informatica jobs were scheduled through Workflow Manager.
Overall, performance is a VERY difficult variable to measure because so many details can affect how fast your jobs run. In my opinion, it's nearly impossible to do a real-world "apples to apples" comparison between ETL tools. The only way to truly compare in an "apples to apples" way is to build very simple jobs, and in that case the difference can generally be attributed to Java vs C++ libraries. When you're talking about performance, details matter.
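For anyone who wants to attempt such a comparison anyway, here is a minimal timing sketch in Python, assuming each job can be wrapped in a zero-argument callable (for example, one that launches an exported job through subprocess.run). The function names are hypothetical, and this is not tied to any particular ETL tool.

```python
import statistics
import time


def benchmark(job, runs=5, warmup=1):
    """Time a zero-argument callable; return wall-clock seconds per timed run."""
    for _ in range(warmup):
        job()  # warm-up runs are executed but not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        job()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report the median alongside the spread, since single runs are noisy."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
        "runs": len(timings),
    }


def percent_decrease(baseline_seconds, candidate_seconds):
    """Run-time decrease of the candidate relative to the baseline, in percent."""
    return 100.0 * (baseline_seconds - candidate_seconds) / baseline_seconds
```

Run both jobs on the same hardware against the same data, repeat each several times, and compare medians rather than single runs. That at least removes some of the noise, even if it cannot make the job designs themselves equivalent.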