TPC-H benchmark data for Talend big data 5.6 or 5.5.1

Hello,
Is it possible to get TPC-H benchmark results for Talend Big Data 5.6 or 5.5.1? All I have been able to find is that 5.6's TPC-H benchmark performance improved by 24% over the previous release (presumably 5.5.1). Actual figures from the benchmark runs would help my evaluation.
Thanks,
Gautam.
Employee

Re: TPC-H benchmark data for Talend big data 5.6 or 5.5.1

Hello! We don't typically publish absolute query times, because they're easily misused or misleading.
Especially for a benchmark suite like TPC-H, performance can vary dramatically depending on the configuration and size of the cluster and on the characteristics of the input data. This is all the more true because the TPC-H queries are not typical big data queries (they are oriented towards normalized, structured, relational-style data). For our purposes, we treat each incoming table as a flat, unindexed text file, which is actually quite good for validating the performance and correctness of the generated jobs.
It wouldn't be accurate, however, to take the query performance on our cluster and compare it with (a) a technology that isn't using the same "flat, unindexed text file" assumption or (b) any big data tech running on a differently configured cluster.  The fairest way to use this data is to measure relative performance in a controlled environment.
Our methodology is to launch jobs generated from the last two versions of the studio (5.5.1 and 5.6) on the exact same cluster with identical input data. Each job is run several times and the outliers are discarded. The jobs are migrated without any design changes, since the goal is to measure improvements due to code generation changes.
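As a rough illustration of that kind of relative comparison, here is a minimal Python sketch. It is not the actual benchmarking tooling: the function names, the rule of dropping the single fastest and slowest run as "outliers", and the timing values are all assumptions made up for the example.

```python
from statistics import mean


def trimmed_mean(times, trim=1):
    """Average the run times after dropping the `trim` fastest and slowest runs.

    Assumption: "discarding outliers" is modeled here as a simple symmetric trim.
    """
    if len(times) <= 2 * trim:
        raise ValueError("need more runs than the number of discarded outliers")
    kept = sorted(times)[trim:len(times) - trim]
    return mean(kept)


def relative_improvement(old_times, new_times, trim=1):
    """Fractional reduction in run time of the new version relative to the old."""
    old = trimmed_mean(old_times, trim)
    new = trimmed_mean(new_times, trim)
    return (old - new) / old


# Hypothetical wall-clock times in seconds for one query on the same cluster
# and the same input data. These are NOT real benchmark results.
runs_old_version = [95.0, 100.0, 100.0, 100.0, 140.0]
runs_new_version = [60.0, 80.0, 80.0, 80.0, 85.0]

print(f"relative improvement: {relative_improvement(runs_old_version, runs_new_version):.1%}")
```

The result is only a ratio between two versions measured under identical conditions, which is why it says nothing about absolute query times on a different cluster.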
I hope that you find this useful, even if it's not the specific query data that you're asking for. We're quite proud of the continual performance improvements in our code generation, and there are some good ideas coming up. If you have any specific questions about our performance benchmarking against other tech, please don't hesitate to ask!
Ryan

Re: TPC-H benchmark data for Talend big data 5.6 or 5.5.1

Hello,
I appreciate your feedback. Unfortunately, knowing the relative performance gains without the underlying hardware and software configuration does not help much. Would it be possible to provide benchmarking results against other tech, together with the hardware and software configuration used?
Thanks,
Gautam.