In Analysis, difference between Java and SQL Engine

Hi,
Can someone tell me what the difference is between the Java engine and the SQL engine in an analysis?
Thanks in advance.

Re: In Analysis, difference between Java and SQL Engine

The short answer is that the Java engine executes where the Studio is installed, while the SQL engine executes on the database. There will almost certainly be performance differences between the two methods.

If you supply a more detailed question I can supply a more detailed answer.

Re: In Analysis, difference between Java and SQL Engine

Hi Mike,
thank you for your reply.
I ran a table analysis on a very big table with the Java engine and got a "Java heap space" error, so I tested with the SQL engine.
The same error occurred, so I wondered what the differences between them are.
Sophie

Re: In Analysis, difference between Java and SQL Engine

As West said, the difference is that the Java engine retrieves all rows into Java memory, whereas the SQL engine executes queries on the database server and retrieves only the results or aggregated rows.

In a column analysis, this usually results in big performance differences, with the SQL engine being faster.
In a table analysis, not all rows are retrieved, only the distinct rows. If you still have a lot of distinct rows, you may run into such memory issues.
If the list of analyzed columns includes a primary key, every row is distinct, so the Java engine and the SQL engine retrieve the same number of rows. There is no difference in that case.
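
The difference described above can be illustrated with a minimal Python sketch. This is not Talend code: it uses the standard sqlite3 module, and the `customers` table, its columns, and both helper functions are hypothetical names made up for the illustration. The first helper fetches every row and counts in client memory (the Java-engine style), the second lets the database group the rows and transfers only the distinct combinations (the SQL-engine style).

```python
import sqlite3
from collections import Counter


def make_demo_db():
    """Create an in-memory table with many duplicate (city, status) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT, status TEXT)"
    )
    rows = [
        (i, ["Paris", "Lyon"][i % 2], ["active", "inactive", "active"][i % 3])
        for i in range(1, 601)
    ]
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    return conn


def client_side_counts(conn, columns):
    """Java-engine style: pull every row, then count in client memory."""
    cols = ", ".join(columns)  # column names are trusted here (demo only)
    fetched = conn.execute(f"SELECT {cols} FROM customers").fetchall()
    return len(fetched), Counter(fetched)


def server_side_counts(conn, columns):
    """SQL-engine style: the database groups, only distinct rows come back."""
    cols = ", ".join(columns)
    result = conn.execute(
        f"SELECT {cols}, COUNT(*) FROM customers GROUP BY {cols}"
    ).fetchall()
    return len(result), {tuple(r[:-1]): r[-1] for r in result}


if __name__ == "__main__":
    conn = make_demo_db()
    print(client_side_counts(conn, ["city", "status"])[0])  # rows transferred
    print(server_side_counts(conn, ["city", "status"])[0])  # rows transferred
    print(server_side_counts(conn, ["id", "city"])[0])      # primary key included
```

With `city` and `status` only, the server-side version transfers a handful of grouped rows while the client-side version transfers the whole table. Once `id` is added to the column list, every row is its own group and both approaches transfer the same number of rows.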

The other difference between the Java and SQL engines is that only the Java engine supports the use of regular expressions.

To avoid crashing the Studio, some memory management options are available in the Preferences page: https://help.talend.com/search/all?query=Defining+the+maximum+memory+size+threshold

When you have a lot of data, it is recommended to leave the "store data" checkbox unchecked, so that the data is not all stored in the analysis file.

This kind of analysis is usually recommended for data that contains a lot of duplicates (the set of distinct rows should not be too large). That usually means analyzing a subset of the columns of a table. It is not an analysis for finding duplicates.
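
One way to check whether a column subset is a reasonable candidate is to compare the total row count with the distinct row count before running the analysis. The sketch below is only an illustration with Python's standard sqlite3 module; `distinct_row_stats` and the `orders` table are hypothetical names, not part of the Studio, and the same two queries would have to be adapted to your own database.

```python
import sqlite3


def distinct_row_stats(conn, table, columns):
    """Return (total_rows, distinct_rows) for the given column subset.

    Table and column names are interpolated directly into the SQL,
    so only pass trusted identifiers (this is a demo helper).
    """
    cols = ", ".join(columns)
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    distinct = conn.execute(
        f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table}) AS d"
    ).fetchone()[0]
    return total, distinct


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, channel TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (country, channel) VALUES (?, ?)",
        [("FR", "web")] * 500 + [("FR", "shop")] * 300 + [("DE", "web")] * 200,
    )
    # Few distinct rows: a small result set for the analysis to hold.
    print(distinct_row_stats(conn, "orders", ["country", "channel"]))
    # Primary key included: distinct rows equal total rows.
    print(distinct_row_stats(conn, "orders", ["id", "country"]))
```

If the distinct count is close to the total count, the analysis will have to hold nearly the whole table, whichever engine is selected.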

For analyzing duplicates, a new type of analysis is provided in version 5.4 of the Studio.
In a 5.3 Studio, you will have to use the tMatchGroup component and its graphical wizard instead.

Re: In Analysis, difference between Java and SQL Engine

Thanks Sebastiao
very clear.
Sophie
