One Star

How does tPostgresqlOutput Update work?

I am working on a Data Acquisition System and I only want to update the records that have changed. I noticed one of the settings for tPostgresqlOuput is to Update/Insert.
How does it compare the incoming records with the record at the source?
How does it know what to update? What's the where clause in the update statement?
Thanks for your response ahead of time.
6 REPLIES
Employee

Re: How does tPostgresqlOutput Update work?

How does it compare the incoming records with the record at the source?

As you didn't say if you're using a Java or a Perl project, I'll answer for the 2 generated languages. First in Perl.
There is not "update or insert" and "insert or update" data action for PostgreSQL because of the atomic transaction mode of this RDBMS. We only have "insert or update".
1. load a cache with all existing keys in the destination table
2. for each incoming line of the data flow, checks if the key is in the cache. Perform an insert if the key is not in the cache, an update is the key is already in the cache.
Now in Java, the algorithm is slightly different. For "insert or update", foreach input line:
1. perform a "select count(*) from table where id = <line_id>"
2. if count > 0, perform an update, else perform an insert
For "update or insert", foreach input line:
1. perform an update
2. if no line updated, perform an insert
Conclusion: Perl algorithm should be faster but memory consuming and dangerous in "concurrent mode". Java algorithm is slower (one more query per data flow line) but takes no memory and should be safer in "concurrent mode".
How does it know what to update? What's the where clause in the update statement?

This is the same for Perl and Java now.
By default, if a column of the schema is marked as "key", it is used in the where clause, else it's a column to update. The behaviour can be changed thanks to 2063 (in Perl, only for tMysqlOutput, add a feature request to have it in PostgreSQL) and 2759 (in Java, all existing database connectors). With the "field options", you can select with precision which columns are in "set" list (field is "updatable) and "where" clause (field is in the "update key"). As you can read in the bugtracker, the "field option" feature is available since 2.3.0 only (very recent feature).
One Star

Re: How does tPostgresqlOutput Update work?

Perfect. Thanks.
One Star

Re: How does tPostgresqlOutput Update work?

So is incremental load in same as update or insert talend
One Star

Re: How does tPostgresqlOutput Update work?

So is incremental load same as update or insert in talend
pls ans.
Seventeen Stars

Re: How does tPostgresqlOutput Update work?

In this post are a lot of wrong statements!
1. PostgreSQL does not perform atomic statements per default.
2. It is probably a very bad idea to load all keys into the memory to check if insert is possible! And I am pretty shure, perl wouldn't do this per default. By the way Perl is gone in Talend.
An incremental load is a design pattern and depends on the way how to extract the source data. It is not a matter of insert or update table action.
One Star

Re: How does tPostgresqlOutput Update work?

Hi ,
Is there a way I can avoid any update if there is nothing to update for a particular row?
Thanks