update on tables...

One Star

update on tables...

Looking at the Talend operators I am not able to see a solution for updating a table. Suppose at the end of a job
I have a tDBOutput object. But, I do not want to insert rows, I want to update existing rows, or even better I want to let the operator to decide if a row should be inserted or updated, depending of a match between the primary key of the incoming row and primary keys on the target table...This is quite an important functionality in an ETL tool, since in datawarehousing applications nearly always you have to update existing data in a table...
What I would like to be able to do with Talend is to distinguish which incoming rows are new and which ones are already in the table (I mean the primary key is the same).
Employee

Re: update on tables...

raffaele, you're right, inserting rows is a native feature with Talend Open Studio 1.0.x but you have to play with the more "generic" DBSQLRow to manage table updates. What you ask is possible in release 1.0.0 with a tMap having 2 outputs: "create" (linked to a tDBOuput) and an "update" (linked to a tDBSQLRow).

Can you please create a feature request in the bugtracker (also our wish list manager)?

In your opinion, should the tDBOutput do all the work (deciding whether inserting or updating) or should we have a tDBInsert and a tDBUpdate?
Employee

Re: update on tables...

In my opinion, we can add a combobox called ACTION on the tDBOutput component.

ACTION properties will be "INSERT", "UPDATE", "INSERT OR UPDATE" or "UPDATE OR INSERT".
"INSERT OR UPDATE" means if the insertion failed, update operation will be poerform.

What do you think about this behavior ?
One Star

Re: update on tables...

ccarbone wrote:
In my opinion, we can add a combobox called ACTION on the tDBOutput component.

ACTION properties will be "INSERT", "UPDATE", "INSERT OR UPDATE" or "UPDATE OR INSERT".
"INSERT OR UPDATE" means if the insertion failed, update operation will be poerform.

What do you think about this behavior ?

Yes, this is exactly what should be done. This is the way other well known ETL tools like DataStage or Oracle Warehouse Builder work.
One Star

Re: update on tables...

Of course, you should also add the possibility to choose on which basis the tool should decide whether to insert or update rows. The primary key is the natural choice, but you could also choose other fields as logical keys. For example if you have a table whose primary key is simply a db sequence, this cannot be useful to choose whether to insert or update, you need a "business" key. In this case you should have a constraint, like a unique index, on a field, for example the name of a product in a product table, and the code should work on this field. I think it is not too difficult to recover the list of unique index constraints of a table and to build a combo to show in the tDBOutput...
One Star

Re: update on tables...

plegall wrote:
raffaele, you're right, inserting rows is a native feature with Talend Open Studio 1.0.x but you have to play with the more "generic" DBSQLRow to manage table updates. What you ask is possible in release 1.0.0 with a tMap having 2 outputs: "create" (linked to a tDBOuput) and an "update" (linked to a tDBSQLRow).

Can you please create a feature request in the bugtracker (also our wish list manager)?

In your opinion, should the tDBOutput do all the work (deciding whether inserting or updating) or should we have a tDBInsert and a tDBUpdate?

Sorry, but how can I do this? can you give me the coordinates of the bugtracker?
Thanks
One Star

Re: update on tables...

Hi!

Our Bugtracker can be found here : http://talendforge.org/bugs
(I think it is an item of our Community menu)

Thanks for your help!
Regards,
Fabrice
One Star

Re: update on tables...

What about the [Bug] 270 I opened on the bugtracker about this issue? When do you plan to add insert/update option on the tDBOutput component? I think that , after having added this feature and also after having added commit/rollback control on the tDBOutput component, your product will be mature enough to be employed in professional projects...
Employee

Re: update on tables...

[feature]270[/feature] is a priority before the end of the year and before 1.1.0. If you use Mysql as DBMS, another solution will be to use the new component tMysqlOutputBulk with the option REPLACE (the option is not implemented yet, but should be implemented for 1.1.0M4).

We did not forget your request :-) To make it easier for me to implement, I'll first dispatch tDB* in several component: one for Mysql, another for Oracle and so on. The component code will be much easier to write and to maintain.
One Star

Re: update on tables...

plegall wrote:
[feature]270[/feature] is a priority before the end of the year and before 1.1.0. If you use Mysql as DBMS, another solution will be to use the new component tMysqlOutputBulk with the option REPLACE (the option is not implemented yet, but should be implemented for 1.1.0M4).

We did not forget your request :-) To make it easier for me to implement, I'll first dispatch tDB* in several component: one for Mysql, another for Oracle and so on. The component code will be much easier to write and to maintain.

Good Idea, since every DBMS works diffferently.
Thanks
One Star

Re: update on tables...

rcafiero wrote:
plegall wrote:
[feature]270[/feature] is a priority before the end of the year and before 1.1.0. If you use Mysql as DBMS, another solution will be to use the new component tMysqlOutputBulk with the option REPLACE (the option is not implemented yet, but should be implemented for 1.1.0M4).

We did not forget your request :-) To make it easier for me to implement, I'll first dispatch tDB* in several component: one for Mysql, another for Oracle and so on. The component code will be much easier to write and to maintain.

Good Idea, since every DBMS works diffferently.
Thanks

What about the commit/rollback control in the tDBOutput? This also is quite an important feature...
One Star

Re: update on tables...

rcafiero wrote:
What about the commit/rollback control in the tDBOutput? This also is quite an important feature...

Regarding MySQL query logfiles:

061218 18:00:42       6 Connect     jboss_user@localhost on jboss
6 Query SET AUTOCOMMIT=0
6 Query set autocommit=1
7 Connect jboss_user@localhost on jboss_old
7 Query SET AUTOCOMMIT=0
7 Query set autocommit=1
7 Query select PK_IDPROFILE, NAME, SHORTNAME from t_
profile
7 Quit
6 Query INSERT
INTO t_profile...

... knowing
SET AUTOCOMMIT=0

config file parameter is instantly disabled by
set autocommit=1

sent by Talend, how can you manage a job that partially fails?

Rgds
One Star

Re: update on tables...

In my opinion, you should not only set the commit option in the tDBOutput component. You should open a new connection only if you have to connect to a different database schema of also a different DBMS, to avoid having too many opened connections for a single job, and if the connection has nocommit=1, you should not close the connection at the end of the tDBOutput component. Then, you should keep track of all the opened (nocommit=1, the connection with autocommit set can be closed immediately after query execution) connections and perform a global commit of all these connections as soon as you encounter a connection with autocommit set, and also at the end of the job, if everyting went allright. In case of errors, you will perform a rollback on all the opened connections. Of course, at the end of the job you must remember closing ALL opened connections. It is quite complicated, but not too much and mandatory in order to have a full control on the job activity.
One Star

Re: update on tables...

So what is the Talend's team recommendation about error management?

How can we build a job having several tDBOutput / update on tables managing that kind of scenario:
- the second (for example) tDBOutput fails (constraint error for example)
- the job has to release all datasets used as if there was no job execution
One Star tle
One Star

Re: update on tables...

I have some problems to use the action "INSERT OR UPDATE" with tDBOutput. I need some help to insert or update my table with tDBOutput using Postgresql.
With insert, it's OK , I have no problem ,but I want to update the record if it exists. I am not sure that I need tDBOuput and tDBSQLRow in TMap.
If somebody can tell me exactly how to do that.


Thanks.
Employee

Re: update on tables...

tle wrote:
[...] I need some help to insert or update my table with tDBOutput using Postgresql.
With insert, it's OK , I have no problem ,but I want to update the record if it exists. I am not sure that I need tDBOuput and tDBSQLRow in TMap.

With TOS 1.1.x, use tDBOutput with "insert or update" as "action on data". Be sure to have the right columns set as key in your schema.
One Star tle
One Star

Re: update on tables...

I have this message when executing

"can't execute update query"


My update query is below :

$query = '
UPDATE '.$desc_tDBOutput_1{dbtable}.'
SET programmeMod = ?, statut = ?, modification = ?, dateDebutValidite = ?, dateFinValidite = ?, utilisateurModification = ?, refHist = ?, idCapCCAVS = ?, idPeriodeFiscale = ?, montantCotisation = ?, revenuImposable = ?, revenuEstimeFRV = ?, allegementCAP = ?, dateReceptionMutation = ?, idAssociation = ?
WHERE idMembre = ?
';

and my insert query is below

$query = '
INSERT
INTO '.$desc_tDBOutput_1{dbtable}.'
('.join(',', @colnames).')
VALUES
('.join(',', map {'?'} (1 .. scalar @colnames)).')
';

I thing that my key for the shema is OK

How can I do in TMap for the correct query Update ?

Thanks again for your help.

tle

Re: update on tables...

Hi Tle,

I think you must check your key that you use to your update. On your Tmap, you must select the column that you use to your update, and it's not necessary the primary key.

For example, you a product table with three columns (Product number, that is a sequence; Product Code, and Product name).
Your primary key is defined on Product number and you have a unique index on Product Code.

On your Tmap, you must select the product code as key in order to do an update.

Best regards,
One Star tle
One Star

Re: update on tables...

It's what I thing, my key is all rigth.
I had my key for TDBOutput(Update) which is not my realy primary key (my realy primary key is the increment value) . My key for TDBOutput(Update) is the unique value that I want to compare for updating.
But I always have the same error message.

Re: update on tables...

Hi,

Can you paste a copy of your Tmap, in order to see what you do.
And can you tell us some rows which you have the problem.

Thanks in advance,

Best regards,
One Star tle
One Star

Re: update on tables...

It's my TDBInput1 & TDBInput2 configurations
One Star tle
One Star

Re: update on tables...

It's my TDBOUTput configuration
Employee

Re: update on tables...

tMap configuration screenshot is missing
One Star tle
One Star

Re: update on tables...

I can't post it , the size is 154ko but I dont't khnow why I cant't post it. Can I send you by e-mail


Thanks
One Star tle
One Star

Re: update on tables...

Hello,

I have try to make choice the compartment "Inner Join" at the top of the row1, but I have always the same message with action data "Update or Insert".
DBD:Smiley Tongueg::st execute failed: ERROR: column "idmembre" does not exist

can't execute update query

And with action data "Insert or Update" I don't have error message, but I have same rows inserted if it's executed more than once.

_My Join is : tDBInput_1.ucfirst(lc rtrim($row2[NOM])) = tDBInput_2.nom AND tDBInput_1.eval'$row2[AVS_INDIVIDUEL]' = tDBInput_2.numeroAVS
_"For each row I have in my tDBInput_1, I select all the rows of your tDBInput_2 to have the value of idPersonne for idMembre of my TDBOutput.
I have only one unique row of tDBInput2 for many row of tDBInput_1
_I want to "Insert or Update" only the rows that have a value for the field idPersonne

Thanks a lot.
Employee

Re: update on tables...

tle wrote:
I can't post it , the size is 154ko [...]

Compress with PNG instead of JPEG.
One Star tle
One Star

Re: update on tables...

It's my TMap configuration screenshot
Employee

Re: update on tables...

1. what kind of data do you have in $row2[AVS_INDIVIDUEL]? (why do you need to eval the expression?)

tle wrote:
For each row I have in my tDBInput_1, I select all the rows of your tDBInput_2 to have the value of idPersonne for idMembre of my TDBOutput.

"all the rows"? you should have only one row in tDBI2 for each row of tDBI1, if your dataset is correct.

tle wrote:
I want to "Insert or Update" only the rows that have a value for the field idPersonne

That should be the case if tDBI2 always provides an idPersonne. If not, you may add a filter condition on Output2: defined $row1[idPersonne]
One Star tle
One Star

Re: update on tables...

I) In $row2[AVS_INDIVIDUEL] I have numeric value and $row1[numeroAVS] is a string, and that what I need to eval for makink the join correctly. The join is correct, it can find my TDBInput2 to insert the correct value.

2)I have only one row in tDBInput2 for one or more row in tDBInput1(but when I mean "select all the rows of tDBInput2", I would like to say that it has to read all rows to find the unique one)

3)tDBInput2 always provides and idPersonne, and if not, the inner join has rejected it. Excuse me, but I don't understand why I need to add condition filter on OutPut2 ?

Thanks
Employee

Re: update on tables...

If tDBI2 always provide an idPersonne and if you use an inner join to fill Output2, then Output2 always have an idPersonne. Just plug a tDBOutput with "insert or update" action on data should be OK.

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Have you checked out Talend’s 2019 Summer release yet?

Find out about Talend's 2019 Summer release

Blog

Talend Summer 2019 – What’s New?

Talend continues to revolutionize how businesses leverage speed and manage scale

Watch Now

6 Ways to Start Utilizing Machine Learning with Amazon We Services and Talend

Look at6 ways to start utilizing Machine Learning with Amazon We Services and Talend

Blog