One Star

tMySQLOutput - dynamic database connectivity - best practice

Hello,
The export job that i am designing supposed to do the following:
- connect to Oracle db
- get list of reports to execute in a following form:
  -report name
  -target table name
  -dates range (from and to)
- create shared connection to MySQL (tmysqlconnection object)
- pass values from above to a subjob.
- the subjob will reuse oracle (source) and mysql (target) connections from a parent job
The simple example worked ok, but then we have started to add real life limitations:
1. Outputs may be split to a different MySQL schemas (e.g. one source table have records for different customers, and in a target MySQL we have separate database for each customer, meaning that inserts need to be done to different tables in different schemas). 
How can we reuse existing connection, and insert each time to different schema (for a simplicity schema name = customer name, so the data should be loaded to CUST1.PURCHASES or CUST2.PURCHASES
2. We do not want to have different job for each report type, and seeking for generic solution (a job that will take reportname as input and will take a query from repository(?), use appropriate schema and load into set of tables in a target db.
What is the best practice to store metadata for each export / report if we do not want to store it in the source database?
We work with Talend Platform for DM, so any advanced options can be used.
Thank you in advance for your advice.
GB.
7 REPLIES
Seventeen Stars

Re: tMySQLOutput - dynamic database connectivity - best practice

So far I would say every of your requests are possible.
Please do not expect I solve it for you but here some hints:
Reading the query from a database is simple. You have to read the select statement text into a context variable and set as query simply this variable in the tOracleInout component.
Using one connection for different databases in Oracle should work if you add the schema name in front of the table name.
Also using different MySQL database is not a problem because you can build a job which processes one report (not a particular) report and before doing it you can setup the target database with a context variable in the tMysqlConnection.

Now you probably get different columns for different queries. This can be achieved with the Dynamic datatype (only available in the Enterprise release). You can read everything from a query (it expects every result set column has a name matching to a column of your target table) and lead it to the target table. You can add more clear defined columns to the output schema as well (perhaps some metadata columns).
 
One Star

Re: tMySQLOutput - dynamic database connectivity - best practice

Thank you for a fast and clear reply Jan,
The problem is that i do not want to create excessive connections, ideally i would like to have one and reuse it (e.g. if i connect not using talend, i can run statements USE CUST1; INSERT into PRODUCTS and in next subjob iteration USE CUST10; INSERT INTO PRODUCTS. Unfortunately tMySQLOutput concatenates the database's schema to the statement.
 
Seventeen Stars

Re: tMySQLOutput - dynamic database connectivity - best practice

Ok we have to solve 2 issues:
1. reuse database connection: To do this add to the very fist job a tMysqlConnection and setup  as shared connection.
In every next job do the same (if a connection with the name already exists, it will be used instead of creating a new one).
2. The tMysqlOutput uses the database name from the connection and this is at the moment always the same (wrong).
The database name used by the other components is simply an entry in the globalMap. We can change it as we want. Right after every tMysqlConnection put a tJava connected with OnSubjobOk and here we change the database name as we need for this current job run:
Simply add this code (I assume the connection has the name tMysqlConnection_1):
globalMap.put("db_tMysqlConnection_1", context.my_current_database);
We spoke about set the database name in a context variable, here you use this name.
One Star

Re: tMySQLOutput - dynamic database connectivity - best practice

Thank you Jan,
I am afraid that this approach is not solves the problem of having only one generic connection - changing an attribute after the connection is created will not help. 
In any case, i've got some weird NullPointerException trying to implement it.
Seventeen Stars

Re: tMySQLOutput - dynamic database connectivity - best practice

Show me your job design (screenshot) and post the Exception stack trace. I am not convinced my suggestion will not work. Yes it is perhaps a bit difficult but should work.
One Star

Re: tMySQLOutput - dynamic database connectivity - best practice

You can also do the following to change the database schema at runtime.
((java.sql.Connection)globalMap.get("conn_tMysqlConnection_1")).setCatalog("database_2");
Seventeen Stars

Re: tMySQLOutput - dynamic database connectivity - best practice

Changing the catalog is not necessary if you use full qualified identifiers.