One Star

Dynamically selecting schema in DQ Column Analysis

We have several database schemas that contain tables with the exact same columns, data types, etc. in each schema.  Let's say I create a Column Analysis for the CUSTOMER table in SchemaA.  I want to create a Talend job that runs the same column analyses on SchemaB, SchemaC, and SchemaD.  How do I do this with context variables?
When I run the analysis, the SQL looks something like this:
select count(*) from SchemaA.Customer where ColumnA is null
It looks like it's hardcoding the schema name in the generated SQL.  How do we work around that to make it dynamic?
3 REPLIES
Seventeen Stars

Re: Dynamically selecting schema in DQ Column Analysis

You could put the schema into a context variable and call the same job, handing the schema over as a context parameter.
Let's say you define a context variable in your job called database_schema.
Your query would then look like this:
"select count(*) from " + context.database_schema + ".Customer where ColumnA is null"
At the beginning of the job, or in the context parameters of a tRunJob, you set the context variable to the correct schema.

This way you could also make the column names dynamic.
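
If it helps, here is a minimal standalone Java sketch of the same idea outside of Talend (the JDBC URL, credentials, and the NullCountCheck class name are just placeholders, and the matching JDBC driver has to be on the classpath): the schema name is passed in as a parameter and concatenated into the query, just like context.database_schema in the query field above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NullCountCheck {

    // Same expression as in the Talend query field: the schema name is
    // injected into the SQL string instead of being hardcoded.
    static long countNulls(Connection conn, String schema) throws Exception {
        String query = "select count(*) from " + schema + ".Customer where ColumnA is null";
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(query)) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder JDBC URL and credentials; replace with your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/", "user", "password")) {
            // Run the same check against every schema, the way a tRunJob
            // would call the child job once per context value.
            for (String schema : new String[] {"SchemaA", "SchemaB", "SchemaC", "SchemaD"}) {
                System.out.println(schema + ": " + countNulls(conn, schema) + " nulls in ColumnA");
            }
        }
    }
}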
One Star

Re: Dynamically selecting schema in DQ Column Analysis

Makes sense, but will this approach work when we create a job that uses the tDQReportRun component to call a DQ Column Analysis? When we create a Column Analysis, we have to select a database in order to choose the columns to analyze. If we select SchemaA as our database, will the same job work against the other schemas besides SchemaA if we use a context variable for the schema?
Seventeen Stars

Re: Dynamically selecting schema in DQ Column Analysis

I do not know; I have not used the tDQReportRun component so far. But the typical way to configure a database connection is via context variables, and I would be surprised if the data quality components used a different method to configure the database.
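
For illustration, a connection that is set up this way would use expressions like the following in the connection component's fields (the variable names here are just an example, not what the DQ components actually generate):

Host:     context.db_host
Port:     context.db_port
Database: context.db_database
Schema:   context.db_schema
User:     context.db_user
Password: context.db_password

If the analysis connection is configured like this, switching the schema should only require overriding context.db_schema when the job is started.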