Six Stars

ELT Components with mutiple connections

I have a question regarding ELT and Talend.  Having worked with many ETL tools (Informatica, SAP Data Services, SSRS, etc), I have always tried to design jobs using as much ELT as possible to lighten the load on the job servers and push down as much logic to the database as possible.  I am currently working on a project where the client has Talend (subscription not open studio) with a MS SQL Server ODS.  I am having a bit of an issue with the ELT components and the different connections.  I have three different connections to SQL Server for the different databases and different schemas (all on the same server and instance, but the architect is very schema-happy).  All the db metadata is setup using context variables for the different environments.  Is it possible to use the ELT components with the different connections?  The current setup I have is a very common scenario.  I have a stage table in the staging database/schema which I filter based on incremental logic.  I also have my target table which I left outer to the staging to get my inserts/updates.  Unfortunately I only have the option to select one connection in the ELTMap component.  As a work-around to that limitation, I setup the flow to load a data transfer table (Staging table with filtered, incremental results creates a table in the same database/schema as the target, which I drop in the postjob flow), but I am getting errors because the ELT components don't fully qualify the tables with the schema specified in the connection.  I realize I can do this using my own query using standard DBInput/DBOutput components and use the globalMap to get the schema/database of the two connection to make it dynamic (or heck, even use an SP if SQL code is the only route), but this seems counter-intuitive for an ETL tool.  Am I missing something obvious?
Thanks,
Scott
FYI, we are using 6.3.1, which we have already found a bug in the tMSSqlOutput component, so it may be a bug in the tELTMSSqlMap component that is doesn't qualify the tables.  We ended up having to create a custom component for tMSSqlOutput using the 6.2.1 version of the same component.
3 REPLIES
Moderator

Re: ELT Components with mutiple connections

Hi,
 t<DB>ELT components are some what related to Extract-Load-Transform (ELT) method while t<DB>xxx is related to Extract-Transform-Load method (ETL).
The difference is that the operation of t<DB>ELT component is available inside a database.
Best regards
Sabrina
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Six Stars

Re: ELT Components with mutiple connections

Thanks Sabrina, these are the components I am referring to.
Okay, I managed to make it work with multiple connections, but the update SQL that is generated is hogwash.  It is generating the SQL like the following:
UPDATE TargetTableName
SET ColumnName = (
  SELECT StagingColumnName
   FROM Stage.StagingTable StagingAlias
      LEFT OUTER JOIN Target.TargetTableName TargetAlias
         ON TargetAlias.SourceId = StagingAlias.SourceId
   WHERE TargetAlias.PrimaryKey IS NOT NULL
      AND StagingAlias.SourceTimeStamp > ),
   NextColumnName = (
  SELECT NextStagingColumnName
   FROM Stage.StagingTable StagingAlias
      LEFT OUTER JOIN Target.TargetTableName TargetAlias
         ON TargetAlias.SourceId = StagingAlias.SourceId
   WHERE TargetAlias.PrimaryKey IS NOT NULL
      AND StagingAlias.SourceTimeStamp > ),

...etc.
What I would expect to see would be:
UPDATE Target.TargetTableName 
  SET ColumnName = StagingAlias.StagingColumnName,
      NextColumnName = StagingAlias.NextStagingColumnName,
  ...etc
FROM Staging.StagingTableName StagingAlias
 LEFT OUTER JOIN Target.TargetTableName TargetAlias
   ON TargetAlias.SourceId = StagingAlias.SourceId
WHERE TargetAlias.PrimaryKey IS NOT NULL
 AND StagingAlias.SourceTimestamp >

The current generated SQL (this is specifically an MS SQL component constantly generates failures because is try's to update a NULL value to the target table when there is a null constraint on that column.  The current WHERE clause brings back 0 records, so nothing should be inserted or updated.  I am at the point where I feel like I need to modify yet another 6.3.1 component.
Any suggestions?
Thanks,
Scott
Six Stars

Re: ELT Components with mutiple connections

I ended up modifying the existing component and creating a custom component.  The only component I modified was the ELTMSSqlOutput component, but in the future, in order to make this correction optional, I may need to touch the other two components.
If anyone is interested here are the two corrections I made to tELTMSSqlOutput_main.javajet:
// Line 285 - 304:  Two lines weere modified: line 288, the updatesSetStmt and line 303, the updateQuery variable creation
}else{
insertColName.append(colStmt.getName()+suffix);
insertValueStmt.append(colStmt.getValue()+suffix);
updateSetStmt.append(colStmt.getName()+"=\"+routines.system.StringUtils.splitSQLColumns(selectQueryColumnsName)+\"" +suffix);
}
}
counterOuter ++;
}

if(("INSERT").equals(dataAction)){
%>
String insertQuery = "INSERT INTO "+tableName_<%=cid%>+"(<%=insertColName.toString()%>)  ("+select_query+")";
pstmt_<%=cid %> = conn_<%=cid%>.prepareStatement(insertQuery);
<%
}else if (("UPDATE").equals(dataAction)){
%>
String updateQuery = "UPDATE "+tableName_<%=cid%>+" SET <%=updateSetStmt.toString()%> " + select_query.substring(select_query.indexOf("FROM"), select_query.length()).trim();
pstmt_<%=cid %> = conn_<%=cid%>.prepareStatement(updateQuery);

The other caveat was that in the properties for this component, "Default Schema Name" should be an empty String and "Use different table name" should be check and the name of the alias table used in the map should be used here.
Thanks,
Scott