From Thursday, July, 9, 3:00 PM Pacific,
our Community site will be in
read-only mode
through Sunday, July 12th.
Thank you for your patience.

tOracleInput component does not return rows when using the to_date () function in a Spark Job

Problem Description

A Talend Standard Job uses a tOracleInput component to execute the following SQL query:

"select * from actions where d_majenreg <= TO_DATE('2018-03-21 15:54:26' ,'yyyy-mm-dd HH24:MI:SS')
and d_majenreg > TO_DATE('2018-03-21 15:43:57' ,'yyyy-mm-dd HH24:MI:SS')",
rows are returned.
When executing the same SQL query in a Big Data Spark Job using a tOracleInput component, or a sqlplus tool, no rows are returned.

 

Root Cause

The query in the tOracleInput component within a Big Data Spark Job is not an Oracle SQL query, but a Spark SQL query that parses by Spark SQL framework. Moreover, the used the to_date () function in an Oracle SQL query has no equivalent in Spark SQL. Therefore, using the to_date() function gives unexpected results in a Talend Spark Job that uses a tOracleInput component.

 

Solution

Use Spark SQL functions, for example, from_unixtime and unix_timestamp, as shown below:

" select * from action as where d_majenreg <= from_unixtime(unix_timestamp('2018-03-21 15:54:26' ,'yyyy-MM-dd HH:mm:ss'))
and d_majenreg >  from_unixtime(unix_timestamp('2018-03-21 15:43:57','yyyy-MM-dd HH:mm:ss'))".
Version history
Revision #:
7 of 7
Last update:
‎02-28-2019 04:26 AM
Updated by:
 
Contributors