I have a few large tables in snowflake to which I would like to only do incremental load based on max date.
My goal is to get the max date from tables in snowflake, bring that back on premise or may be store them in global variable and iterate over it (which I am not sure if it is possible). Because there is no list transformation for snowflake like tMSSQLTableList.
For example if in table A - max date for StartDate is 2018-10-11 then I would like to only get the records from Table A where StartDate>2018-10-11.
What would be the most efficient way to accomplish this? I am guessing I can store the table name and the max date in a global variable - but can iterate over it? or bring that data back on -premise in a table form that looks like this
Just trying to figure if there is a way where I can utilize the max dates in one flow - instead of getting max date for each table separately in separate flows.
You can check tDBRow (Snowflake)
and information from information schema - https://docs.snowflake.net/manuals/sql-reference/info-schema/tables.html and https://docs.snowflake.net/manuals/sql-reference/info-schema/columns.html
then make a loop over tables with dates columns
Talend named a Leader.
Kickstart your first data integration and ETL projects.
Learn how to do cool things with Context Variables
Find out how to migrate from one database to another using the Dynamic schema
Pick up some tips and tricks with Context Variables