We are in the process of researching how best to use Talend for some of our ETL jobs.
So, our initial design is to load data from S3 into a "landing" schema in Amazon Redshift using the COPY command; we plan to use tRedshiftRow for this.
Since we will use Redshift as both our source and target database, is it possible to push much of the "transformation" work down to Redshift? I know we can execute custom SQL using the tRedshiftRow component. However, if we use, say, the tMap component, is it possible to execute the output of this component in Redshift instead of on the Talend job server? The only reason we want to do this is to make use of the horsepower that's available to us in Amazon Redshift.
I would appreciate feedback.
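For reference, the landing-load step described above would typically be a statement like the following, executed through tRedshiftRow (the schema, table, bucket, and IAM role names here are all hypothetical placeholders):

```sql
-- Bulk-load raw files from S3 into the landing schema.
-- Table name, S3 path, and IAM role ARN are placeholders.
COPY landing.orders
FROM 's3://my-etl-bucket/orders/'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-load-role'
FORMAT AS CSV
IGNOREHEADER 1
TIMEFORMAT 'auto';
```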
Not the way you describe it, no.
If you use tMap, the transformations run locally on the Talend job server.
Talend downloads (streams) the data from Redshift, transforms it, and then uploads the result back (either directly, or to S3 for bulk loading).
You could check where Talend Cloud is actually hosted (it would not be surprising if it runs on AWS as well), look at other cloud ETL solutions, or install the Talend JobServer on AWS if most of your flow is Redshift-to-Redshift; that at least keeps the data movement inside AWS.
And of course you can perform some transformations on the target Redshift instance itself: load first, then transform in SQL (ELT rather than ETL), depending on what you really need from the process.
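That load-then-transform (ELT) pattern can be sketched as a SQL statement run against Redshift via tRedshiftRow after the COPY finishes; all schema, table, and column names below are hypothetical:

```sql
-- Transform inside Redshift: read the raw landing data,
-- aggregate it, and write the result to a reporting schema.
BEGIN;

-- Make the step re-runnable by clearing the target date range first.
DELETE FROM reporting.daily_order_totals
WHERE order_date >= '2017-01-01';

INSERT INTO reporting.daily_order_totals (order_date, customer_id, total_amount)
SELECT
    TRUNC(o.order_ts) AS order_date,   -- truncate timestamp to date
    o.customer_id,
    SUM(o.amount)     AS total_amount
FROM landing.orders AS o
WHERE o.order_ts >= '2017-01-01'
GROUP BY TRUNC(o.order_ts), o.customer_id;

COMMIT;
```

Because both the source and target tables live in Redshift, the whole INSERT ... SELECT executes on the cluster and no data passes through the Talend job server.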