ETL vs ELT with Amazon Redshift

We are in the process of researching how best to use Talend for some of our ETL jobs.

 

Our initial design is to load data from S3 into a "landing" schema in Amazon Redshift using the COPY command; we will use tRedshiftRow for this.
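For illustration, here is a rough sketch of the kind of COPY statement that could be passed to tRedshiftRow. It is written as a small Python helper only so the generated SQL is easy to inspect. The table name, bucket, and IAM role ARN are hypothetical placeholders, and the CSV options are an assumption about the file format, not something we have settled on.

```python
def build_copy_sql(table, s3_uri, iam_role_arn):
    """Return a Redshift COPY statement that loads CSV files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# Hypothetical names, for illustration only.
copy_sql = build_copy_sql(
    table="landing.orders",
    s3_uri="s3://example-bucket/orders/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(copy_sql)
```

The printed statement is what would go into the query field of the component; Redshift itself reads the files from S3, so the rows never pass through the Talend job server.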

 

Since we will use Redshift as both our source and target database, is it possible to push most of the "transformation" work down to Redshift? I know we can execute custom SQL using the tRedshiftRow component. However, if we use, say, the tMap component, is it possible to have its logic executed in Redshift instead of on the Talend job server? The only reason we want to do this is to make use of the horsepower available to us in Amazon Redshift.

 

I would appreciate feedback.

  • Big Data
  • Data Integration

Re: ETL vs ELT with Amazon Redshift

As you describe it: no, you cannot.

 

If you use tMap, the transformations run locally in Talend.

You download (stream) the data from Redshift to Talend, transform it, and then upload it again (directly, or via S3 for bulk loading).

 

You can check where Talend Cloud is actually hosted (I would not be surprised if it is on AWS as well), look at other cloud ETL solutions, or install Talend Server (or just the jobs) on AWS if most of your flow is Redshift to Redshift.

 

And of course you can do some of the transformations on the target Redshift instance: load first, then transform. Which approach fits depends on what you really need from an ETL or an ELT process.
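To make the load-then-transform idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Redshift so that it runs anywhere; the table and column names are hypothetical, and against Redshift the same kind of INSERT ... SELECT would be sent through tRedshiftRow instead.

```python
import sqlite3

# SQLite stands in for Redshift here; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE landing_orders (order_id TEXT, amount TEXT, status TEXT);
    CREATE TABLE orders_clean (order_id INTEGER, amount_cents INTEGER);
""")

# "Load": raw values land untouched, much like a COPY into a landing table.
conn.executemany(
    "INSERT INTO landing_orders VALUES (?, ?, ?)",
    [("1", "10.50", "paid"), ("2", "3.00", "cancelled"), ("3", "7.25", "paid")],
)

# "Transform": one set-based statement executed entirely by the database,
# so no rows are streamed out to the client and back.
conn.execute("""
    INSERT INTO orders_clean (order_id, amount_cents)
    SELECT CAST(order_id AS INTEGER),
           CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER)
    FROM landing_orders
    WHERE status = 'paid'
""")

rows = conn.execute(
    "SELECT order_id, amount_cents FROM orders_clean ORDER BY order_id"
).fetchall()
print(rows)  # [(1, 1050), (3, 725)]
```

The point is only that the filtering and type conversion happen inside the database engine; with tMap the same logic would run on the Talend side after the rows had been pulled out.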
