I have designed a job to load multiple files from AWS S3 into a Snowflake table using the Bulk Load components.
My Flow is:
tDBOutputBulk uses an "Internal" stage for storage.
tDBRow runs a "Commit" command.
There are 2 files of 450 MB each on S3 (around 1 GB of data in total, i.e. 20 million records with 6 columns).
Loading this 1 GB of data takes 25 minutes, and I want to improve the job's performance. Can anyone help?
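For context on where the time goes: with an internal stage, the Snowflake bulk components ultimately issue a PUT (upload to the stage) followed by a COPY INTO. Snowflake parallelises COPY across files, so two 450 MB files give it little to parallelise; splitting the extracts into many ~100-250 MB files usually helps. Below is a minimal, hypothetical sketch of those two statements (the table, stage, and path names are placeholders, not taken from the job):

```python
# Hypothetical sketch of the PUT + COPY INTO statements that an
# internal-stage bulk load issues; all names below are placeholders.

def build_bulk_load_sql(table, stage,
                        file_format="(TYPE = CSV FIELD_DELIMITER = ',')"):
    """Return the PUT and COPY INTO statements for an internal-stage load."""
    put_stmt = (
        # PARALLEL controls how many threads upload file chunks in parallel.
        f"PUT file:///tmp/extract/*.csv @{stage} "
        "PARALLEL=8 AUTO_COMPRESS=TRUE"
    )
    copy_stmt = (
        # COPY INTO loads all staged files; Snowflake parallelises across
        # files, so many ~100-250 MB files load faster than two 450 MB ones.
        f"COPY INTO {table} FROM @{stage} "
        f"FILE_FORMAT = {file_format} "
        # PURGE removes successfully loaded files from the stage.
        "PURGE = TRUE"
    )
    return put_stmt, copy_stmt

put_sql, copy_sql = build_bulk_load_sql("MY_TABLE", "my_internal_stage")
print(put_sql)
print(copy_sql)
```

Warehouse size is the other common lever: COPY throughput scales with the number of files and the warehouse's parallel load slots.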
Also, how do I handle restartability here in case of a failure?
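One property worth knowing for restartability: Snowflake keeps per-file load metadata (for 64 days) and, with the default FORCE = FALSE, COPY INTO skips files it has already loaded into the target table. So simply re-running the same COPY after a failure resumes with the unloaded files. A minimal sketch of a restart-safe rerun loop, assuming a DB-API-style cursor (e.g. from snowflake-connector-python); the table and stage names are placeholders:

```python
# Sketch of a restart-safe load, assuming a DB-API-style cursor.
# COPY INTO records load history per file and skips already-loaded
# files when FORCE = FALSE, so re-running the same statement after a
# failure only loads the remaining files.

COPY_SQL = (
    "COPY INTO MY_TABLE FROM @my_internal_stage "
    "FILE_FORMAT = (TYPE = CSV) "
    "ON_ERROR = 'ABORT_STATEMENT' "  # fail fast so a rerun can resume
    "FORCE = FALSE"                  # default: skip already-loaded files
)

def run_with_retry(cursor, attempts=3):
    """Re-issue the same COPY on failure; loaded files are skipped."""
    for attempt in range(1, attempts + 1):
        try:
            cursor.execute(COPY_SQL)
            return cursor.fetchall()  # one row per file with load status
        except Exception as exc:
            if attempt == attempts:
                raise
            print(f"attempt {attempt} failed: {exc}; retrying")
```

In a Talend job the equivalent would be re-triggering the COPY step (e.g. the tDBRow) on error rather than re-staging all the files.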
I want to load the data into Snowflake using the Talend Bulk components.
Any performance tips on my existing job design, or suggested modifications? Please let me know.