I'm totally new to Talend and have just started learning. I've created a few simple jobs that load data into Snowflake tables, just to get some hands-on experience. My requirement is to load a large dataset into a Snowflake table: the table has 80 million records and has to be refreshed (truncate and load) in Snowflake every day.
I have a couple of options:
1) One is to copy the data from the source table (in Oracle) to the target table (in Snowflake) using the tDBInput, tDBOutputBulk, and tDBBulkExec components. I tried this option and it seems to be very slow. How do I run multiple jobs in parallel so that it could be faster?
2) The second option: I have the entire dataset (80 million records) available in a CSV file. I believe the job will be much faster when using the CSV file rather than accessing the Oracle table. Is there a way to have multiple jobs read the CSV file in parallel?
I'd appreciate it if someone could provide input on this.
From the details you have given, my understanding is that the data fetch from Oracle is what's taking time. Why don't you fetch the data in parallel for different partitions and merge the flows later, before the Snowflake Bulk Output?
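To illustrate the partitioning idea outside of Talend, here is a minimal Python sketch. It splits an ID range into contiguous partitions and dispatches them concurrently; the table name `src_table`, the `id` column, and the range bounds are assumptions for illustration only — in a real job each range would drive a separate tDBInput query.

```python
from concurrent.futures import ThreadPoolExecutor

def partition_ranges(min_id, max_id, num_partitions):
    """Split [min_id, max_id] into up to num_partitions contiguous (lo, hi) ranges."""
    total = max_id - min_id + 1
    size = -(-total // num_partitions)  # ceiling division
    ranges = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + size - 1, max_id)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges

def fetch_partition(lo, hi):
    # In a real job this query would run against Oracle, e.g. via a
    # parameterized tDBInput; here we just build the query text.
    return f"SELECT * FROM src_table WHERE id BETWEEN {lo} AND {hi}"

# Assumed: ids run 1..80,000,000 and we want 8 parallel readers.
ranges = partition_ranges(1, 80_000_000, 8)
with ThreadPoolExecutor(max_workers=8) as pool:
    queries = list(pool.map(lambda r: fetch_partition(*r), ranges))
```

Inside Talend Studio the same effect can be had with multiple subjobs (or the parallelization options on the components), each reading one `WHERE id BETWEEN ... AND ...` slice.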
If you already have the 80 million records handy in a CSV file, you can consider using it as the source file for the Snowflake Bulk Output Exec component. You may still need to increase the Talend job's memory to the maximum possible, so that it reads larger data chunks in one go.
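One practical tweak on the CSV route: Snowflake's COPY loads multiple staged files in parallel, so splitting one huge CSV into many medium-sized files typically loads faster than a single file. Here is a stdlib-only sketch of such a splitter (the file names, column names, and chunk size are assumptions; the small demo file at the end just exercises the function):

```python
import csv
import os
import tempfile

def split_csv(src_path, out_dir, rows_per_chunk):
    """Split a large CSV into smaller header-preserving chunk files."""
    chunk_paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        chunk, idx = [], 0

        def flush():
            nonlocal chunk, idx
            if not chunk:
                return
            path = os.path.join(out_dir, f"chunk_{idx:04d}.csv")
            with open(path, "w", newline="") as out:
                writer = csv.writer(out)
                writer.writerow(header)   # each chunk keeps the header row
                writer.writerows(chunk)
            chunk_paths.append(path)
            chunk, idx = [], idx + 1

        for row in reader:
            chunk.append(row)
            if len(chunk) >= rows_per_chunk:
                flush()
        flush()  # write any remaining rows
    return chunk_paths

# Demo with a tiny synthetic file (in practice rows_per_chunk would be
# sized so each chunk lands in Snowflake's recommended file-size range).
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "data.csv")
with open(src, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["id", "name"])
    w.writerows([[i, f"row{i}"] for i in range(10)])
parts = split_csv(src, tmp, rows_per_chunk=4)
```

The chunk files can then be staged and loaded with a single COPY, letting Snowflake parallelize across them instead of relying on one reader thread.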