Performance issues: ETL with Index rebuild vs Drop/Create

Highlighted
One Star

Performance issues: ETL with Index rebuild vs Drop/Create

Hello,
I have seen performance difference of about 10 times slower loads when I do a rebuild on indexes (disable before load, load, enable indexes after load) compared to dropping indexes -> load -> create indexes. This is weird as we would like to rebuild table indexes calling a package that has index metadata. This way index metadata is safely retained always.
ETL logic is pretty simple and should not effect the load and I have confirmed that by creating a duplicate copy of the same job with ETL loading data into a buffer out and then triggers bufferin -> target_table_load. Here again, ETl finishes loading into bufferout in 50secs, but reading from bufferin and loading to target goes extremely slow (10mins).
attached is a screenshot of entire load with 3 scenarios for comparision
1) drop & create table (no indexes)
2) rebuild index (disable,enable)
3) Create/drop index with truncate data
Community Manager

Re: Performance issues: ETL with Index rebuild vs Drop/Create

This is not a question about Talend, more of a question about whatever database it is you use. 
One Star

Re: Performance issues: ETL with Index rebuild vs Drop/Create

Using cursor (read 1000 rows) per set helped, as well as dropping index and creating after load reduced time greatly. I am guessing splitting data flow into parallel flows would help even more (tpartition) which I don't have access to even though we have enterprise licensed version of Taled for DI.

What’s New for Talend Spring ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog

How Media Organizations Achieved Success with Data Integration

Learn how media organizations have achieved success with Data Integration

Read

Definitive Guide to Data Integration

Practical steps to developing your data integration strategy.

Download