Performance impact of tMSSqlOutput Commit Every and Batch Size

Seven Stars

Performance impact of tMSSqlOutput Commit Every and Batch Size

Bottom-line:  For the tMSSqlOutput component, the default "Commit every" and "Batch Size" of 10k provide good performance.

 

I had reason to investigate performance given some issues I was experiencing.  I created a simple job that read from one table and inserted the data into another table in the same database.  The table contains 463,122 rows with a single, varchar(50) primary clustered key, 50 columns, and an average record size of 1k bytes.   The database was idle with no operations other than my testing.  The job uses the "Open source JTDS" provider, with each component having its own connection (built-in) and no "Field Options".

 

I tried several combinations of sizes for "Commit" and "Batch" to see how they impact performance.  This was not a scientific study where I ran hundreds of tests and reconciled the results.  This was a seat-of-the-pants look at general behavior.  The tests did not account for network or other potential through-put issues.

 

Based upon the results, I concluded that "Batch" is the number of rows to be "batched up" before sending them to the database and "Commit" is the number of rows to be sent to the database before sending a commit.  (For those coming from Datastage, these appear comparable to "Array size" and "Transaction size".)

 

For this testing configuration, I found that "Batch" size should be definitely enabled (horrendous performance without it) and should be kept the same size as "Commit".  Setting "Batch" smaller than "Commit" significantly reduced performance compared to when the same as "Commit".  Setting "Batch" larger than "Commit" also reduced performance though not as much.

 

Screen Shot 2018-03-22 at 11.00.19 AM.png

Moderator

Re: Performance impact of tMSSqlOutput Commit Every and Batch Size

Hello,

Thanks for sharing your best practices on forum.

Best regards

Sabrina

 

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Seven Stars

Re: Performance impact of tMSSqlOutput Commit Every and Batch Size

use commit after the insertion process complete. this strategy will save huge amount of time.

Regards
Aashish
--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.