Bottom-line: For the tMSSqlOutput component, the default "Commit every" and "Batch Size" of 10k provide good performance.
I had reason to investigate performance given some issues I was experiencing. I created a simple job that read from one table and inserted the data into another table in the same database. The table contains 463,122 rows with a single, varchar(50) primary clustered key, 50 columns, and an average record size of 1k bytes. The database was idle with no operations other than my testing. The job uses the "Open source JTDS" provider, with each component having its own connection (built-in) and no "Field Options".
I tried several combinations of sizes for "Commit" and "Batch" to see how they impact performance. This was not a scientific study where I ran hundreds of tests and reconciled the results. This was a seat-of-the-pants look at general behavior. The tests did not account for network or other potential through-put issues.
Based upon the results, I concluded that "Batch" is the number of rows to be "batched up" before sending them to the database and "Commit" is the number of rows to be sent to the database before sending a commit. (For those coming from Datastage, these appear comparable to "Array size" and "Transaction size".)
For this testing configuration, I found that "Batch" size should be definitely enabled (horrendous performance without it) and should be kept the same size as "Commit". Setting "Batch" smaller than "Commit" significantly reduced performance compared to when the same as "Commit". Setting "Batch" larger than "Commit" also reduced performance though not as much.
Thanks for sharing your best practices on forum.
use commit after the insertion process complete. this strategy will save huge amount of time.
Talend named a Leader.
Kickstart your first data integration and ETL projects.
Watch the recorded webinar!
Learn how to do cool things with Context Variables
Find out how to migrate from one database to another using the Dynamic schema
Pick up some tips and tricks with Context Variables