Tips for optimizing load of 4GB file in MS SQL Server DB

One Star

Tips for optimizing load of 4GB file in MS SQL Server DB

Hi,
as said in the subject I have a 4 GB file (50 columns, 6 M lines) to load into an MS SQL table
due to security reason (server is reached through a vpn) I cannot use bulk or bcp
So I have decided to use a

Can you give me tips to optimize loading process ?
increase batch size ? Can you give me typical values ?
Enable Parallel execution ?

Regards

TIS PE 4.1.3 on Linux, 48 GB Ram, 16 cpu
Employee

Re: Tips for optimizing load of 4GB file in MS SQL Server DB

If it is possible you should use a bulk load to do this. If not, you can execute batches from a normal output in parallel and just increase the number of threads to 16 or some such number. Batch size is based on you network and DB. I know for Oracle we found the preferred batch size was 10,000 on normal tables.
One Star

Re: Tips for optimizing load of 4GB file in MS SQL Server DB

jandry,
thanks for your answer,
I've tryied with 8 threads and batch size of 10,000 and got the following error :
### Job STARTED at 2011/09/23 09:09:18 (jobId=20110923_090916_HdPqx, jobExecutionId=20110923090918_CuZha) ###
Exception in thread "Thread-3"
Exception in thread "Thread-2"
java.lang.OutOfMemoryError: Java heap space
at net.sourceforge.jtds.jdbc.ParamInfo.clone(ParamInfo.java:315)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.addBatch(JtdsPreparedStatement.java:532)
at crm.chargement_abo_0_1.chargement_ABO$1tAsyncIn_tMSSqlOutput_1_ParallelThread.run(chargement_ABO.java:5733)
Exception in thread "Thread-4" Exception in thread "Thread-5" java.lang.OutOfMemoryError: Java heap space
at net.sourceforge.jtds.jdbc.ParamInfo.clone(ParamInfo.java:315)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.addBatch(JtdsPreparedStatement.java:532)
at crm.chargement_abo_0_1.chargement_ABO$1tAsyncIn_tMSSqlOutput_1_ParallelThread.run(chargement_ABO.java:5733)
Exception in thread "Thread-1" java.lang.OutOfMemoryError: Java heap space
at net.sourceforge.jtds.jdbc.ParamInfo.clone(ParamInfo.java:315)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.addBatch(JtdsPreparedStatement.java:532)
at crm.chargement_abo_0_1.chargement_ABO$1tAsyncIn_tMSSqlOutput_1_ParallelThread.run(chargement_ABO.java:5733)
Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: Java heap space
at crm.chargement_abo_0_1.chargement_ABO.tChronometerStart_1Process(chargement_ABO.java:540)
at crm.chargement_abo_0_1.chargement_ABO.runJobInTOS(chargement_ABO.java:8378)
at crm.chargement_abo_0_1.chargement_ABO.main(chargement_ABO.java:8257)
Caused by: java.lang.Error: java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: Java heap space
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlConnection_2Process(chargement_ABO.java:672)
at crm.chargement_abo_0_1.chargement_ABO.tChronometerStart_1Process(chargement_ABO.java:519)
... 2 more
Caused by: java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: Java heap space
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlRow_1Process(chargement_ABO.java:798)
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlConnection_2Process(chargement_ABO.java:651)
... 3 more
Caused by: java.lang.Error: java.lang.OutOfMemoryError: Java heap space
at crm.chargement_abo_0_1.chargement_ABO.tFileList_1Process(chargement_ABO.java:4211)
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlRow_1Process(chargement_ABO.java:777)
... 4 more
Caused by: java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
### Job ENDED BY USER at 2011/09/23 09:15:01 (jobId=20110923_090916_HdPqx, jobExecutionId=20110923090918_CuZha) ###

Then with 4 threads :
Exception in thread "Thread-6" Exception in thread "Thread-1" Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
at net.sourceforge.jtds.jdbc.ParamInfo.clone(ParamInfo.java:315)
at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.addBatch(JtdsPreparedStatement.java:532)
at crm.chargement_abo_0_1.chargement_ABO$1tAsyncIn_tMSSqlOutput_1_ParallelThread.run(chargement_ABO.java:5733)
Exception in thread "Thread-4" java.lang.OutOfMemoryError: GC overhead limit exceeded
Exception in thread "main" java.lang.Error: java.lang.Error: java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at crm.chargement_abo_0_1.chargement_ABO.tChronometerStart_1Process(chargement_ABO.java:540)
at crm.chargement_abo_0_1.chargement_ABO.runJobInTOS(chargement_ABO.java:8378)
at crm.chargement_abo_0_1.chargement_ABO.main(chargement_ABO.java:8257)
Caused by: java.lang.Error: java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlConnection_2Process(chargement_ABO.java:672)
at crm.chargement_abo_0_1.chargement_ABO.tChronometerStart_1Process(chargement_ABO.java:519)
... 2 more
Caused by: java.lang.Error: java.lang.Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlRow_1Process(chargement_ABO.java:798)
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlConnection_2Process(chargement_ABO.java:651)
... 3 more
Caused by: java.lang.Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
at crm.chargement_abo_0_1.chargement_ABO.tFileList_1Process(chargement_ABO.java:4211)
at crm.chargement_abo_0_1.chargement_ABO.tMSSqlRow_1Process(chargement_ABO.java:777)
... 4 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

It finally worked with 3 threads.
According to the configuration of the server (48 GB RAM, 16 cpu) I was hopping more threads. Do I need to increase so memory settings ?
Regards