
tJDBCInput issue with large dataset



Hi All,

I am quite new to Talend and have just started using it. I have to copy data from several sources, one of which is a Postgres DB with SSL enabled. Initially the ready-made DB connection for Postgres didn't work with SSL, but I came across a suggestion to use the generic JDBC connection with a few changes to the JDBC URL, and that worked. I am able to connect to the Postgres DB and retrieve the schema. There is one table with 5 million records and 35 columns. The requirement is to drop the table on each refresh and recreate it.
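(For anyone hitting the same SSL issue: a generic Postgres JDBC URL with SSL typically looks like the sketch below. The host, port, and database are placeholders, and the right sslmode value depends on your server's certificate setup.)

```
jdbc:postgresql://<host>:5432/<database>?ssl=true&sslmode=require
```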


I am getting the error below:

[statistics] connecting to socket on port 3819
[statistics] connected
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.postgresql.core.v3.QueryExecutorImpl.processResults(
at org.postgresql.core.v3.QueryExecutorImpl.execute(
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(


My setup is as follows:

  1. Created a DB connection using the generic JDBC connection to connect to the Postgres DB
  2. Retrieved the schema
  3. Created a job: tJDBCInput -> tOracleOutput

For tJDBCInput, these are the settings:


Somewhere it was mentioned that MySQL and Postgres copy the entire input dataset locally before writing to the destination, and it was advised that "Enable stream" is the MySQL equivalent of the Postgres "Enable Cursor" option. Hence I enabled "Enable Cursor".
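To illustrate what "Enable Cursor" does under the hood: the Postgres JDBC driver only streams results through a server-side cursor when autocommit is off and a fetch size is set on the statement; otherwise it buffers the entire result set in heap, which matches the OutOfMemoryError above. A minimal hand-written sketch, assuming a live Postgres server, the driver on the classpath, and hypothetical connection details:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details -- replace with your own.
        String url = "jdbc:postgresql://dbhost:5432/mydb?ssl=true&sslmode=require";
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            // The Postgres driver only uses a server-side cursor when
            // autocommit is off AND a fetch size is set; otherwise it
            // pulls the whole result set into heap memory at once.
            conn.setAutoCommit(false);
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(10_000); // rows fetched per round trip
                try (ResultSet rs = st.executeQuery("SELECT * FROM big_table")) {
                    while (rs.next()) {
                        // process one row at a time instead of all 5M in memory
                    }
                }
            }
        }
    }
}
```

tJDBCInput's cursor option is meant to achieve the equivalent of this, so it may be worth checking the generated job code to confirm that autocommit is actually turned off for the connection; if it isn't, the fetch size is ignored and the driver buffers everything anyway.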




Run settings for the job:



This runs for a while without copying anything to the destination and then gives the Java heap space error. I am not sure what else I can do to copy the data and make this work. Please note that with all these settings, if I limit the rows to 3 million, everything works.


A few points to help you help me :-)

  1. Xms/Xmx 4048 MB is the maximum I can allocate; I don't have any more memory.
  2. Execution time is not a big concern as long as the job is robust and executes periodically.
  3. The entire table with all columns has to be copied; no columns or rows can be left out.
  4. It has to be done as one job, not in batches; the table's data is such that it is almost impossible to uniquely identify each row using one column.
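For completeness, the memory settings mentioned in point 1 live in two places (the exact file names and defaults vary by Studio version, so treat this as a sketch): the Studio's own .ini file typically ends with something like

```
-vmargs
-Xms512m
-Xmx4096m
```

while the job itself runs in a separate JVM whose arguments are set in the Run tab under Advanced settings, e.g. `-Xms1024M -Xmx4096M`. Raising only the Studio's heap does not help a job launched with its own, smaller JVM arguments.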

Any help or suggestion would be greatly appreciated. Thanks in advance.



Re: tJDBCInput issue with large dataset


Have you tried allocating more memory to your Studio to see if that works? Please look at this article: TalendHelpCenter: Allocating more memory to Talend Studio.


Best regards




Re: tJDBCInput issue with large dataset

Hi Sabrina,


Thanks for the help. I tried the suggestion mentioned in the post and set the .ini file values as recommended.




But I am still getting the error message:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space


All the settings are the same as in my original post. Somewhere I read that the data is first stored in local memory and then copied to the destination; is there a way to avoid storing the data and just stream it from source to destination?


