Error: java.lang.OutOfMemoryError: Java heap space

Six Stars


Hi everybody.

 

Again, I REALLY need your help.

 

I'm having the error "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space".

ErrorTalend.JPG

My OS is Windows 7 Professional Service Pack 1 (64-bit) with 8 GB of RAM. I'm using Talend Open Studio for Data Integration, version 7.0.1.

[Screenshots: Windows.JPG, Talend2.JPG, Talend1.JPG, Java.JPG]

Right now, I'm trying to read an Excel file with about 1 million rows and 34 columns, all full of data, using tFileInputExcel and a tLogRow, but my job only reads the first row (the header) and then I get the error.

 Test.JPG

If I can read all the data (I hope you can help me with this), I'll process the information with components such as tMap, tAggregateRow, tPivotToColumnsDelimited, tFilterRow, tHashInput and tHashOutput, sending the entire results to a tFileOutputExcel & tFileOutputDelimited.

FullJob.JPG

My advanced settings are:

 -Xms256M

 -Xmx2048m

 -XX:-UseGCOverheadLimit

 

Any suggestions? If anyone would like me to send the file I'm trying to read (to check whether the problem is my computer), just tell me (it's a 153 MB .xlsx file).

 

Thanks!


Accepted Solutions
Eleven Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Did you try Event mode in tFileInputExcel?

Also, please run a load test on the job server and see whether it still gives the error.

Regards
Abhishek KUMAR
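
For background on why Event mode can help: a streaming (event-based) reader handles one row at a time and never holds the whole sheet in memory, whereas a tree-based reader loads everything first. The sketch below only illustrates that principle with the JDK's SAX parser on a tiny hand-made XML string shaped like a worksheet. The class name `StreamingRowCount` and the sample XML are made up for illustration, and the assumption that Talend's Event mode works in a similar way internally is not confirmed anywhere in this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingRowCount {

    /** Counts <row> elements while streaming; no document tree is built. */
    static int countRows(String sheetXml) {
        final int[] rows = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("row".equals(qName)) {
                    rows[0]++; // a real reader would process the row here, then discard it
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
        return rows[0];
    }

    public static void main(String[] args) {
        // Hypothetical two-row sheet: a header row and one data row
        String xml = "<sheetData>"
                   + "<row r=\"1\"><c><v>EventTime</v></c></row>"
                   + "<row r=\"2\"><c><v>43022.0709722</v></c></row>"
                   + "</sheetData>";
        System.out.println(countRows(xml)); // prints 2
    }
}
```

Memory use here depends on the size of one row, not on the number of rows, which is the property that matters for a 1-million-row file.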

All Replies
Fifteen Stars TRF

Re: Error: java.lang.OutOfMemoryError: Java heap space

What if you try with -Xmx4096m?

TRF
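
Before raising -Xmx further, it may be worth confirming that the job's JVM actually picked up the value. A minimal sketch, assuming nothing beyond the standard `Runtime` API; `HeapCheck` is a hypothetical class name, and the same two calls could probably be pasted into a tJava component at the start of the job:

```java
public class HeapCheck {

    /** Maximum heap the JVM will try to use, in MiB (roughly the -Xmx value). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Heap currently in use, in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB):  " + maxHeapMiB());
        System.out.println("Used heap (MiB): " + usedHeapMiB());
    }
}
```

The reported maximum can be slightly lower than the -Xmx figure, so treat it as a sanity check rather than an exact readout.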
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi!

I get the same error (after 24 minutes). Any other suggestion? Thanks!

ErrorTalend2.JPG

Fifteen Stars TRF

Re: Error: java.lang.OutOfMemoryError: Java heap space

You can reduce the required memory space by replacing tHash components by files.

You can also store temporary data required by tMap components on disk.

To do this, click the third icon in the upper-left corner of the tMap, then set the "Temp data directory path" and the buffer size.

 


TRF
Fourteen Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

And never connect a tLogRow to a 1M-row source. It kills your Studio!

-----------
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi @TRF 

The data stored in the tHash components isn't large. Also, because it is produced by the same job and then reused as input, Talend apparently forces me to create metadata in order to reprocess it with a tMap (and this could be a problem when I run the .bat file on another PC).

On the other hand, I tried to reduce the use of memory with your suggestion but I get the same error:

ErrorTalend_.JPG

My job should look like this, and I'm pretty sure that my problem is reading the Excel file with the source data (because it works just fine with a small amount of data in the Excel file). By the way, @vapukov, I was using the tLogRow just to test the reading part, but you are right, it's not the best idea.

Please tell me that you have another idea.

I don't know if this is crazy, but maybe with Talend I can split the file and then read the parts separately? Or what do you think I should try?

Thank you in advance!

FullJobFinal.JPG

Nine Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi,

 

Change all sets of tAggregateRow into a tSortRow (sort by all group by criteria) and tAggregateSortedRow. Ensure you set the Use disk option on all components where possible (giving a more sensible buffer size of 100,000), including tMaps.
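
The reason sorting first reduces memory is that, once rows arrive ordered by the group-by key, only the current group has to be kept in memory; an unsorted aggregate has to hold every group until the last row. A rough sketch of that idea in plain Java (the class `SortedAggregate`, the two-column row layout and the sum aggregate are illustrative assumptions, not Talend's generated code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortedAggregate {

    /**
     * Sums column 1 per key in column 0. Rows MUST already be sorted by key.
     * Only one running total is held at a time; the result list stands in
     * for an output flow and exists only to keep the demo self-contained.
     */
    static List<String> sumByKey(List<String[]> sortedRows) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        long sum = 0;
        for (String[] row : sortedRows) {
            if (currentKey != null && !currentKey.equals(row[0])) {
                out.add(currentKey + "=" + sum); // key changed: previous group is complete
                sum = 0;
            }
            currentKey = row[0];
            sum += Long.parseLong(row[1]);
        }
        if (currentKey != null) {
            out.add(currentKey + "=" + sum); // flush the last group
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"A", "1"},
            new String[] {"A", "2"},
            new String[] {"B", "5"});
        System.out.println(sumByKey(rows)); // prints [A=3, B=5]
    }
}
```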

 

Also consider splitting it into 2 subjobs around the tFileOutputDelimited_3 (make the lookup a tFileInputDelimited of what tFileOutputDelimited_3 has just output).

 

 

Regards David
Don't forget to give Kudos when an answer is helpful or mark the answer as the solution.
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi @david_beaty . Thanks for your answer. I'm sorry to bother you, but before making the changes you suggest, I wanted to tell you that even without those components (tMap or tAggregateRow) the job generates the error. I've even tried to just read the Excel file, filter the columns I need (with tFilterColumns) and then filter the rows I need (with tFilterRow) to save this data to a new Excel file (for example), and the error also appeared. With this context, do you still think that I should replace the tAggregateRow with the components you mention? Thank you!

Eleven Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Did you try Event mode in tFileInputExcel?

Also, please run a load test on the job server and see whether it still gives the error.

Regards
Abhishek KUMAR
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi @akumar2301

I am really grateful for your answer because so far it is the only way that the memory error has not appeared and the job has read the data from the source excel. I had not tried that alternative (because I did not know it existed), I did it and the reading part worked.

 

However, a new problem appears: the source file (Excel .xlsx) has a date column in this format: "14-10-2017 01:42:12" (dd-MM-yyyy hh:mm:ss). When I select Event mode in the tFileInputExcel, the job fails with an error saying that this data is not a date, and it forces me to change the column to String (so the date becomes something like "05:15.3").

This column is very important for the calculations I have to do in the job: after some filters, I have to use the date (the time is irrelevant) to calculate statistics such as frequency and repetition. Is there any way to keep that column as a Date type while using Event mode? Thank you!
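
One possible workaround, sketched here under assumptions, is to read the column as a String and convert it downstream (for example in a tJavaRow or tMap expression). The sketch assumes the string really arrives as "14-10-2017 01:42:12", which this thread shows is not guaranteed in Event mode; `EventTimeParser` is a hypothetical helper name, and inside a job Talend's own `TalendDate.parseDate(pattern, string)` routine would likely play the same role.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class EventTimeParser {

    // dd = day, MM = month, yyyy = year, HH = hour of day (0-23), mm = minute, ss = second
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    /** Parses the timestamp text and keeps only the calendar date. */
    static LocalDate toDate(String eventTime) {
        return LocalDateTime.parse(eventTime.trim(), FORMAT).toLocalDate();
    }

    public static void main(String[] args) {
        System.out.println(toDate("14-10-2017 01:42:12")); // prints 2017-10-14
    }
}
```

Dropping the time part at this point matches the stated need: only the date is used for the frequency and repetition statistics.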

Eleven Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hello ,

 

I am not able to replicate this. I see the same output as in Excel. It also reads a date with the pattern "dd-MM-yyyy hh:mm:ss" as a string.

Can you attach one sample input?

Regards
Abhishek KUMAR
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi @akumar2301 

I have attached a very small sample of the input file and how it looks once it passes through column and row filters (which I require in the job). The interesting thing as I mentioned before is the change of the content in the column 'EventTime' that corresponds to a date (select one of the cells in the input file so you can see how it is inside, not only in the preliminary view).

DateInput.JPG

Let me know if you can notice in the output file what I mean about the change when I have to configure the column as a string.

DateOutput.JPG

 

Thanks!

Eleven Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

It is because of the custom formatting of that Excel column. By any chance, could you ask for the custom format in the INPUT file to be changed to "dd/MM/yyyy hh:mm:ss"?

Regards
Abhishek KUMAR
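
Some background on why the cell format matters: Excel stores a date-time as a serial number (whole days plus a fraction of a day), and the format only controls how that number is displayed. The sketch below converts such a serial number to a Java date-time. It assumes the workbook uses Excel's default 1900 date system and a serial on or after 1 March 1900; `ExcelSerialDate` is a hypothetical helper, and whether Event mode ever exposes the raw serial to the job is not established in this thread.

```java
import java.time.LocalDateTime;

public class ExcelSerialDate {

    // Effective day 0 of the 1900 date system for serials from 1 March 1900 onward
    private static final LocalDateTime EPOCH = LocalDateTime.of(1899, 12, 30, 0, 0);

    /** Converts an Excel serial number to a date-time, rounded to the nearest second. */
    static LocalDateTime fromSerial(double serial) {
        long days = (long) Math.floor(serial);
        long seconds = Math.round((serial - days) * 86_400); // fraction of a day in seconds
        return EPOCH.plusDays(days).plusSeconds(seconds);
    }

    public static void main(String[] args) {
        // 01:42:12 is 6132 seconds into the day
        double serial = 43022.0 + 6132.0 / 86_400.0;
        System.out.println(fromSerial(serial)); // prints 2017-10-14T01:42:12
    }
}
```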
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi @akumar2301 

I had already tried, but I get this error:

Error_.JPG

Using this scheme (it's important that you know that using the User Mode, I had not had any error with this scheme):

Esquema.JPG

And previously setting this format to that column in the excel file:

FormatExcel.JPG

 

What do you think?

 

Thanks!

Eleven Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

With Event mode, put this date pattern in the schema:
dd/MM/yyyy HH:mm

Excel custom pattern: dd/mm/yyyy hh:mm (as highlighted above).

It works for me. Your date patterns (Excel and Java) should be compatible.
Regards
Abhishek KUMAR
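
A note on letter case, since the two patterns above differ only in that respect: Excel uses "m" for both months and minutes depending on context, but Java patterns are case-sensitive ("MM" is month, "mm" is minute, "HH" is hour 0-23). The sketch below shows what plain `SimpleDateFormat` does with each spelling; `PatternCase` is a made-up class name, and this says nothing about how Talend itself validates the schema pattern.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

public class PatternCase {

    /** Parses text with the given pattern and returns the month (1-12) Java understood. */
    static int monthOf(String pattern, String text) {
        try {
            Date parsed = new SimpleDateFormat(pattern).parse(text);
            Calendar cal = Calendar.getInstance();
            cal.setTime(parsed);
            return cal.get(Calendar.MONTH) + 1; // Calendar months are 0-based
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + text + "'", e);
        }
    }

    public static void main(String[] args) {
        String text = "14/10/2017 13:42";
        // Upper-case MM: "10" is read as the month
        System.out.println(monthOf("dd/MM/yyyy HH:mm", text)); // prints 10
        // Lower-case mm: "10" is read as minutes, so the month silently defaults to January
        System.out.println(monthOf("dd/mm/yyyy HH:mm", text)); // prints 1
    }
}
```

The second call does not fail; it quietly returns the wrong month, which is why a lower-case pattern on the Java side is easy to miss.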
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

@akumar2301 I just did it (date pattern in Talend set to "dd/MM/yyyy HH:mm"; I tried it with and without quotation marks, and with the month and the hour in both uppercase and lowercase) and I'm getting the same error.

 

I know almost nothing about Java, but could it be that the JRE installed in Talend says it is 1.8.0_171 and in Java I have 1.8.0_191? And if it is this, what should I change?

JavaSettings.jpg

Eleven Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

@MayTorres I'm attaching a simple job that reads your sample input Excel file in both String and Date format (after the custom format change).

Regards
Abhishek KUMAR
Six Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Well @akumar2301, this is embarrassing because your help has been amazing, but I tried your job with the file that I'm attaching here (after the custom format change) and I still get the error. I really do not know what I could be doing wrong.

 

Update: I just used the excel file 'Input' that you loaded with the job, and the same error comes out. I think definitely the problem is not the file but some unwanted behavior between Talend and Java when using the Event Mode.

Error_.JPG

Nine Stars

Re: Error: java.lang.OutOfMemoryError: Java heap space

Hi,

 

Open the job in Studio and click the "Code" tab at the bottom of the screen. You should see a red marker on the right-hand scroll bar showing where the problem is.

 

Thanks

Regards David
Don't forget to give Kudos when an answer is helpful or mark the answer as the solution.
