'Out of Memory' error when reading large Excel files in Talend Studio

Problem Description

When a tFileInputExcel component reads a large Excel file in .xls or .xlsx format, the Job fails with an OOM (Out Of Memory) error.

XLSFailure.PNG

 

Root Cause

This error usually occurs because reading a large Excel file consumes more memory than is available to the Job's JVM.

 

Solution

 

Handling .XLS files

When the input Excel files are in .XLS format, measure the memory the Job consumes when it runs in Studio. Based on that measurement, set the JVM parameters for the Job in Studio and, at Job level, in Talend Administration Center (TAC).

 

In this example, the input Excel file has around 1 million records. To execute the Job and process all records successfully, set the -Xmx parameter to 8 GB (for example, -Xmx8G). Monitor the memory consumption in Task Manager to confirm that 8 GB meets your requirements.

JVM.PNG
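
To confirm that a changed -Xmx value has actually reached the JVM that runs the Job, the heap limit can be printed from Java code. The following is a minimal sketch, not part of the original Job: the class name HeapCheck and the helper toMiB are hypothetical, and it relies only on the standard java.lang.Runtime API. The reported maximum is typically slightly below the configured -Xmx value, depending on the garbage collector.

```java
public class HeapCheck {

    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // maxMemory() reflects the -Xmx limit, or the JVM default if -Xmx is not set.
        System.out.println("Max heap (MiB):  " + toMiB(rt.maxMemory()));

        // Heap currently in use = heap allocated so far minus the free part of it.
        System.out.println("Used heap (MiB): " + toMiB(rt.totalMemory() - rt.freeMemory()));
    }
}
```

Running it with, for example, java -Xmx8G HeapCheck should report a maximum heap close to 8192 MiB. The two println statements could also be placed in custom Java code inside the Job itself, assuming your Job design allows it.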

 

Handling .XLSX files

When the input Excel files are in .XLSX format, configure the tFileInputExcel component as follows:

  1. In the Basic settings view, enable the Read excel2007 File Format(xlsx) check box.

  2. Select the Advanced settings tab.

  3. In the Generation mode drop-down menu, select Less memory consumed for large Excel(Event mode).

    Prop1XLSX.PNG

    generationmode.PNG
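
To illustrate why an event-based read needs less memory, the sketch below streams through one worksheet of an .xlsx file and counts its rows without building an in-memory model of the workbook. It is an illustration of the general technique only, not the code that tFileInputExcel generates. It uses only the JDK (an .xlsx file is a ZIP archive of XML parts), the class name XlsxRowCounter is hypothetical, and the entry path xl/worksheets/sheet1.xml is an assumption that holds for typical files written by Excel but is not guaranteed.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class XlsxRowCounter {

    /** Counts <row> elements in one worksheet part by streaming over its XML. */
    public static long countRows(String xlsxPath, String sheetEntry) throws Exception {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(sheetEntry);
            if (entry == null) {
                throw new IllegalArgumentException("No such entry: " + sheetEntry);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                XMLStreamReader reader =
                        XMLInputFactory.newInstance().createXMLStreamReader(in);
                long rows = 0;
                // Each parser event is handled and then discarded, so memory use
                // stays roughly constant regardless of the number of rows.
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "row".equals(reader.getLocalName())) {
                        rows++;
                    }
                }
                reader.close();
                return rows;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed path of the first sheet; real files may name the part differently.
        System.out.println(countRows(args[0], "xl/worksheets/sheet1.xml"));
    }
}
```

The trade-off of this style of reading is that rows are only visited once and in order, which suits a Job that passes records straight to the next component.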

Version history
Revision #: 4 of 4
Last update: 12-13-2019 07:13 AM