Five Stars

Writing 260 columns to Excel

Hello,

 

I'm extracting 260 column data from SQL Server database and trying to write to an existing Excel file with headers.

 

I am using tFileOutputExcel component.

 

My Run Parameters are like this:

-xms256M

-xmx2048M

 

and in Windows -> Preferences ->Talend -> Run/Debug, I gave:

-XX: useGCOverheadlimit.

 

What else can I do to make this work? I have to write to an existing file since other tabs are filled with data from other processes.

 

Thanks,
Bee

 

The job fails with the error message:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.sun.org.apache.xerces.internal.dom.AttributeMap.setNamedItem(AttributeMap.java:121)
at com.sun.org.apache.xerces.internal.dom.DeferredElementNSImpl.synchronizeData(DeferredElementNSImpl.java:138)
at com.sun.org.apache.xerces.internal.dom.ElementNSImpl.getNamespaceURI(ElementNSImpl.java:250)
at org.apache.xmlbeans.impl.store.Locale.loadNode(Locale.java:1420)
at org.apache.xmlbeans.impl.store.Locale.loadNodeChildren(Locale.java:1403)
at org.apache.xmlbeans.impl.store.Locale.loadNode(Locale.java:1445)
at org.apache.xmlbeans.impl.store.Locale.loadNodeChildren(Locale.java:1403)
at org.apache.xmlbeans.impl.store.Locale.loadNode(Locale.java:1445)
at org.apache.xmlbeans.impl.store.Locale.loadNodeChildren(Locale.java:1403)
at org.apache.xmlbeans.impl.store.Locale.loadNode(Locale.java:1445)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1385)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1370)
[statistics] disconnected
[statistics] disconnected
[statistics] disconnected
[statistics] disconnected
[statistics] disconnected
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:370)

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 

Tags (1)
7 REPLIES
Employee

Re: Writing 260 columns to Excel

What part of the job is taking the most time and loading most data in memory?  Post a screenshot of you job

Five Stars

Re: Writing 260 columns to Excel

The job is very simple. Has a tMSSQLInput component which does a select <columns> from table and outputs to Excel.

 

 

Five Stars

Re: Writing 260 columns to Excel

Does it run out of memory if you overwrite the spreadsheet file instead of appending to an existing one?

Five Stars

Re: Writing 260 columns to Excel

Basically it's a memory issue, you need to increase your run time parameters to their highest extent that your computer's resources allow you to do so!
Five Stars

Re: Writing 260 columns to Excel

Pretty certain this will be down to the Excel output component loading the entire existing spreadsheet into memory in order to append the new rows.

 

The database input component will only be operating on relatively small batches of records (probably 1,000) at a time, so that wont be the problem, but whilst the basic file output components can just write their output directly to the file on disk, again requiring very little memory, Excel files, especially multi sheet ones, are much more structured than this, and lines can't just be appended to the end of the file, they need to be added within the body of the file, and that will be done in memory.

 

At a basic level, in the short term, increasing the memory may well help, but if this Excel file is going to keep getting bigger, then the problem is just going to rear its head again soon, and you'll then be out of options.

 

Excel files aren't particularly well suited to massive volumes of data, and if you're exhausting the 2Gb+ of memory your Java config is allowing, then this suggests a lot of data and/or a big spreadsheet.

 

If I were you, I'd be looking for an alternative to embedding the data in this spreadsheet. I'm sure if you were to provide a little more background to your requirements, the community members will be able to offer suggestions.

Five Stars

Re: Writing 260 columns to Excel

Thank you. The requirement is pretty straightforward and simple, except it is not.

 

I am executing a stored procedure which generates a very wide (260+ columns) dataset. This needs to be written to a pre-formatted excel sheet (containing 3 other sheets) template file which also contains the header.

 

A direct tMSSQL to tFileOutputExcel fails with memory issues. So trying to write to a csv hoping to push the data to excel.

But looks like there is no luck at all.

 

Any idea how I can do this?  Number of columns is 260. Number of rows is #2000.

Five Stars

Re: Writing 260 columns to Excel

If you're only outputting around 2,000 rows, then I'm very surprised that you're running out of memory.

 

How big (both in terms of overall file size, and number of rows/columns per sheet etc.) is the template spreadsheet which you're adding the data to?

 

Also, is it an XLSX file, as opposed to an old XLS format one?