Employee

Limit 30,000 rows: our response

We get this question or feedback a lot: "Why does the tool limit datasets to the top 30,000 rows?" or "30,000 is not enough."
Data Prep Free Desktop loads the entire dataset in memory. 30K is not a hard limit, just a safeguard to stay within acceptable response times on average hardware. Since higher-end hardware can handle more rows, and because 30K may be too few for one file or too many for another, an upcoming upgrade will add a UI control that lets you increase this limit as you see fit.
In the meantime you can proceed by trial and error, changing this arbitrary limit in a config file located here on Windows: \config\application.properties. Just edit the number in your favorite text editor. Sorry, Apple users (including yours truly): the equivalent file on OS X is not as easily editable.
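For anyone who would rather script the edit than open a text editor, here is a minimal Python sketch of changing one entry in a Java-style .properties file. The key name `example.row.limit` is a placeholder I made up, since the post does not name the actual property, so look in your own application.properties for the real one. The helper names are hypothetical too.

```python
import re
from pathlib import Path


def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`, appending the key if absent."""
    # Match "key=value" or "key: value" on a single line; commented-out lines are ignored.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*[=:][ \t]*).*$", re.MULTILINE
    )
    if pattern.search(text):
        return pattern.sub(lambda m: m.group(1) + value, text, count=1)
    sep = "" if text.endswith("\n") or not text else "\n"
    return f"{text}{sep}{key}={value}\n"


def update_file(path: Path, key: str, value: str) -> None:
    """Rewrite the file in place, keeping a .bak copy of the original next to it."""
    original = path.read_text(encoding="utf-8")
    path.with_suffix(path.suffix + ".bak").write_text(original, encoding="utf-8")
    path.write_text(set_property(original, key, value), encoding="utf-8")


# Hypothetical usage; the key below is a placeholder, not a confirmed setting name:
# update_file(Path("application.properties"), "example.row.limit", "100000")
```

The backup copy makes it easy to roll back if the application misbehaves with a larger limit. Later replies in this thread suggest the new value may only be picked up after a restart.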
The commercial add-on due in June will feature more sophisticated techniques and scale with large files.

One Star

Re: Limit 30,000 rows: our response

You should consider memory-mapped or page-file-backed files, where memory is handled by the operating system, leveraging a well-tuned subsystem built for such cases. This is possible in Java with NIO.
That said, 10,000 is really too low.
I like the tool, but even with the 10,000 limit it is kind of slow. I hope for improvement in the next versions. Thanks.
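To illustrate the memory-mapping idea, here is a small sketch using Python's standard `mmap` module, which plays the same role as a mapped buffer in Java NIO. It counts the rows of a file without reading the whole thing into the process heap, because the operating system pages data in on demand. This only demonstrates the technique and is not how Data Prep works internally; the function name `count_rows` is made up for the example.

```python
import mmap
import os


def count_rows(path: str) -> int:
    """Count newline-terminated rows by scanning a memory-mapped view of the file."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0  # an empty file cannot be memory-mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(b"\n")
            while pos != -1:
                count += 1
                pos = mm.find(b"\n", pos + 1)
            return count
```

Only the pages actually touched by the scan are resident at any moment, so the file can be much larger than available RAM.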
Employee

Re: Limit 30,000 rows: our response

For the commercial version due in June we have invested in more sophisticated techniques, and you will not be limited to the available memory. Our direction is in fact far beyond paging, memory-mapped files or NIO: it is unlimited scalability with Spark and Hadoop (we already know this stuff; see here).
As far as the Free Desktop version is concerned though, what would be a reasonable limit in your opinion? Other forum participants feel free to cast your vote ;-)
One Star

Re: Limit 30,000 rows: our response

My organization just purchased licenses for Talend Data Integration a few months ago. I was very excited when the Talend Data Preparation module became available. I do understand that it is free, so beggars can't be choosers, but 10,000 rows is too low. We plan to use this module when it is no longer free, and I'd like to use it now for building "recipes", but can't because of the 10,000 row limit. It's causing me to use other tools and workarounds until the row limit is increased.
I am a Mac user and would be interested if there is a way to increase it.  50,000 or 100,000 rows would be helpful even if the performance was slowed.
Employee

Re: Limit 30,000 rows: our response

In the next upgrade(s) you will have full control right from within the UI to increase the 10K limit -- at your own risk.
I am not even sure the config trick is possible on Mac, because the JVM on OS X obliges us to wrap the config file in a JAR (perhaps to make it harder to inject a file onto the class path?). You could unjar it, make the change, and rejar it. But the app is signed for Gatekeeper, so if you tamper with the application, it might complain and refuse to run the app as a security measure.
In the screen capture below you can see where the config file is within the .app bundle.
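For the curious, the unjar, change, rejar steps can be sketched in Python, since a JAR is a ZIP archive. This is only an illustration under assumptions: the function name and all paths are hypothetical, and as noted above, modifying a signed .app bundle may cause Gatekeeper to refuse to launch it. Work on a copy.

```python
import zipfile


def rewrite_jar_entry(src_jar: str, dst_jar: str, entry: str, new_bytes: bytes) -> bool:
    """Copy src_jar to dst_jar, replacing the content of `entry` with new_bytes.

    Returns True if the entry was found and replaced.
    """
    found = False
    with zipfile.ZipFile(src_jar, "r") as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entry:
                data = new_bytes  # swap in the edited config file
                found = True
            dst.writestr(info, data)
    return found
```

The archive is written to a new file rather than modified in place, so the original JAR stays untouched if anything goes wrong.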
One Star

Re: Limit 30,000 rows: our response

When will the next upgrade be available?
Also, in the current version, how do you increase the row limit for a Windows machine?
Thanks in advance
Employee

Re: Limit 30,000 rows: our response

Hi
We will release quarterly. We are looking at an update in a number of weeks.
Did you try to edit the following file on Windows?
The file is located here on Windows: \config\application.properties.
Just edit the number in your favorite text editor.
One Star

Re: Limit 30,000 rows: our response

I did test it, and it was giving mixed results in 1.0.
But 1.0.1 is much faster and it works OK even with 100,000 lines.
One Star

Re: Limit 30,000 rows: our response

In the current version 1.1.0, I'm only able to download 3000 records rather than 100 000. Has this setting been changed or can I change a setting on my computer?
One Star

Re: Limit 30,000 rows: our response

I have version 1.1 and have edited the file mentioned above, and I am still limited to 30,000 records. How can I remove this limit? If this is the limit, I will need to do my profiling directly in Excel.
One Star

Re: Limit 30,000 rows: our response

Hi,
I'm using Talend-DataPreparation-Free-Desktop-windows-1.3.0 and the 30,000 limit still remains despite modifying the application.properties file.
Any idea how to fix it?
Thanks for your help.
Four Stars

Re: Limit 30,000 rows: our response

Downloaded Data Prep and installed. Install went well. First Excel file was 77,000 rows. Loaded 30,000. Found this message and changed the limit to 300,000, then to 3,000,000. Data Prep still shows 30,000/30,000 in the top corner. Using Data Prep version 1.3.0. Tried closing all browsers, deleting the loaded files and reloading, and deleting the preparation. At a loss as to what to try next.

Four Stars

Re: Limit 30,000 rows: our response

Restarted the computer and the issue was resolved. It appears there is a background process that reads the config file once per computer start.