Limit 30,000 rows: our response

Employee

Limit 30,000 rows: our response

We get this question/feedback a lot: why does the tool limit to the top 30,000 rows, or: 30,000 is not enough.
Data Prep Free Desktop loads the entire dataset in memory. 30K is not a hard limit, just a safeguard to stay within acceptable response times on average hardware. Because higher-end hardware can handle more rows, and because 30K may be too few for one file and too many for another, an upcoming upgrade will add a UI control to let you raise this limit as you see fit.
In the meantime you can proceed by trial and error, changing this arbitrary limit in a config file located here on Windows: \config\application.properties. Just edit the number in your favorite text editor. Sorry, Apple users (including yours truly): the equivalent file on OS X is not as easily editable.
The commercial add-on due in June will feature more sophisticated techniques and scale with large files.

One Star

Re: Limit 30,000 rows: our response

You should consider memory-mapped files or page-file-backed storage, where memory is handled by the operating system, leveraging a well-tuned subsystem built for such cases. This is possible in Java with NIO.
That said, 10000 is really too low.
I like the tool, but even with the 10000 limit it is kind of slow. I hope for improvement in the next versions. Thanks
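
As an illustration of the memory-mapped approach suggested in the post above, here is a minimal sketch using Java NIO. It is not how Data Prep works internally; the class name MappedLineCounter and the 64 MB window size are assumptions chosen for the example. It counts the lines of a file by mapping it window by window, so the operating system pages the data in and out instead of the JVM heap holding the whole file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
final class MappedLineCounter {

    // Size of each mapped window (an arbitrary choice for this sketch).
    private static final long WINDOW_BYTES = 64L * 1024 * 1024;

    // Counts '\n' bytes by mapping the file one read-only window at a time.
    static long countLines(Path file) throws IOException {
        long lines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            for (long pos = 0; pos < size; pos += WINDOW_BYTES) {
                long len = Math.min(WINDOW_BYTES, size - pos);
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (window.hasRemaining()) {
                    if (window.get() == '\n') {
                        lines++;
                    }
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(Paths.get(args[0])));
    }
}
```

Only one window is mapped at a time, so heap usage stays flat regardless of file size; the trade-off is that random access across the whole dataset would need more bookkeeping than this sketch shows.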
Employee

Re: Limit 30,000 rows: our response

For the commercial version due in June we have invested in more sophisticated techniques, and you will not be limited by the available memory. Our direction is in fact far beyond paging, memory-mapped files, or NIO: it is unlimited scalability with Spark and Hadoop (we already know this stuff; see here).
As far as the Free Desktop version is concerned though, what would be a reasonable limit in your opinion? Other forum participants feel free to cast your vote ;-)
One Star

Re: Limit 30,000 rows: our response

My organization just purchased licenses for Talend Data Integration a few months ago. I was very excited when the Talend Data Preparation module became available. I do understand that it is free, so beggars can't be choosers, but 10,000 rows is too low. We plan to use this module when it is no longer free, and I'd like to use it now for building "recipes", but can't because of the 10,000 row limit. It's causing me to use other tools and workarounds until the row limit is increased.
I am a Mac user and would be interested if there is a way to increase it.  50,000 or 100,000 rows would be helpful even if the performance was slowed.
Employee

Re: Limit 30,000 rows: our response

In the next upgrade(s) you will have full control right from within the UI to increase the 10K limit -- at your own risk.
I am not even sure the config trick is possible on Mac, because the JVM on OS X obliges us to wrap the config file into a Jar (perhaps to avoid too easy injection of a file on the class path?). You could unjar it, make the change and rejar. But the app is signed for Gatekeeper, so if you tamper with the application, it might complain and keep you from running the app as a security measure.
In the screen capture below you can see where the config file is within the .app bundle.
One Star

Re: Limit 30,000 rows: our response

When will the next upgrade be available?
Also, in the current version, how do you increase the row limit for a Windows machine?
Thanks in advance
Employee

Re: Limit 30,000 rows: our response

Hi
We will release quarterly. We are looking at an update in a number of weeks.
Did you try to edit the following file on Windows?
File located here on Windows: \config\application.properties. 
Just edit the number in your favorite text editor.
One Star

Re: Limit 30,000 rows: our response

I did test it, and it gave mixed results in 1.0.
But 1.0.1 is much faster and it works OK even with 100,000 lines.
One Star

Re: Limit 30,000 rows: our response

In the current version 1.1.0, I'm only able to download 3000 records rather than 100 000. Has this setting been changed or can I change a setting on my computer?
One Star

Re: Limit 30,000 rows: our response

I have version 1.1 and have edited the file mentioned above, and I am still limited to 30,000 records. How can I remove this limit? If this is the limit, I will need to do my profiling directly in Excel.
One Star

Re: Limit 30,000 rows: our response

Hi,
I'm using Talend-DataPreparation-Free-Desktop-windows-1.3.0 and the 30000 limit still remains despite modifying the application.properties file.
Any idea of how to fix it?
Thanks for your help.
Four Stars

Re: Limit 30,000 rows: our response

Downloaded Data Prep and installed. Install went well. First Excel file was 77,000 rows. Loaded 30,000. Found this message and set the limit to 300,000, then to 3,000,000. Data Prep still shows 30,000/30,000 in the top corner. Using Data Prep version 1.3.0. Tried closing all browsers, deleting the loaded files and reloading, and deleting the preparation. At a loss as to what to try next.

Four Stars

Re: Limit 30,000 rows: our response

Restarted the computer and the issue was resolved. It appears there is a background process that reads the config file once per computer start.

Four Stars

Re: Limit 30,000 rows: our response

Hi Kyle, I am also stuck with the same issue. We have configured a job in Talend Studio that passes a CSV file holding more than 70000 records from a tFileInputDelimited component to a tDataPrep component. But it goes into an infinite loop, even with a blank recipe, and nothing comes out as output. Is there any workaround to divide the CSV file into multiple files?
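
One possible way to split a CSV file outside of Talend, as asked above, is sketched below in plain Java. This is not a Talend component or an official workaround; the class name CsvSplitter, the part-N.csv naming, and the UTF-8 encoding are assumptions. It also assumes one record per line, so it would break quoted fields that contain line breaks.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class CsvSplitter {

    // Writes part-1.csv, part-2.csv, ... into outDir, repeating the header
    // line in every part, with at most rowsPerFile data rows per part.
    static List<Path> split(Path input, Path outDir, int rowsPerFile) throws IOException {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        List<Path> parts = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String header = in.readLine();
            if (header == null) {
                return parts; // empty input: nothing to split
            }
            BufferedWriter out = null;
            int rowsInPart = 0;
            try {
                String row;
                while ((row = in.readLine()) != null) {
                    if (out == null || rowsInPart == rowsPerFile) {
                        if (out != null) {
                            out.close();
                        }
                        Path part = outDir.resolve("part-" + (parts.size() + 1) + ".csv");
                        out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        out.write(header);
                        out.newLine();
                        parts.add(part);
                        rowsInPart = 0;
                    }
                    out.write(row);
                    out.newLine();
                    rowsInPart++;
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Usage: <input.csv> <outputDir> <rowsPerFile>
        System.out.println(split(Paths.get(args[0]), Paths.get(args[1]),
                Integer.parseInt(args[2])));
    }
}
```

With a rowsPerFile value at or below the configured limit, each part would stay under the limit on its own; whether that resolves the tDataPrep loop described above is not something this sketch can confirm.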

Four Stars

Re: Limit 30,000 rows: our response

What is the cost of the commercial version of the Data Preparation and Data Quality tools? How do I get the commercial version?

If you have a matrix of pricing and features for all your tools, it would help.

Moderator

Re: Limit 30,000 rows: our response

Hello anilalm,

Could you please send an email to sales@talend.com with your requirements? Our colleagues from the sales team will assist you with product pricing.

Feel free to let us know if it is Ok with you.

Best regards

Sabrina

Four Stars

Re: Limit 30,000 rows: our response

Hi,

 

I'm trying to use data preparation free version 2.1.1.

I'd like to unlock the upper limit of the dataset records by modifying application.properties.
But the upper limit did not change, even after rebooting Windows 7.
If you have any other way to change it, please let me know.

 

Best regards

Akira

Community Manager

Re: Limit 30,000 rows: our response

Hi Akira,

 

This setting is the only way to change the limit. It indeed needs a Data Prep restart to be taken into account, but it also applies only to the new datasets added after the limit has been changed. It is not retroactive.

 

Regards,

 

Gwendal

Four Stars

Re: Limit 30,000 rows: our response

Hello Gwendal,

 

Thank you for your reply.

I understand that the app needs to be restarted after modifying the settings file.

I already did that, and I also restarted the PC in case it runs as a background task.

Of course, I uploaded the dataset with around 50K records after reboot.

But nothing changed...

 

 

Community Manager

Re: Limit 30,000 rows: our response

Hi Akira,

 

That is odd - nobody has faced this issue before.

 

Can you confirm that:

  • You're using Talend Free Desktop 2.1.1.
  • You updated the file config\application.properties of that version (and not the 1.3 version or older which might still be installed).
  • You've input the size without any separator - i.e. dataset.records.limit=50000.

 

And just in case you installed in the default folder: you need administrator rights to update the configuration file. Can you re-open the file and check that you do see 50000 and not 30000?

 

Apologies if the questions seem basic/dumb, but I'd rather rule out any obvious cause before looking at something more fishy.

 

Thank you,

 

Gwendal
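
The checks listed above (right file, value saved, no separator in the number) can also be done programmatically. The following is a minimal sketch, not a Talend utility: the class name RecordsLimitCheck and the convention of returning -1 for a missing or malformed value are assumptions; only the key dataset.records.limit comes from the post above. It uses the standard java.util.Properties parser.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class RecordsLimitCheck {

    // Key name as quoted in the thread.
    static final String KEY = "dataset.records.limit";

    // Returns the configured limit, or -1 if the key is missing or the value
    // is not a plain integer (for example "50,000" or "50 000").
    static int readLimit(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        String raw = props.getProperty(KEY);
        if (raw == null || !raw.trim().matches("[0-9]+")) {
            return -1;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException tooLarge) {
            return -1; // digits only, but beyond the int range
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass the full path to application.properties as the argument.
        try (Reader reader = Files.newBufferedReader(
                Paths.get(args[0]), StandardCharsets.ISO_8859_1)) {
            int limit = readLimit(reader);
            System.out.println(limit < 0
                    ? KEY + " is missing or not a plain number"
                    : KEY + " = " + limit);
        }
    }
}
```

Running it against the file in the installation folder you actually launch shows whether the edit was saved there; it does not tell you whether Data Prep has re-read the file, which still requires a restart and a newly added dataset.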

One Star

Re: Limit 30,000 rows: our response

Where exactly is this file located? I can't seem to find it.

One Star

Re: Limit 30,000 rows: our response

Where exactly is the file located? File name and path
Community Manager

Re: Limit 30,000 rows: our response

As mentioned in the first post, the file name is application.properties. It is located in the config folder in the Data Preparation installation folder. If you used the default installation path, then the path is C:\Program Files (x86)\Talend\Talend Data Preparation Free Desktop 2.1\Talend-DataPreparation-Free-Desktop-windows-2.1.1\config\application.properties

 

Regards,

 

Gwendal