I am trying to delete about 130,000 tasks from a resolution campaign on a daily basis and reload the same number of tasks the next day. I understand that this is not the ideal way to use stewardship, but that is the client's requirement. Whenever I try to delete the 130K tasks using the tDataStewardshipTaskDelete component, it throws a Java heap space error. I am trying to write the deleted records to a CSV file on the JobServer. I tried changing the JVM settings on the job to allow up to 2 GB of heap, but it still fails.
Is there a better way to deal with this? Is there any way to delete records in batches so that we don't run into the heap space issue?
You can extract the task list into a file as a first step. Then read the file iteratively so that the entire data set is not pushed to the tDataStewardshipTaskDelete component in one shot.
Please fill in the query section and other relevant sections of the tDataStewardshipTaskDelete component, using the data from the input file to filter the tasks properly.
The response time might be longer, since we are deleting data at row level rather than in bulk.
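To illustrate the iterative read, here is a minimal sketch in plain Python, outside of Talend. It is not the component's configuration; it only shows the idea of walking the exported file in fixed-size chunks so that only one chunk is held in memory at a time. The function name `iter_batches`, the file name, and the batch size of 1000 are assumptions made for illustration.

```python
import csv
from itertools import islice
from typing import Dict, Iterator, List


def iter_batches(csv_path: str, batch_size: int = 1000) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most batch_size rows from the exported task file.

    Assumes the file has a header row; each row becomes a dict keyed by header.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        while True:
            # islice pulls only the next batch_size rows from the reader,
            # so the whole file is never loaded at once.
            batch = list(islice(reader, batch_size))
            if not batch:
                return
            yield batch


# Hypothetical usage: build one delete filter per batch instead of per file.
# for batch in iter_batches("tasks_to_delete.csv", batch_size=1000):
#     ids = [row["task_id"] for row in batch]  # "task_id" is an assumed column name
#     ...  # hand the ids of this batch to the delete step
```

Working on a chunk of rows per iteration, rather than one row, is a middle ground between a single 130K delete and a strict row-by-row delete.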
Another idea is to push the data into multiple files as batches and read one batch file at a time.
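The multiple-file variant could be sketched like this, again in plain Python rather than as Talend job configuration. The function name `split_csv`, the `tasks_batch_NNNN.csv` naming pattern, and the 10,000-row default are assumptions; the sketch also assumes the exported file starts with a header row, which is copied into every batch file.

```python
import csv
from pathlib import Path
from typing import List


def split_csv(source: str, out_dir: str, rows_per_file: int = 10000) -> List[Path]:
    """Split source into files of at most rows_per_file data rows each."""
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with open(source, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return written  # empty input: nothing to split
        target = None
        writer = None
        try:
            for index, row in enumerate(reader):
                if index % rows_per_file == 0:
                    # Start a new batch file and repeat the header in it.
                    if target is not None:
                        target.close()
                    path = out / f"tasks_batch_{len(written) + 1:04d}.csv"
                    target = open(path, "w", newline="", encoding="utf-8")
                    writer = csv.writer(target)
                    writer.writerow(header)
                    written.append(path)
                writer.writerow(row)
        finally:
            if target is not None:
                target.close()
    return written
```

With exactly 130,000 tasks and 10,000 rows per file, this produces 13 batch files, which the delete job can then process one at a time.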