I have a need to run a query on Oracle that results more than 500k records and write to a text file. I have just used two components as below
toracleinput -> tfileoutputdelimited
The current write speed is ~250 rows/second. How can I increase the speed of data write? I have enterprise subscription and "enable parallel execution" on tfileoutputdelimited is greyed out as it seems to be available only for map/reduce jobs.
Solved! Go to Solution.
Firstly, maybe you can tick the "use a cursor" function in the advanced setting in you tOracleInput component. And try different number.
The problem is with your query or your database component config. I've just tested writing a million rows of 600 chars to a flat file and got a performance of 49465 rows per second written.....on a reasonably middle of the road machine. Remove your fileoutput component and add a tJavaFlex (with nothing configured.....just to see the rows per second metrics). How long does that take? Now tweak your query component and your query.
The SQL might run fine, but the Talend database component may not be configured adequately. As I have said, you can write thousands of rows per second to a flat file. That will not be the bottleneck. Have you tested removing the fileoutput component and just connecting to a tJavaFlex to compare the performance?
Can you show us screenshots of your job and your db component config (including the Advanced tab)?
Introduction to Talend Open Studio for Data Integration.
Practical steps to developing your data integration strategy.
Create systems and workflow to manage clean data ingestion and data transformation.