I have a need to run a query on Oracle that results more than 500k records and write to a text file. I have just used two components as below
toracleinput -> tfileoutputdelimited
The current write speed is ~250 rows/second. How can I increase the speed of data write? I have enterprise subscription and "enable parallel execution" on tfileoutputdelimited is greyed out as it seems to be available only for map/reduce jobs.
Solved! Go to Solution.
Firstly, maybe you can tick the "use a cursor" function in the advanced setting in you tOracleInput component. And try different number.
The problem is with your query or your database component config. I've just tested writing a million rows of 600 chars to a flat file and got a performance of 49465 rows per second written.....on a reasonably middle of the road machine. Remove your fileoutput component and add a tJavaFlex (with nothing configured.....just to see the rows per second metrics). How long does that take? Now tweak your query component and your query.
The SQL might run fine, but the Talend database component may not be configured adequately. As I have said, you can write thousands of rows per second to a flat file. That will not be the bottleneck. Have you tested removing the fileoutput component and just connecting to a tJavaFlex to compare the performance?
Can you show us screenshots of your job and your db component config (including the Advanced tab)?
Talend named a Leader.
Kickstart your first data integration and ETL projects.
Watch the recorded webinar!
Part 2 of a series on Context Variables
Learn how to do cool things with Context Variables
Find out how to migrate from one database to another using the Dynamic schema