Postgres data reading stream performance & memory usage
Using Big Data Enterprise Edition. Help me with the job scenario below.

PostgresInput ------------------------------------> File

All the ETL transformations are written as SQL queries and called inside the PostgreSQL input component. The SQL output is written to a file in the same job. The job is triggered from the ETL server; the Postgres DB sits on a different server. The following question comes to mind:

a) I ran the job with a data size of around 4 GB. The job fails if I allocate less than 4 GB to the JVM, so Talend seems to keep all the data (4 GB) in memory (RAM). It looks like the pipeline doesn't work in this case, or Postgres streaming doesn't work. If some bytes of data have been read, the job should write them to the file and release the buffer, right? I'm pushing all the load to the DB, which is on a different server; the ETL server is used only for I/O. Why does the job occupy so much RAM on the ETL server? Am I missing anything? Please suggest. Thanks.
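For context on what happens under the hood: by default the PostgreSQL JDBC driver materializes the entire ResultSet in the JVM heap before the first row is handed to the job, which matches the 4 GB symptom. The driver only streams in batches when autocommit is off and a fetch size is set, which is presumably what the component's Cursor setting controls. Below is a minimal plain-JDBC sketch of that streaming pattern; the connection details, table name, and column names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingRead {
    public static void main(String[] args) throws Exception {
        // Connection details are hypothetical, for illustration only.
        String url = "jdbc:postgresql://dbhost:5432/mydb";
        try (Connection conn = DriverManager.getConnection(url, "etl_user", "secret")) {
            // The PostgreSQL JDBC driver only uses a server-side cursor
            // (streaming in batches) when autocommit is OFF and a fetch size
            // is set. Otherwise it pulls the whole result set into the JVM
            // heap, which is what the 4 GB symptom looks like.
            conn.setAutoCommit(false);
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(100_000); // rows fetched per round trip
                try (ResultSet rs = st.executeQuery("SELECT id, payload FROM big_table");
                     BufferedWriter out = new BufferedWriter(new FileWriter("output.csv"))) {
                    while (rs.next()) {
                        // Write each row as soon as it arrives; only one
                        // fetch batch is held in memory at a time.
                        out.write(rs.getLong(1) + ";" + rs.getString(2));
                        out.newLine();
                    }
                }
            }
            conn.commit(); // ends the transaction and releases the cursor
        }
    }
}
```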
Re: Postgres data reading stream performance & memory used
Thanks. I set Cursor to 100000 and found a huge improvement in performance. With 1 GB of JVM, the job completed successfully without any memory issues. In one case, though, while reading 1 million records from Postgres, I found that only 0.2 million were read and the job still completed successfully. What could be the issue? I thought it was because AUTOCOMMIT was set to TRUE, but it was enabled in the code.
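One way to narrow this down outside Talend: a minimal plain-JDBC sketch (connection details and table name are hypothetical) that compares the row count the database reports against the rows a cursor-based read actually delivers. Note that the PostgreSQL JDBC driver silently ignores the fetch size and buffers everything when autocommit is TRUE, and a commit issued while a cursor is open closes it, so anything committing mid-read could end the stream early.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RowCountCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details and table name.
        String url = "jdbc:postgresql://dbhost:5432/mydb";
        try (Connection conn = DriverManager.getConnection(url, "etl_user", "secret")) {
            // How many rows does the database say the query should return?
            long expected;
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT count(*) FROM big_table")) {
                rs.next();
                expected = rs.getLong(1);
            }

            // Stream the same data the way the cursor option does. If
            // autocommit were TRUE here, the driver would silently skip the
            // cursor and buffer the whole result set instead.
            conn.setAutoCommit(false);
            long actual = 0;
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(100_000);
                try (ResultSet rs = st.executeQuery("SELECT id FROM big_table")) {
                    while (rs.next()) {
                        actual++;
                    }
                }
            }
            conn.commit();

            System.out.printf("expected=%d, streamed=%d, match=%b%n",
                    expected, actual, expected == actual);
        }
    }
}
```

If the streamed count comes up short here too, the truncation is happening at the driver/transaction level rather than in the Talend job itself.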