Migrate Data from other databases like Oracle to Greenplum

Hi,
I am working on migrating data from other databases, such as Oracle, to a Greenplum (PostgreSQL-based) database in a Talend Job.
My plan is to use tOracleInput as the input, followed by a tMap, with tGreenplumOutput as the output.
Any suggestions or tips?
Thanks,
A
Re: Migrate Data from other databases like Oracle to Greenplum

Hi
Yes, you are on the right track. Use a tOracleInput to query the records from Oracle, transform them in a tMap if needed, and then load them into the target Greenplum database with tGreenplumOutput.
Best regards
Shong
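
For readers who think in code, the flow described above (read from Oracle, transform, write to Greenplum) can be sketched in plain Java. This is only an illustration and not what Talend generates: MigrationSketch, migrate, and the batch size are hypothetical names, and in-memory lists stand in for the two databases. Handing rows to the sink in batches rather than one at a time is an assumption of the sketch, chosen because per-row round trips are a common cause of slow loads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: source -> mapper -> batched sink.
// The three roles loosely correspond to tOracleInput, tMap and tGreenplumOutput.
class MigrationSketch {

    /**
     * Reads every source row, applies the mapping, and passes the mapped rows
     * to the sink in lists of at most batchSize rows.
     * Returns the total number of rows handed to the sink.
     */
    static <S, T> int migrate(Iterable<S> source,
                              Function<S, T> mapper,
                              Consumer<List<T>> sink,
                              int batchSize) {
        List<T> batch = new ArrayList<>(batchSize);
        int written = 0;
        for (S row : source) {
            batch.add(mapper.apply(row));
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // copy so the sink owns its list
                written += batch.size();
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {                      // flush the final partial batch
            sink.accept(new ArrayList<>(batch));
            written += batch.size();
        }
        return written;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the Oracle result set (assumption).
        List<String> source = Arrays.asList(" a ", "b", " c", "d ", "e");
        // The sink only prints each batch; a real one would write and commit it.
        int total = migrate(source, String::trim, b -> System.out.println(b), 2);
        System.out.println(total + " rows");
    }
}
```

The mapper is where any per-row conversion would go, and the sink is the only part that would talk to the target database.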
Re: Migrate Data from other databases like Oracle to Greenplum

I have a similar job, and its throughput is very poor (10 rows per second):
tOracleInput -> tConvertType -> tMap -> tGreenplumOutput
I enabled parallel execution for the job as follows:
tOracleInput ... Parallelization: partition rows to 6 child threads
tConvertType: no change
tMap: no change (by default it was departitioning)
tGreenplumOutput: enabled parallel execution in the component's Advanced Settings
With this parallel setup I get 3000 rows/sec. The Oracle source table has 300,000 rows. The job finishes processing in a minute but then hangs for more than a couple of hours.
If I reduce the source table to only 1000 records, the job hangs for 1 minute after execution and then finishes successfully.
By "job hangs" I mean that the "Kill" button is still active and the job's exit status has not yet appeared on the console.
What am I doing wrong? Can the job be built so that it gives good throughput and finishes quickly?
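
On the hang after the rows are loaded: in plain Java, a program that starts non-daemon worker threads does not exit until those threads end, so a thread pool that is never shut down keeps the process alive even though all the work is done. Whether that is what happens inside the Talend-generated job is an assumption, not something this thread confirms. A minimal sketch of partitioning rows across 6 workers and then shutting the pool down cleanly, with hypothetical names (PartitionedLoadSketch, partition, loadInParallel) and a counter standing in for the write to Greenplum:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: fan rows out to worker threads, then release the pool.
class PartitionedLoadSketch {

    /** Splits rows round-robin into the given number of partitions. */
    static <T> List<List<T>> partition(List<T> rows, int partitions) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            parts.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            parts.get(i % partitions).add(rows.get(i));
        }
        return parts;
    }

    /** Processes each partition on its own thread and returns the rows handled. */
    static int loadInParallel(List<String> rows, int partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        AtomicInteger written = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<String> part : partition(rows, partitions)) {
                pending.add(pool.submit(() -> {
                    // Placeholder for "write this partition and commit" (assumption).
                    written.addAndGet(part.size());
                }));
            }
            for (Future<?> f : pending) {
                f.get();                 // wait for each worker; surfaces its failure
            }
            pool.shutdown();             // stop accepting work so the threads can end
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel load failed", e);
        } finally {
            pool.shutdownNow();          // no-op if the pool has already terminated
        }
        return written.get();
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add("row-" + i);
        }
        System.out.println(loadInParallel(rows, 6)); // prints 1000
    }
}
```

If the shutdown calls are left out, this program still prints its result but the JVM keeps running for a while because the idle worker threads are still alive, which resembles the "finished but Kill is still active" symptom described above.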