copy data from oracle to paraccel databases

One Star

copy data from oracle to paraccel databases

hi all,
I am trying to see if talend can help me move data from oracle to paraccel (postgres).
Say I have 1000 tables.
So far as I can see it, I can create a "process" to move each table. This would be something like:
tOracleInput, tFileOutputDelimited and tParAccelBulkExec
Works great! HOWEVER...
to set this up for 1000 tables, even if need to modify table name - looks ugly.
Question to you Talend gurus: is there a way to have a list of tables and then for each table perform the above process(extract, save file, bulk load)?
Thanks for your help/
let me know if to answer this question, you need more info (system, etc)
thankS!
One Star

Re: copy data from oracle to paraccel databases

Hi,
because TOS is mainly metadata driven this would not be possible. The only solution I see would be to generate the SELECT statement on the fly to get one large string which you could save in the next state. But I think this would be to complex (if it is possible at all).
I don't know both databases in detail. Could you make a unload of oracle and modify in any way it to load it back in postgres?
Bye
Volker
One Star

Re: copy data from oracle to paraccel databases

Yes, I could do it. For initial bulk load, it as simple as have a loop on the list of table names, for each of them extract DDL, transform it to Postgres format, extract data, execute DDL on the target, copy data to target. All is a one loop with 4 functions more or less. Instead of doing 1000 different processes in Talend. If it is really not possible, that I would not consider Talend for this purpose. However, later on when the ETL runs every day and we would need to incrementally load/delete data, this becomes more complex with a script, Talend could be of help (if that loop in the beginning could be possible). Is there a relatively easy way to script you own object that would perform an operation on a set of tables?

Re: copy data from oracle to paraccel databases

It is. You can make a system call to execute your script, or embed the code into Talend. (using the Code section of your metadata or a tJavaRow or tJavaFlex (or tPerlRow, tPerlFlex if you're using a real man's scripting language Smiley Wink )
One Star

Re: copy data from oracle to paraccel databases

So why use Talend at all? Where is the advantage between Talend and home grown solution? In the end you will end up writing your own code. Why not use "real man's scripting language" to do the whole load manually? It's a 1-page script (at least for initial load)...
One Star

Re: copy data from oracle to paraccel databases

What should I say? Not for all problems TOS would be the best solution. There are many other use cases where TOS would be a better way to do it.

Re: copy data from oracle to paraccel databases

The major advantage of Talend over a homegrown ETL solution is the repository, and that the majority of ETL tasks can be completed in Talend without the need for custom code. When you encounter a problem that "pure" Talend cannot solve efficiently, you have the ability to implement custom solutions.
Talend gives you a tool and a framework to organize, standardize, and monitor your ETL project. For the specific job you've described, using a "pure" Talend solution will probably not be efficent, but as we all know ETL often grows to more than just one job, and this is where using Talend shows its value.
Employee

Re: copy data from oracle to paraccel databases

Thank you Volker Brehm and JohnGarrettMartin for making clear why an ETL such as Talend is useful in every day data integration processes. I want to make clear that Volker and John are not Talend employees, so there is no "marketing" in their posts ;-)
Personnaly, even if I know Talend Open Studio quite well (let's even say "very well") I don't replace each of my manually written script with a TOS job. If there is no file processing, database connection, sorting, deduplicating and other actions TOS does very well, I also like my Perl scripts :-) But obviously, when I need to process data, a TOS job is the best solution for me.
- 1st example, today I've updated my script to backup Subversion repositories with an incremental method (see first revision of this script), this is script would be painful to write as a TOS job, because it processes no data.
- 2nd example, the week before, I've designed a job to retrieve Apache logs from our web server, retrieve an updated list of search engine robots IP addresses, read new Apache logs, filter on viewtopic.php page, extract IP address, add the country location, fill our forum_views {year, month, day, IP, country} stats table (I will soon distribute this job), a TOS job was just great for doing this and has saved me a lot of time.
The problem I see with your need is that each of your 1,000 tables has a different schema, I mean a different list of columns. In Talend Open Studio, the schema is set at design time (because of the repository and anyway it was a design choice for us to make component configuration much simpler). So what? We have to create a method to generate jobs (not generate scripts, we already do that) from a template. This is in our todo list, and is has been discussed several times in the Talend R&D team. I just want to make you know that we are aware of this issue and we have solutions in mind.
Employee

Re: copy data from oracle to paraccel databases

kirylm,
I think that we provide into the Studio some components to industrialize your need.
For example the components tOracleTableList; it gives you opportunities to Iterate on each tables you want copy. For each one you have several ways to fast export data to delimited file; then BulkLoad data into your ParAccel (as your described scenario but more GENERIC).
Or another way; could be for each Tables to call Oracle API to DUMP the data (tSystem or tssh); then bulkLoad DUMP into ParAccel (probably more complex).
I like the way to use Talend to design my Data Integration need, to understand and explain to people the data migration processus; and to maintain easily this graphical approach anyway.
Best regards;
One Star

Re: copy data from oracle to paraccel databases

I cannot find tOracleTableList in Talend...
One Star

Re: copy data from oracle to paraccel databases

It is available since TOS 3.0.x with Java code generation.
You could also do a select on the system tables and iterate over the result:
tOracleInput --(row)--> tFlowToIterate --(iterate)--> ...
But from my point of view this will not help you out at all. There are two solutions (less or more without Talend Open Studio):
- export and load the data for each table (as Christophe described)
- use a select statement which will return an insert statement (string concatenation) and execute it against the destination database.
- using a remote connection and work with "INSERT INTO SELECT FROM"
Bye
Volker
Employee

Re: copy data from oracle to paraccel databases

Volker,
What about made Two JOBS with ELT+SQLpattern and BulkExec usage.
- First Job : Iterate extraction on every Oracle tables.
To do that : tOracleTableList ---iterate---> tELT component (table = GlobalVariable CURRENT table return by OracleTableList); then write a SQLPattern to extract and load data like a fastExport provided by SQLLoader. It generates a CSV/Delimited file for each tables.

- Second Job : Iterate on Delimited Bulk file and call a tParAccelBulkExec component (PARRALEL mode to increase performance activated)
To do that : tFileList ---iterate---> tParAccelBulkExec (globalVariable from tFileList for Bulk filepath property).

Another way is :
- First Job : tOracleTableList ---iterate---> tSystem or tSSH or tStoredProcedure
- Second Job : tFileList ---iterate---> tSystem or tSSH or tStoredProcedure
I agree that it's less integrated with Talend.
But probably to involve Talend usage; you can add tWarn, tFlowMeter for your Monitoring, or define FailOver or RollBack condition for your business rule or workflow constraint.
Best regards.
One Star

Re: copy data from oracle to paraccel databases

Hi Christophe,
yes, thats why I love Talend, you could do much more than just simple use data transformation. And yes your are able to add some "basic infrastructure" components. My point was that this is not "real" Talend like with dynamic metadata for examples (as ask in the initial post).
Last week I thought about writing a podcatcher job to feed my mp3 player. This would be a good example to show how flexible you are with Talend ;-) (If I get some time and it will work I should post about it ...)
Bye
Volker
One Star

Re: copy data from oracle to paraccel databases

.....
then write a SQLPattern to extract and load data like a fastExport provided by SQLLoader. It generates a CSV/Delimited file for each tables.
.......

Could you please post an example of the SQL template/pattern?
thank you very much
One Star

Re: copy data from oracle to paraccel databases

hi everybody
could you post the solution, i have the same problem and this could help me.
sorry my english i am learning now.