Joining Large Tables

One Star

Joining Large Tables

I have 20 tables that I need to join in a map. The output needs to go into a XML file.
I have defined the map as:
Oracle input tables --> tmap (joins the child tables and primary tables based on keys) --> tAdvanceFileOutputXML (has the xsd for generating the output XML file). The process runs fine for small data sets, but when the data volume is large (15-25 million records) then tools fails giving out of memory error.
TalenD is getting rows from each table individually and then joining the rows in the tmap transform. So, for large data sets it is failing.
Is there any way to have TalenD send a consolidated SQL query to the database so that the heavy joining is done on the database and not in TalenD.

An alternate approach I tried was to pass the combined SQL query (for all the tables) by manually entering it in the Query section of the tDBInput component. However, when I click on the Guess Schema button, I get the following error: "Connect to DB error, or some errors in SQL query string, of Guess Schema not compatible with current SQL query string"
The query works in TOAD and is a simple one, pulling data from just two table with a simple join condition.
Is there any solution that you can point me to?
One Star

Re: Joining Large Tables

By default Talend tries to load all the data needed to do joins/lookups in a tMap into memory.
There are some store on disk settings - where you specify a temp file name to use - and how many rows to store in memory/on disk.
However, if you're working w/lookup tables that have a large number of rows in them - and there are 20 tables. hen you are going to have performance issues.
Personally, if I'm dealing with a large number of rows, I like to do as many of my joins as possible in the SQL statement of one or more tDBInput components.
One Star

Re: Joining Large Tables

In fact I tried that approach : passing the combined SQL query (for all the tables) by manually entering it in the Query section of the tDBInput component. However, when I click on the Guess Schema button, I get the following error: "Connect to DB error, or some errors in SQL query string, of Guess Schema not compatible with current SQL query string"
The query works in TOAD and is a simple one, pulling data from just two table with a simple join condition.
Are there any any other settings need to be done?
Thanks.
One Star

Re: Joining Large Tables

Hi yogi9009,
You can create the tDBInput schema yourself. for each attribute you wanna retrieve, create a correspondant column in your schema. good luck
Youssef