[resolved] Iterate a database query then combine results into single flow

One Star

[resolved] Iterate a database query then combine results into single flow

Hi,
I have a situation as follows:
(logic)--main(row1)--tFlowtoIterate--Iterate--tOracleInput--main(row2)--(more logic)
Basically, in the beginning of the flow, I perform logic to get a list of ID pairs. Then, I use tFlowtoIterate in order to query my Oracle database to fill in information about that ID pair one pair at a time. The number of ID pairs that I am querying are a tiny subset of the total amount of data, and trying to pull all of the data for use in joining on the ID pair within a tMap causes Out of Memory Java Heap Space, so I need to only pull the data for these ID pairs.
I am able to pull the data I need in tOracleInput, but because it is being done through Iterate, the main output flow connection will only be either 0 or 1 rows because each ID pair corresponds to a single entry in the database. The problem is that I need this flow connection to have all the data in it for processing in the next step, so as it stands, all of the data is pulled, but in later processing, only the single row that was pulled last is actually in the row connection.
Currently, I have it set up that the tOracleInput will send its rows to a tfileOutputDelimted which is set to append to its file. Then, once the tOracleInput subjob is done, I use a tFileInputDelimited to read in that file into the later processing. The only problem with this is that I have to manually make sure that file is deleted between each run, otherwise it would append to data from previous runs and it seems like a decent amount of extra effort and resources to do it this way.
Is there some way that I can easily combine the rows generated on each iteration of tOracleInput back into a single flow containing all of rows with the original ID pairs and their new data without my usage of a file? Or is there a way to avoid iteration all together and query my database one time for all the ID pairs while maintaining the relationship of each ID in the ID pair?
Thanks

Accepted Solutions
One Star

Re: [resolved] Iterate a database query then combine results into single flow

Correct, I am performing a SQL query something along the lines of:
"SELECT id1, id2,  FROM table_name WHERE id1 = " + (String)globalMap.get("row1.id1") + " AND id2 = " + (String)globalMap.get("row1.id2")

It is not possible to have in this scenario one continuing flow. The only way is to collect all datasets into a file or better into a new database table.

So then, it's not possible to recombine the separate sets of rows generated at each iteration back into one flow without the use of an external storage medium such as a file or database table?
Consider in the tMap using reload each row rather than load once. That way you may be able to use tMap with a lookup from the tOracleInput.

Unfortunately, attempting to use my tOracleInput as a lookup is what caused the out of memory java heap exception as it was unable to load in all of the Oracle data so it never even got to the ID pairs.
Consider using tHashMap instead of tFileOutputDelimited. tHashMap will work the same as the external file, but exists in memory. This should be somewhat faster if total data volumes permit.

Thanks, I'll definitely look in to that.

All Replies
Seventeen Stars

Re: [resolved] Iterate a database query then combine results into single flow

It is not possible to have in this scenario one continuing flow. The only way is to collect all datasets into a file or better into a new database table. You use the iteration probably to change the the query.
One Star

Re: [resolved] Iterate a database query then combine results into single flow

You said that the number of ID pairs is relatively small.
Consider in the tMap using reload each row rather than load once. That way you may be able to use tMap with a lookup from the tOracleInput.
Consider using tHashMap instead of tFileOutputDelimited. tHashMap will work the same as the external file, but exists in memory. This should be somewhat faster if total data volumes permit.
One Star

Re: [resolved] Iterate a database query then combine results into single flow

Correct, I am performing a SQL query something along the lines of:
"SELECT id1, id2,  FROM table_name WHERE id1 = " + (String)globalMap.get("row1.id1") + " AND id2 = " + (String)globalMap.get("row1.id2")

It is not possible to have in this scenario one continuing flow. The only way is to collect all datasets into a file or better into a new database table.

So then, it's not possible to recombine the separate sets of rows generated at each iteration back into one flow without the use of an external storage medium such as a file or database table?
Consider in the tMap using reload each row rather than load once. That way you may be able to use tMap with a lookup from the tOracleInput.

Unfortunately, attempting to use my tOracleInput as a lookup is what caused the out of memory java heap exception as it was unable to load in all of the Oracle data so it never even got to the ID pairs.
Consider using tHashMap instead of tFileOutputDelimited. tHashMap will work the same as the external file, but exists in memory. This should be somewhat faster if total data volumes permit.

Thanks, I'll definitely look in to that.