Four Stars

How to extract data with tOracleInput based in another tOracleInput

Hi, i have a problem with an extraction and load of data in an oracle db, i have millions of records that need to be migrated, but the strategy that i need to follow is about of data set's based on a certain number of persons, in other words i need to extract and load only the transactions of my set of persons.
Actually i'm doing the extraction and load with and tMap, extracting the set of persons with an tOracleInput that's acting like an lookup, and matching with the other tOracleInput that extracts all the transactional data, it works, but what is the problem? the problem is to extract the data of that set of persons i need to fetch all the transactions in the database (and its a lot of data) so i need to some component or idea to fetch only the info for the people that i need without fetch all the transactions in the database.

In the image below you will see how i actually do the extraction, i only need to match ID of the person with the ID of the transaction.
Captura de pantalla (162)_LI.jpg

 

This is the lookup that i use to match transaction with the people.

Captura de pantalla (163).png


Thanks for your help.

  • Big Data
  • Data Integration
Tags (1)
7 REPLIES
Seventeen Stars

Re: How to extract data with tOracleInput based in another tOracleInput

One way could be to turn it. Use the user input as main input and select for the users the transactions.

Or simply do a join directly in the database to get the desired transactions for the users.

Ten Stars

Re: How to extract data with tOracleInput based in another tOracleInput

These are different databases? If so, any possibility of creating a temporary table to join to the transaction table?

How long is the list of individuals you're targeting? If it's short, could you build a list of identifiers that you could include in your transaction query?
Seventeen Stars

Re: How to extract data with tOracleInput based in another tOracleInput

Exactly, these are good working possibilities!

Four Stars

Re: How to extract data with tOracleInput based in another tOracleInput

Exactly, the persons and the transactions are in different tables (and enviroments), and the set of persons is about 1 million, but the transactional data is about 90 million of records, this is why i have this problem, because the job need to fetch the 90 million records for the set of persons. I would like to have some component to only extract the data that i need (in other case i try to put the id of the persons in the query and let the database do the job, but now i have a sample of 1 million of persons, so this strategy it's not availble any more)

Seventeen Stars

Re: How to extract data with tOracleInput based in another tOracleInput

What kind of component could that problem solve? You have to limit the amount of transactions in the query! This cannot be solved by a component.

You can separate the list of users in smaller groups and use the small list in the where clause of the transaction query. This way you have to rerun the job until you have finished all related users. This would also reduce the amount of data to process - actually a good design.

Four Stars

Re: How to extract data with tOracleInput based in another tOracleInput

I understand that, i need to improve the query, the problem is that is a large amount of persons (1 million) and cannot put then in "groups" in the where clause of the query, so there is some way to dynamicaly put some of the persons rigth in the where clause of the query, based of some number of clients from a different DB? to improve the job and don't fetch all the transactions.

Thanks for you help

Ten Stars

Re: How to extract data with tOracleInput based in another tOracleInput

If there's no natural grouping of the individuals you can leverage, you'll have to construct one. This can be achieved with a Row Number column and a bit of math, in the database if possible to reduce memory overhead and limit how much you send over the wire at one time.

If you intend to create lists that you drop into a WHERE IN clause, Oracle has a limit of 1000 items in an IN list.