Hi Members, I am new to Talend and looking for help from you guys to resolve a problem. I have a requirement like... 1. Fetch emp_id from emp table where emp_First_name = 'Alex' ...assume this returns me 20 records i.e. 20 different emp_id. 2. Now I need to pass this dynamically to other oracle stage where emp_id in (emp_id fetched from previous query). select dept_id from dept where emp_id in (all 20 emp id fetched in query 1) Is this possible in talend ? Your help is deeply appreciated.
Easy: tOracleInput "select emp_id from emp where emp_First_name='Alex'" -row1-> tAggregateRow using list function (or tDenormalize) to concatenate the emp_ids into a comma-delimited String -row2-> tFlowToIterate -iterate-> tOracleInput "select dept_id from dept where emp_id in ("+globalMap.get("row2.emp_id")+")" -row3-> ...
Thanks Alevy, I have more than 300K records coming from source and concatenating it in the string may be performance bottleneck. Is there any option available where i can pass select * from table A where A.colname = "input.colname" where input.colname contains all the records and automatically checks for the value. We have similar stage in DataStage and with Sparse lookup it can be done...Is there any similar thing available in Talend also ?
Thanks Shong. Yes sure..I can do the inner join but the problem will be i will have to take all the record close to 2 million record to the buffer and then lookup with 300K of input records which will be time and space consuming . Is the anyway to filter the record in main table table based on input records as a filter like we do in where clause of Sql ?
Hi Here are three ways could achieve this request: 1, Do an inner join on tMap, to me, 2 million is not a large of data set, this could be the best way with good performance. 2. As Alevy suggested, select data from lookup table and concatenate each data to a string with a comma-delimited, and then, use the in where cause in the query of the main table. 3. Select data from lookup table and iterate each one. tOracleInput_1--main(row1)--tFlowToIterate--tOracleInput_2--main--tLogRow tOracleInput_1: "select record from lookup table" tOraceInput_2: "select record from main table where mainColName="+row1.lookupColName You can test all ways above and see which one is best for you. Shong
---------------------------------------------------------- Talend | Data Agility for Modern Business
If your emp and dept tables are in the same DB you can use ELT components to pass the inner-join query to the DB to execute instead of pulling the data into Talend. If the tables are at least on the same DB server even if they are in different DBs, you could do a combined query in your input component i.e. select dept_id from db1.schema1.dept where emp_id in (select emp_id from db2.schema2.emp where emp_First_name='Alex') or select dept_id from db1.schema1.dept inner join db2.schema2.emp on dept.emp_id = emp.emp_id and emp_First_name='Alex' Having said that, my test of reading 3 million values from a file, tDenormalizing them into a single string and writing them to a new file took only 1560 milliseconds, although reading from a DB will obviously be slower. Failing all those and provided you don't want too many fields in the lookup, I agee with shong that tMap would probably be the best.