My query is something like this:- I need to perform this join only if any of the a,b,c,d values are NULL in table t1. And if they are null , i do a lookup for them in table t2. Select a,b,c,d from table t1,table t2 where t1.a=t2.a OR t1.b=t2.b AND t1.c=t2.c OR t1.d=t2.d. I perform this query using a tMap . How can i do this in my expression builder. Is it possible to have OR operator across joins being made for various columns?
Thanks Shong .... The table t2 , which is being used to do a lookup just in case any value in table t1 is null. Now this table t2 has some 8 millions rows. Talend first reads the entire table t2 ??? Which takes a lot of time. I just want a lookup to happen , if after reading table t1 , it finds that any of the value is null(This condition is specified in the expression builder or row1). Then , A join is being made between these two tables as suggested by you (in the expression builder of row2 (table t2)) And if a join is being made, the output table (say op1) has all the fields of row1 mapped to it , except fields a,b,c,d (any of which might be null , which invokes this join) If none of the fields of table t1 is null, i don want any join to happen .... here is how my job looks like:
Have attached my job and tmap .. however these images may be wrong, So please go through the pains of reading some explanation of what i wish to do . Thanks a ton for your help!
1. Read from t1 continously and aggregate data further. However : If any field values from t1 is null , do a lookup from table t2 . Where should i check if any of the values of columns coming from t1 are null .... As per the tmap shown , i did it in expression builder of row1. Now , if any of values of t1 are null , i use my tmap to do lookup for those fields based on : (row1.trackAolId==row3.track_aol_id) or(row1. trackTitle==row3.track_title)) And ((row1.albumAolId==row3.album_aol_id)||(row1.albumTitle==row3.album_title)) and so on... Where do i put this condition , i put it in row3's expression builder.
Does my job diagram fulfill my purpose. The problem is , this job starts with reading all the rows of table t2 ...... and then reads t1 .. performs a join and so on.... I want this to be like : firsgt read t1 , everytime you encounter a null column value , do a lookup from table t2 and carry with the aggregation........ Is some kind of iteration also involved here ? Can you gimme some ideas as to how my job can be designed to achieve all of this.
Hi aviator, my suggestion would be: Because TOS does not support a back propagation of any values from the tMap to the lookup row I would use two steps. first part: tDBInput(t1) -> tFilterRow(check for missing fields) -> allFieldsOk -> tFileOutput(tempFile1)
-> fieldsNotOk -> tFlowToIterate -> tDBInput(dynamic SQL to read one row with missing data from table2) -> tMap (1::1-mapping output metadata like your hhh-connection, value set by input row for missing values or read from the globalMap for the primary row) -> tFileOutput(tempFile2) second part: tFileInput(tempFile1) |--> tJoin -> your flow based on all completed values tFileInput(tempFile2) With this you avoid the prefetch of the data of table2
Ok Thanks ! Is this what u mean , have attached a image , please check it.... This does not help ... the job starts reading my mysqlinput_3 (a table of 8 million rows ) first ....... But i want this to be read only if some null values are encountered from input_1 and a rejected values are sent through the filter....
IDEAS ??? THOUGHTS???
Other option: (Just in case the above suggested does not work ) or can i write a routine , that takes as input the rows returned by t1 and checks for null fields, if any field is null-it just performs a lookup in the table t2 .........creates a temporary table and updates all the null fields after lookup from t2.... and then carries on the aggregation
Hello aviator, remove tMysqlInput_3. You don't need it in my example. Select your data from your "8-million-table" in tMysqlInput_2 with a dedicated sql. In the tMap you need only one input and one output. Bye Volker
Hi Volker, Is this what u are saying(attached an image below.) The problem is , my query for mysqlinput_2 would be something like this: "select track_aol_id,track_title,track_number,album_aol_id, album_title,artist_aol_id,artist_name from audio_main_song where track_title=row8.trackTitle OR track_aol_id=row8.trackAolId AND artist_name=row8.artistName OR artist_aol_id=row8.artistAolId And album_title=row8.albumTitle OR album_aol_id=row8.albumAolId limit 1" Can i refer row8 fields here ??? (The rejected values) because i need to have them in my where clause while doing a lookup in my "8 million table" I get a error when i do this-----> Exception in component tMysqlInput_2 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'row8.trackTitle' in 'where clause' Also in my tmap : apart from the fields that i am getting from mysqlinput_2, i need row8's rejected fields also (not all of them are null) Can i do that ?? Any inputs please ?