Listing values missing from lookup tables while loading?

One Star

Listing values missing from lookup tables while loading?

Hello.

I'm loading rows from a csv file, transforming them a bit and
loading into a DB, so on a high level my process looks like this:

tFileInputDelimited_1 --(row1)-> tMap_1 --(out1)-> tPostgresqlOutput_1


Once the data is in the DB, some of the columns are joined against
lookup tables to get more readable values in reports and what not.
Well, every now and then now values which aren't in the lookup
tables appear and I'm trying to work out a way to list the new values
from Talend while loading the data. I added a new output out2 to
tMap_1 to which I INNER JOINED with catch lookup inner join rejects
the values that are missing from the lookup table:


tPostgresqlInput_1
|
(row2)
|
v
tFileInputDelimited_1 --(row1)-> tMap_1 --(out1)-> tPostgresqlOutput_1

|
(rejects out2)
|
v
tLogRow

This produced all the missing values to tLogrow all right
(with duplicates, though but guess I can handle that with
Insert or update action in tPostgresqlOutput_1 later) but
it also removed the rows in question from out1.

I would like the missing values to be outputed to out2
but also to output the rows with missing values to out1.
How would you handle described situation?
One Star

Re: Listing values missing from lookup tables while loading?

                         tPostgresqlInput_1
|
(row2)
|
v
tFileInputDelimited_1 --(row1)-> tMap_1 --(out1)-> tPostgresqlOutput_1
|
(rejects out2)
|
v
tLogRow

OK, that latter "image" didn't come out right. Sorry about that.
Four Stars

Re: Listing values missing from lookup tables while loading?

Hi James,

I think two step process is required...
1 - Insert data from flat file to database
OnSubjobOk
2 - Read input file once again, lookup with tMap using Inner join and get the reject output

Will this help?

Thanks
Vaibhav
One Star

Re: Listing values missing from lookup tables while loading?

sanvaibhav wrote:
Read input file once again, lookup with tMap using Inner join and get the reject output

Hi Vaibnav,

and thanks for your reply.

I guess that would work but the destination table, due to
reasons not in my power, is a dump with lots of rows and
no indexes so I was looking for some other kind of solution.

I guess I could try to use tReplicate and process them in
two different branches in lack of more elegant solutions.
I know, it's only ETL.

2019 GARNER MAGIC QUADRANT FOR DATA INTEGRATION TOOL

Talend named a Leader.

Get your copy

OPEN STUDIO FOR DATA INTEGRATION

Kickstart your first data integration and ETL projects.

Download now

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

APIs for Dummies

View this on-demand webinar about APIs....

Watch Now

6 Ways to Start Utilizing Machine Learning with Amazon We Services and Talend

Look at6 ways to start utilizing Machine Learning with Amazon We Services and Talend

Blog

Why Companies Move to the Cloud: 7 Success Stories

Learn how and why companies are moving to the Cloud

Read Now