abort job when duplicates are found

Four Stars

abort job when duplicates are found

Hi, I have a situation where I have to abort the job when duplicates are found and not load them to the output file.

I have the below logic:

 

In case duplicates are found, the job dies, but when there are no duplicates, the job doesn't load the output file. I pass a dummy variable in tFixedFlowInput (which I think I'm doing wrong), as it doesn't update the file with the 4 fields from the source. How do I do it? Thanks!

[Screenshot: pic.png]

Seven Stars

Re: abort job when duplicates are found

It is better to make use of the "after" variables of tUniqRow, which tell you the number of unique rows and the number of duplicates found in the input. You can keep the unique rows in a temporary store and update the output only if the number of duplicates is zero. And yes, tFixedFlowInput would not work that way.
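
For example (just a sketch, assuming your component is named tUniqRow_1 and you branch with Run If triggers after the subjob finishes), the trigger towards tDie could be

((Integer)globalMap.get("tUniqRow_1_NB_DUPLICATES")) > 0

and the trigger towards the output load could be the same expression with == 0 instead.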

Seven Stars

Re: abort job when duplicates are found

[Screenshot: DieIfAbort.JPG]

 

Something like this would work, I guess.

Four Stars

Re: abort job when duplicates are found

@CK395, can you give an export of this job?

When I add the If trigger, it doesn't give me the option of tDie. Is there anything special you did? I also don't get the options for tHashInput & tHashOutput.

Employee

Re: abort job when duplicates are found

Hi,

 

Here is an alternate option to resolve your problem. Add a context variable "row_count" of integer type.

[Image: Job details with tAggregateRow component details]

 

[Image: loading the count into a context variable]
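
As a rough sketch of that step (assuming the count is loaded in a tJavaRow and the tAggregateRow output column is named "count"; both names are just illustrative), the code would be something like:

// copy the aggregated row count into the context variable (hypothetical column name "count")
context.row_count = input_row.count;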

 

 

The tJava component connected with On Subjob Ok is a dummy. The If trigger connected to tDie is:

 

context.row_count != 0

 

and the If condition connected to the next flow will have the condition

 

context.row_count == 0

 

Warm Regards,

 

Nikhil Thampi

Four Stars

Re: abort job when duplicates are found

@nikhilthampi - Thanks for the logic. What did you put in the tFixedFlowInput?

Employee

Re: abort job when duplicates are found

Hi,

 

I didn't notice that the earlier skeleton diagram was missing the data capture of the unique records.

 

You can store the unique records in a format of your choice (either as a file or a hash) and read them later for downstream processing.

[Screenshot: screeshot.jpg]

 

 

If this has helped to answer your query, please mark the topic as resolved. Kudos are also welcome :-)

 

Warm Regards,

 

Nikhil Thampi

Seven Stars

Re: abort job when duplicates are found

My bad, I didn't see the "choose files" button earlier.

Anyway, the job is attached. Make sure to change the file paths and configure the components according to your schema.

 

 

Let me know whether it fulfils your requirement or not.

 

 

Regards,

Chandra Kant
