One Star

How to handle rejection when 'Batch Size' enabled

Hi experts,


I'm using Talend 5.5.1 for a data migration project. I need to configure a 'tOracleOutput' component with both Batch Size and the Rejects flow enabled. However, it seems this is not possible.

 

Could you please suggest a way to handle rejections when tOracleOutput has Batch Size enabled?

Thank you

 

Charith

2 REPLIES
Seven Stars

Re: How to handle rejection when 'Batch Size' enabled

The most common reasons for data rejection are:

 

  * Duplication of keys

  * Data Truncation

 

Duplicate keys can easily be identified with custom code.

For data truncation, you can use the tSchemaComplianceCheck component.
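
As a rough illustration of the custom-code idea, here is a minimal Java sketch that flags duplicate keys and over-long values before the rows reach the output component. The class name `RejectPrecheck`, the two-column row layout (id, name) and the `maxNameLength` parameter are all hypothetical and not part of Talend. It only catches duplicates within the incoming data, not keys that already exist in the target table, and it compares character counts, which may not match a column defined with byte length semantics.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RejectPrecheck {

    /** Hypothetical row shape: a key column plus one text column. */
    public static final class Row {
        public final String id;
        public final String name;
        public final String rejectReason; // null when the row passed both checks

        Row(String id, String name, String rejectReason) {
            this.id = id;
            this.name = name;
            this.rejectReason = rejectReason;
        }
    }

    /**
     * Checks each {id, name} pair and records why a row would likely be rejected.
     * The length check runs first so that an over-long row does not reserve its key.
     */
    public static List<Row> check(List<String[]> input, int maxNameLength) {
        Set<String> seenIds = new HashSet<>();
        List<Row> out = new ArrayList<>();
        for (String[] r : input) {
            String id = r[0];
            String name = r[1];
            String reason = null;
            if (name != null && name.length() > maxNameLength) {
                reason = "value too long for name (" + name.length() + " > " + maxNameLength + ")";
            } else if (!seenIds.add(id)) {
                // Set.add returns false when the key was already present
                reason = "duplicate key: " + id;
            }
            out.add(new Row(id, name, reason));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> input = new ArrayList<>();
        input.add(new String[] {"1", "Alice"});
        input.add(new String[] {"1", "Bob"});
        input.add(new String[] {"2", "ANameThatIsTooLong"});
        for (Row row : check(input, 10)) {
            String status = row.rejectReason == null ? "OK" : "REJECT (" + row.rejectReason + ")";
            System.out.println(row.id + " -> " + status);
        }
    }
}
```

Rows with a non-null `rejectReason` could then be routed to a reject file, while only the clean rows go on to the batch-enabled output.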

 

Hope this helps.

 

Note: The Rejects option is enabled only when both "Die on error" and the batch option are disabled.

 

Six Stars

Re: How to handle rejection when 'Batch Size' enabled

Hi Charith,

 

As @ashifa says, you can handle the most common causes of rejected inserts within your Talend job, and if you want batch mode enabled, you'll have to. However, it's always possible that an insert will fail for a reason you're not handling, and you won't get anything meaningful back unless your output component has the "Rejects" output enabled.

 

Whilst you may already be aware of this, I thought I'd re-post my earlier reply to another thread, which explains why the "Rejects" output isn't always available.

 

If you're using t*Row components, there will already be a "Rejects" output. If you're using t*Output components, however, you'll need to un-tick "Use Batch Size" in Advanced Settings in order to enable the "Rejects" output.

 

In batch mode, Talend sends batches of, say, 1,000 inserts to the server at a time. This means that any errors returned usually aren't specific enough to be useful or to identify the row in which the error occurred, so the "Rejects" output is disabled.

 

With batch mode turned off, each insert is sent individually, and you'll get meaningful errors back from the server, so "Rejects" is enabled.

 

Batch mode is of course quicker, but not an option if you need to know when individual inserts are failing.
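
To make the trade-off concrete, here is a deliberately simplified Java model of the two modes. It does not use Talend or a real database: `FakeTable` is a hypothetical stand-in for a table with a unique key, and its all-or-nothing batch behaviour is an assumption for illustration only, since real drivers differ in what they report for a failed batch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BatchVsRowDemo {

    /** Hypothetical stand-in for a table with a unique key; not a real database. */
    static final class FakeTable {
        private final Set<Integer> keys = new HashSet<>();

        /** Inserts one row; a failure names the offending key. */
        void insert(int key) {
            if (!keys.add(key)) {
                throw new IllegalStateException("unique constraint violated for key " + key);
            }
        }

        /** Inserts the whole batch or nothing; a failure does not say which row caused it. */
        void insertBatch(List<Integer> batch) {
            Set<Integer> staged = new HashSet<>(keys);
            for (int key : batch) {
                if (!staged.add(key)) {
                    throw new IllegalStateException("batch of " + batch.size() + " rows failed");
                }
            }
            keys.addAll(batch);
        }
    }

    /** Per-row mode: every failure can be tied to one input row, then the load continues. */
    public static List<String> loadRowByRow(List<Integer> rows) {
        FakeTable table = new FakeTable();
        List<String> rejects = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            try {
                table.insert(rows.get(i));
            } catch (IllegalStateException e) {
                rejects.add("row " + i + ": " + e.getMessage());
            }
        }
        return rejects;
    }

    /** Batch mode: a failure can only be reported for the batch as a whole. */
    public static List<String> loadInBatches(List<Integer> rows, int batchSize) {
        FakeTable table = new FakeTable();
        List<String> rejects = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            try {
                table.insertBatch(rows.subList(start, end));
            } catch (IllegalStateException e) {
                rejects.add("rows " + start + ".." + (end - 1) + ": " + e.getMessage());
            }
        }
        return rejects;
    }

    public static void main(String[] args) {
        List<Integer> rows = java.util.Arrays.asList(1, 2, 3, 2, 5, 6);
        System.out.println(loadRowByRow(rows));
        System.out.println(loadInBatches(rows, 3));
    }
}
```

In this model the per-row load pinpoints the single duplicate, while the batch load can only say that one of the three rows in the second batch was bad, which is the reason a per-row "Rejects" flow has nothing useful to carry in batch mode.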

 

t*Row components always operate on a per-row basis, and so will have the "Rejects" output available.

 

There is a "Die on error" option for all of these components, but you'll normally want to deal with errors properly, and at a minimum, log them and continue, rather than aborting the job.

 

Regards,

 

 

Chris