Please find attached job design.
Here we have a component tMSSQSP_1 calling a store procedure . This stored procedure returns 10 rows. Now we want to call another stored procedure on these 10 rows one by one.
Now if an error occurs in component tMSSQSP_1 on 5th row then the job gets break which is an issue here. The job should not get break it should continue and process all the 10 rows like the one we could handle in C#.Net using exceptional handling.
Is there anything which we are missing?
You can remove tMSSqlSp_2 component from current job and put it in separate subjob and put tRunJob (Uncheck the property "Die on child error") component in your existing job after tParseRecordSet.
You can define context variable for the data that is needed for tMSSqlSp_2. So for each row coming from tMSSqlSp_1 component, the subjob (calling second sp) will be executed once and if any error occurs in second sp then also execution of first sp keeps going. You can also customize it further. I.e. adding tBufferOutput component to return the success/failure response from child job to parent job so you can do some decision making in parent job.
Job1 : tFilxedFlowInput --> tMSSql_Sp1 --> tParseRecordSet1 --> tRunJob (call job2)
Job1 : tMSSql_Sp2
In case I am developing some complex job which involves interaction with sql server multiple times and in that case there is a probability that there comes some error (e.g. Time Out etc) which may happen at some times.
In above mentioned scenario it means I will have to multiple create sub jobs for such kind of operations which may sound not a good design.
Is there any other way we could handle such scenarios?
Connect a tFlowToIterate to the iParseRecordSet component. This will place all column values from the tParseRecordSet component to the globalMap for each row. It will then iterate over each row. This will allow you to handle processing and control errors between rows being iterated over.
After the tFlowToIterate, connect to a tFixedFlowInput. You will need to set up the schema for this component to be the same as the tParseRecordSet and then populate each column with the globalMap values just created.
Please find attached modified job design.
After making this change it seems that tMSSQLSP_2 is not getting executed correctedly. Reason we are saying this is because earlier rows were getting inserted in to table using stored procedure getting called from tMSSQLSP_2. But now no rows are getting inserted in to table from that stored procedure.
Also could you please explain how tFlowToIterate could handle errors?
The reason nothing was loaded is that nothing was sent. You need a tFixedFlowInput after the tFlowToIterate. An iterate connector does not pass a row of data. Instead, the row of data is added to the globalMap. You use the globalMap data to populate the schema you will need to create in the tFixedFlowInput.
To handle errors you can use an OnComponentError link from the Stored Procedure component.
I believe just putting a onComponentError link won't resolve the problem because component will fail in case of error and it will not process next rows. Adding onComponentError link will only assure you to redirect the flow to somewhere else. I tried it in sample job where it was starting with tLoop component and 3 execution were expected (properties were set likewise) but when the error occurred in first execution, it executed the part which was linked with onCopmponentEror but the second execution did not happen.
You can try for you scenario though.
If you get an error like this, unless the component erroring has a "Die on Error" tickbox which you can untick, the component will cause the job to die. You can carry out a tidy up using a tPostJob component though. These will always run at the end of a job unless the power is literally switched off. Using the iterate allows you more control, but you cannot stop the job from stopping I'm afraid.
Another solution is to use a tMSSqlOutput component using a tMSSqlConnection for a shared connection, to insert your data. This gives you more control over errors and allows you the option if not dieing on an error.
Here is the attached modified job design. Now it is inserting data after adding tFxiedFlowInput after tFlowToIterate.
However once stored procedure getting called from tMSSQLSP_2 throw error it still caused breaking of the job which is worrying to me as my job will have multiple sql operations. so in case if any of the sql operation gets failed then it will break my job.
Also using tMSSQLOutPut is not a solution as in some of my scenarios I will need to execute stored procedure only as there will be complex logics to implement through those stored procedures.
Please help me with this.
So you want all SQL operations in the job to either complete successfully or rollback? In which case you will need to use a tMSSqlConnection component at the beginning of your job, and use a tMSSqlCommit and a tMSSqlRollback.
My suggestion would be to have the rollback connected to your Stored Proc component if it errors....assuming you are not committing inside your Proc....which I suspect you will be. If that is the case you will need to carry out the functionality of your Stored Proc in Talend DB components. At the end of your job, you will need a tPostJob component. This will run at the very end of the job no matter what happens. In this section you can have some logic to decide whether you commit or not.
Its not like that as if we want sql operation to be fully completed or not?
Our scenario is there could be multiple sql operations required in a sql job and if any of the operation fails then the job should not break.
As in my job there are 10 records fetched from stored procedure now I want to perform some operation on these records using another stored procedure and it worked well for first 5 records and in case some error happens for 6th record then at present job got broken. It should not happen. It should start processing from 7th records onwards.
Also we should be able to log record got failed it would be helpful for our investigation.
Your Stored Proc component will cause the job to end on an error. You need to not use the tMSSqlSP component. If you are familiar with how to call your Stored Procedure using SQL, maybe you can use the tMSSqlRow component (https://help.talend.com/reader/9q55KsfASqX0qY4GVhEDNQ/CN_kA~X9QCVwHfTEqqacZA). This component allows you to error without ending the job.
Thanks for the help.
It seems that using tMSSQLRow will help me achieving what I desire. Will let you know my findings once I am done with this.
Watch the recorded webinar!
Accelerate your data lake projects with an agile approach
Create systems and workflow to manage clean data ingestion and data transformation.
Introduction to Talend Open Studio for Data Integration.