Performing Schema/Data Validation

Hi,
We are developing several integration jobs with TIS 4.2.4 and are trying to design an approach for validating the data in the jobs' input files. I used tSchemaComplianceCheck, and it works to some extent.
1. If a numeric field is defined as "int" in the schema and the file contains the string "abc" in that field, Talend skips such records with the message "For Input String : abc". It does not give the record number or any other clue to identify the skipped records. Is there a way to handle such schema-mismatched data?
2. If a date field is defined as "yyyy-MM-dd" in the schema and the date in the file arrives as "MM-dd-yyyy", the tSchemaComplianceCheck component catches it and sends the record to the rejects. Is there a way to use tSchemaComplianceCheck to catch data issues like the one in #1 as well?
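To make the two cases concrete, here is a minimal plain-Java sketch of the kind of per-field check being asked about. It is not Talend's generated code; the class `FieldCheck`, its methods, and the message layout are hypothetical names chosen for illustration. It only shows that the "For input string" text comes from Java's own number parsing and that a line number can be attached once the caller tracks it.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Hypothetical helper, not part of Talend: checks one field value at a time.
class FieldCheck {
    // java.time needs "uuuu" (not "yyyy") for the year when strict resolving is on.
    static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    // Returns null if the value is a valid int, else a message that includes the line number.
    static String checkInt(String value, int lineNo) {
        try {
            Integer.parseInt(value.trim());
            return null;
        } catch (NumberFormatException e) {
            // e.getMessage() is the familiar: For input string: "abc"
            return e.getMessage() + " - Line: " + lineNo;
        }
    }

    // Returns null if the value matches the date pattern, else a message with the line number.
    static String checkDate(String value, int lineNo) {
        try {
            LocalDate.parse(value.trim(), DATE);
            return null;
        } catch (DateTimeParseException e) {
            return "Unparseable date: \"" + value + "\" - Line: " + lineNo;
        }
    }

    public static void main(String[] args) {
        System.out.println(checkInt("abc", 1));         // For input string: "abc" - Line: 1
        System.out.println(checkDate("12-25-2011", 2)); // Unparseable date: "12-25-2011" - Line: 2
        System.out.println(checkDate("2011-12-25", 3)); // null (value is valid)
    }
}
```

Whether such a check can be wired into a job (for example through a custom routine) depends on the components in use, so treat this purely as an illustration of the expected behaviour.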
Thanks for any info on catching such issues.
Thanks in advance,
Balaji.
7 REPLIES
Re: Performing Schema/Data Validation

Hi Balaji
Which input component do you use? If you are using tFileInputDelimited, you will find the "Check each row structure against schema" option in the Advanced settings.
Here is the reject row.
.----+----+---------+---------------------------------.
|                      tLogRow_2                      |
|=---+----+---------+--------------------------------=|
|id  |name|errorCode|errorMessage                     |
|=---+----+---------+--------------------------------=|
|null|null|null     |For input string: "abc" - Line: 1|
'----+----+---------+---------------------------------'
Regards,
Pedro
Re: Performing Schema/Data Validation

Thanks, Pedro. We use almost every kind of file input component, since we are developing several integrations. Some of them are listed below.
tFileInputPositional
tFileInputDelimited
tFileInputMSPositional
tFileInputMSDelimited
Is this setting available in all the other file input components? The majority of our inputs use tFileInputPositional.
Thanks,
Balaji.
Re: Performing Schema/Data Validation

Hi Balaji
I'm sorry. Only tFileInputDelimited supports schema validation.
Regards,
Pedro
Re: Performing Schema/Data Validation

So there is no option to find the exact record number of skipped records for the other components? Can I post a feature request for this? Our organization uses TIS 4.2.4 as an enterprise-level ETL tool, and we get priority support from Talend when there is an issue. I can ask our company's Talend admin group to write a feature request.
Let me know your thoughts.
Thanks,
Balaji.
Re: Performing Schema/Data Validation

Hi Balaji
Yes. Please report any new feature request on BugTracker.
In this case, the request would be to add the "Check each row structure against schema" option to the positional components.
Regards,
Pedro
Re: Performing Schema/Data Validation

Hi Balaji,
You can try this to get the rejected records for fixed-length files (File Positional):
tFileInputFullRow (reads each full record)
        |
        V
tExtractPositionalFields (set the field positions) ----> Main row ----> output component
        |
        V
Reject row (this component has the "Check each row structure against schema" option)
        |
        V
output component
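The flow above (read the full record, cut it by position, send bad rows down a separate reject path) can be sketched in plain Java roughly as follows. This is an illustration of the logic only, not what Talend generates. The class `PositionalRejects`, the method `route`, and the two-field layout (a 5-character integer id followed by a 10-character name) are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a main/reject split for fixed-length records.
class PositionalRejects {
    // Assumed layout: id in columns 0-4 (int), name in columns 5-14 (text).
    static final int ID_END = 5;
    static final int NAME_END = 15;

    // Sends parsed rows to mainRows and bad rows, with their line number, to rejectRows.
    static void route(List<String> lines, List<String> mainRows, List<String> rejectRows) {
        int lineNo = 0;
        for (String line : lines) {
            lineNo++;
            if (line.length() < NAME_END) {
                rejectRows.add("Line " + lineNo + ": record shorter than " + NAME_END + " characters");
                continue;
            }
            String idText = line.substring(0, ID_END).trim();
            String name = line.substring(ID_END, NAME_END).trim();
            try {
                int id = Integer.parseInt(idText);
                mainRows.add(id + "|" + name);
            } catch (NumberFormatException e) {
                // Keep the parser's message and add the line number it lacks.
                rejectRows.add("Line " + lineNo + ": " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add(String.format("%-5s%-10s", "00001", "Alice"));
        lines.add(String.format("%-5s%-10s", "abc", "Bob"));
        List<String> mainRows = new ArrayList<>();
        List<String> rejectRows = new ArrayList<>();
        route(lines, mainRows, rejectRows);
        System.out.println(mainRows);   // [1|Alice]
        System.out.println(rejectRows); // [Line 2: For input string: "abc"]
    }
}
```

Reading the row as a single string first is what makes the line number available at the point where the type check fails.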
Re: Performing Schema/Data Validation

Hi,
I have to fetch a few CSV files with different schemas and store them in a database with a common schema.
How can I do it using a single job?