Validating a data file against a schema file in Hadoop

One Star

Hi,
I have a requirement to validate a data file in Hadoop against a given schema file. Please note that the schema is dynamic: it is not fixed in advance, and each data file always arrives together with its own schema file.
For example, the schema file would look something like this:
MKey|int|30|||
PKey|varchar|25|||
CNumber|varchar|25|||
CStatus|varchar|1|||
DOS|date|10||MM/DD/YYYY|
The schema file above defines 5 fields.
The data file looks something like this:
123111111|456||D|12/04/2014
Can someone help me design this in Talend? Any help is greatly appreciated.
Thanks
Ramakrishna
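
To make the schema format in the post above concrete, here is a minimal sketch of reading such a schema file at runtime. It is plain Python outside Talend, not a Talend component, and it rests on assumptions: the column order (name, type, maximum length, an unused column, format, an unused column) is inferred from the five example lines only, and the names FieldSpec and parse_schema are hypothetical.

```python
from typing import List, NamedTuple, Optional


class FieldSpec(NamedTuple):
    """One schema line. Field meanings are inferred from the example, not documented."""
    name: str
    type: str
    max_length: int
    date_format: Optional[str]


def parse_schema(lines) -> List[FieldSpec]:
    """Turn pipe-delimited schema lines into FieldSpec entries, skipping blank lines."""
    specs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split("|")
        # Assumed layout: name | type | max length | (unused) | format | (unused)
        name, ftype, length = parts[0], parts[1].lower(), int(parts[2])
        fmt = parts[4] if len(parts) > 4 and parts[4] else None
        specs.append(FieldSpec(name, ftype, length, fmt))
    return specs


SCHEMA_TEXT = """MKey|int|30|||
PKey|varchar|25|||
CNumber|varchar|25|||
CStatus|varchar|1|||
DOS|date|10||MM/DD/YYYY|
"""

specs = parse_schema(SCHEMA_TEXT.splitlines())
for spec in specs:
    print(spec)
```

Because the schema is parsed into data rather than hard-coded, the same logic can be reused for every data/schema pair.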
Moderator

Re: Validating a data file against a schema file in Hadoop

Hi,
Have you already checked the tSchemaComplianceCheck component (see the Talend Help Center)? It validates all input rows against a reference schema, or checks the types, nullability, and length of rows against reference values.
Best regards
Sabrina
One Star

Re: Validating a data file against a schema file in Hadoop

Sabrina,
tSchemaComplianceCheck needs the schema to be static, but my schema file is dynamic. I need to validate each data file against the schema file provided with it, in the format described in my original post.
Thanks
Ramakrishna
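
The dynamic-schema requirement can be illustrated with a short sketch in which the checks are driven by field specs supplied at runtime instead of a schema fixed at design time. This is plain Python, not a Talend job, and several points are assumptions: the spec tuples (name, type, maximum length, date format) mirror the example schema, empty values are treated as allowed, and validate_record, to_strptime, and DATE_TOKENS are hypothetical names.

```python
from datetime import datetime

# Assumed mapping from the schema's date tokens to strptime directives.
DATE_TOKENS = {"MM": "%m", "DD": "%d", "YYYY": "%Y"}


def to_strptime(fmt):
    """Convert a pattern such as MM/DD/YYYY into a strptime format string."""
    for token, directive in DATE_TOKENS.items():
        fmt = fmt.replace(token, directive)
    return fmt


def validate_record(line, specs):
    """Return a list of error messages; an empty list means the record passed."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(specs):
        return [f"expected {len(specs)} fields, got {len(values)}"]
    errors = []
    for value, (name, ftype, max_len, fmt) in zip(values, specs):
        if value == "":
            continue  # assumption: empty values are allowed
        if len(value) > max_len:
            errors.append(f"{name}: longer than {max_len} characters")
        if ftype == "int":
            try:
                int(value)
            except ValueError:
                errors.append(f"{name}: not an integer")
        elif ftype == "date" and fmt:
            try:
                datetime.strptime(value, to_strptime(fmt))
            except ValueError:
                errors.append(f"{name}: does not match {fmt}")
    return errors


# Specs as they might be loaded from the schema file in the original post.
SPECS = [
    ("MKey", "int", 30, None),
    ("PKey", "varchar", 25, None),
    ("CNumber", "varchar", 25, None),
    ("CStatus", "varchar", 1, None),
    ("DOS", "date", 10, "MM/DD/YYYY"),
]

print(validate_record("123111111|456||D|12/04/2014", SPECS))
print(validate_record("abc|456||DD|13/40/2014", SPECS))
```

In this sketch a record with the wrong number of fields is rejected immediately, since the per-field checks would otherwise be applied to the wrong columns.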