Can Data Quality analyse unstructured data, such as data in a CSV file?
Hi, I would like to use Data Quality (DQ) to analyse and validate data in CSV files, i.e. to highlight invalid data based on rules/constraints predefined by the user. I have read the Data Quality documentation: Talend Open Studio for DQ provides a powerful data profiling tool that lets users analyse database tables, rows and columns, with great UX design. However, I could not find anything that describes how to analyse unstructured data, such as the content of a CSV file. If DQ does not provide a way to validate data in CSV files, do you have any suggestions for how to approach my data validation goal? Since it is an open source project, is it possible to extend it to read text files and then reuse the existing data profiling components (define rules/constraints, validate, highlight invalid data)? Is this trunk the right place to look? http://www.talendforge.org/trac/top/browser/trunk