I'm new to Talend and I need some supports.
The requirement here:
Get data from Redshift, export valid/error data to CSV file.
Check all columns of each data rows if it match with schema of an existing table.
Export error data to a CSV file, infos need to export are the value of primary key, column name and error messages.
Export the valid data row to other one.
- Check data type
- Check data lenght, if it's a float check the lenght after the comma
- Check date pattern
- Check if the primary key is duplicated with any one in the getted data list.
* If a column has more error, only export to one row in CSV, error mesages are separated by comma.
* If one data row has more error columns, export to multiple rows.
** Here a big note: a data row has more than 1000 column.
Can you give some solution for this requirement ?
Sorry for my English
Thank you so much!
Solved! Go to Solution.
Watch the recorded webinar!
Accelerate your data lake projects with an agile approach
Create systems and workflow to manage clean data ingestion and data transformation.
Introduction to Talend Open Studio for Data Integration.