I have a number of Excel files (which I can't share) that I check with a tSchemaComplianceCheck before connecting to a tMap. The rejected rows are saved to an Excel file, and the error message states:
ColumnName:exceed max length
When I check the columns, the values do exceed the max length, but I believe that's because the contents of the preceding column (ColumnNameMinus1) are now in ColumnName.
Both columns are strings.
When I check the source data, ColumnNameMinus1 contains the correct value, but that value shows up in ColumnName when I ingest the file.
What could be causing the column contents to get mixed up when reading from Excel?
Which version of Talend are you using? Can you send me an example Excel file for testing?
I'm using Talend 6.3.1, but unfortunately I can't share the input data.
I thought I had added the following details to the thread:
I checked some of the files that were tripping the schema validation, and it turns out that my files have different schemas (some have 11 columns, others 14). I thought applying the schema to the tFileInputExcel component would raise an error at that stage.
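The 11-versus-14 column difference would explain the shifted values if the Excel input assigns cells to schema columns strictly by position. That is an assumption here, not something confirmed in this thread. The sketch below is plain Java rather than Talend-generated code, and `PositionalMappingDemo`, `mapByPosition`, and the sample values are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PositionalMappingDemo {
    // Assigns cells to schema columns purely by position (assumed behaviour).
    // Cells beyond the schema width are dropped; missing cells become null.
    static Map<String, String> mapByPosition(List<String> schema, List<String> cells) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < schema.size(); i++) {
            row.put(schema.get(i), i < cells.size() ? cells.get(i) : null);
        }
        return row;
    }

    public static void main(String[] args) {
        // Schema taken from the "narrow" layout.
        List<String> schema = Arrays.asList("Id", "ColumnNameMinus1", "ColumnName");
        // Row from a "wide" file that has one extra column after Id.
        List<String> wideRow = Arrays.asList("1", "extra value", "long free text", "AB");
        // Prints {Id=1, ColumnNameMinus1=extra value, ColumnName=long free text}
        System.out.println(mapByPosition(schema, wideRow));
    }
}
```

In this toy case the text meant for ColumnNameMinus1 ends up in ColumnName, which is the pattern described above: the value is correct in the source file but lands one column late, where it can exceed the max length.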
What is the best way of handling the situation where I have multiple schemas for the same type of file?
Are you referring to multiple sheets in one Excel file? Why don't you check the 'All sheets' box on tFileInputExcel to read data from all sheets? Or do you have multiple sheets with different columns?
Can you please give us some example data? It would help us understand your situation.
I have multiple Excel files with one sheet each (File1.xls, File2.xls ... File20.xls). The issue (I think) is that the schema is consistent for "most" of those files (11 columns), but for a handful the input increases to 14 columns.
I've imported the most frequent schema into the repository, but I can't seem to associate multiple schemas (for the delta schema) with the same Excel file, so it looks like I need to have multiple files.
Is the best way to handle this situation to process the reject flow from tSchemaComplianceCheck and compare it to the next most frequent schema (14 columns) by replacing the tMap in this image with a second tSchemaComplianceCheck, or is there a different way?
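One different way, sketched here under assumptions, is to decide which schema applies from the width of the header row before any validation runs. The code below is plain Java that could live in a user routine. `SchemaRouter`, `usedWidth`, `pickSchema`, and the returned labels are hypothetical names; only the column counts (11 and 14) come from this thread, and how the header cells are read from the file is left open.

```java
import java.util.List;

class SchemaRouter {
    // Width of the row up to the last non-blank cell, so trailing
    // empty cells do not inflate the count.
    static int usedWidth(List<String> cells) {
        for (int i = cells.size() - 1; i >= 0; i--) {
            String cell = cells.get(i);
            if (cell != null && !cell.trim().isEmpty()) {
                return i + 1;
            }
        }
        return 0;
    }

    // Maps the header width to a schema label (labels are illustrative).
    static String pickSchema(List<String> headerCells) {
        switch (usedWidth(headerCells)) {
            case 11:
                return "SCHEMA_11_COLUMNS";
            case 14:
                return "SCHEMA_14_COLUMNS";
            default:
                return "UNKNOWN";
        }
    }
}
```

The returned label could then drive which branch of the job handles the file, with `UNKNOWN` files set aside for manual inspection instead of being forced through either schema.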
I've tried the following to validate the input against multiple schema designs so the data is processed correctly, but I've run into problems. In the job below, the top line attempts to read an Excel file and compare it to the schema (Dec2016). I want to configure the error handling so that the next line is triggered to compare the same file against another schema (13Columns).
Would a solution be to replicate the tFileInputExcel output and test each flow against a different schema?
tFileInput -- tReplicate -- CheckSchema-Dec2016   -onComponentOk-- tMap
                        \-- CheckSchema-13Columns -onComponentOk-- tMap
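The idea of testing the same row against each schema in turn can be sketched in plain Java as follows. This is a minimal illustration under assumptions, not how Talend implements its schema check: `CandidateSchema`, `MultiSchemaCheck`, `firstViolation`, and `firstMatching` are hypothetical names, and the only rules checked are column count and maximum string length. The length message imitates the one quoted at the top of the thread.

```java
import java.util.List;
import java.util.Optional;

class CandidateSchema {
    final String name;
    final String[] columns;
    final int[] maxLengths;

    CandidateSchema(String name, String[] columns, int[] maxLengths) {
        this.name = name;
        this.columns = columns;
        this.maxLengths = maxLengths;
    }

    // Returns null when the row fits this schema, otherwise the first problem found.
    String firstViolation(List<String> cells) {
        if (cells.size() != columns.length) {
            return "expected " + columns.length + " columns, got " + cells.size();
        }
        for (int i = 0; i < columns.length; i++) {
            String value = cells.get(i);
            if (value != null && value.length() > maxLengths[i]) {
                return columns[i] + ":exceed max length";
            }
        }
        return null;
    }
}

class MultiSchemaCheck {
    // Tries each candidate in order and returns the name of the first one the row satisfies.
    static Optional<String> firstMatching(List<CandidateSchema> candidates, List<String> cells) {
        for (CandidateSchema schema : candidates) {
            if (schema.firstViolation(cells) == null) {
                return Optional.of(schema.name);
            }
        }
        return Optional.empty();
    }
}
```

Listing the most frequent schema first keeps the common case cheap, and a row that matches no candidate comes back as an empty `Optional`, so it can be written to a reject file with the violation from whichever schema is most relevant.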