In tfileInputDelimited, there is an option to check if number of columns are less or more than the defined schema in advanced settings tab. This helps a lot in rejecting rows which are not correct as per defined schema in the initial stage itself.
But I am not able to find similar option in tfileInputExcel. My excel file has 'n' columns, i want to reject the records which have columns not equal to 'n'. Is there any way to do it? I don't want to convert excel to delimited file and perform the check and tschemaComplianceCheck won't check for number of columns mismatch. Thanks in advance.
If n is equal to 4 for example, your schema should have 4 fields corresponding to column A, B, C and E.
If a record contains less than 4 columns, field E will be null.
If you suspect your Excel file to contain more than 4 columns (for example 5), you can reject all the records for which field F is not null.
This is a simple workaround to check for record compliance.
I use it for simple use cases. Maybe there is some limitations I didn't encountered yet.
Hope this helps.
I understand that's a workaround. But we can't surely tell if 5 columns will come or 100. In my scenario, i get weird source files. I think for now I will just convert excel to csv and use tfileinputdelimited which has the option to check against schema(number of columns).