Hi,

We have a scenario in which the input files have no headers and each file can have a different column structure (a different number of columns and different columns). The schema for each input file is stored in a table keyed by fileid, which we know in advance and can pass as a parameter to the job so that it retrieves the column schema from the table while processing.

Each of these files has certain columns, such as GenderId and LocationId, that have to be resolved to their names from the respective lookup tables, and the resolved data then has to be loaded into another file. The schema table also includes a column that indicates whether a given column has a crosswalk table to refer to (e.g. for resolving gender, location names, etc.). The number of tMap components needed is unknown, because the crosswalks can differ for each file type.

I believe this requires custom coding, but I wanted to check whether anyone has thoughts on how the files can be read based on the input schema in the table, the crosswalk-referenced columns resolved to their names, and the result finally loaded into an output file.
What is needed: the fileid is known before the job runs. The job has to read the input schema from the table Schema_Mapper, check whether each column has a crosswalk, join the column value with the crosswalk table, and generate the output. The final output file should look like this:

Jon,Male,Alabama,10000
Alice,Female,Florida,20000
Jane,Female,Kansas,30000
Smith,Male,California,40000

Thank you!
Hi,

Are you looking for the Talend dynamic schema feature, which allows you to design a schema with an unknown column structure (unknown names and number of columns)? Could you please take a look at the document TalendHelpCenter:How to process changing data structure? and see whether it satisfies your needs?
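If the crosswalk part ends up needing custom code, the steps described in the question (read the schema for the fileid, check each column for a crosswalk, replace the code with its name) could be sketched in plain Java roughly as below. This is only an illustration under assumptions, not a Talend API: the class and method names are made up, the schema and crosswalk data are assumed to be already loaded into memory from Schema_Mapper and the lookup tables, and the input codes (M, F, AL, FL) and the Name and Salary column names are hypothetical, since the thread does not show the raw input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SchemaDrivenResolver {

    /** One row of the schema table: column name plus crosswalk name (null if none). */
    static final class ColumnDef {
        final String name;
        final String crosswalk;

        ColumnDef(String name, String crosswalk) {
            this.name = name;
            this.crosswalk = crosswalk;
        }
    }

    /**
     * Resolves one headerless, comma-separated line against the schema of its file.
     * crosswalks maps a crosswalk name to its (code -> display name) lookup.
     */
    static String resolveLine(String line, List<ColumnDef> schema,
                              Map<String, Map<String, String>> crosswalks) {
        String[] values = line.split(",", -1); // -1 keeps trailing empty fields
        if (values.length != schema.size()) {
            throw new IllegalArgumentException(
                "Expected " + schema.size() + " columns but found " + values.length + ": " + line);
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            ColumnDef col = schema.get(i);
            String value = values[i].trim();
            if (col.crosswalk != null) {
                Map<String, String> lookup = crosswalks.getOrDefault(col.crosswalk, Map.of());
                value = lookup.getOrDefault(value, value); // unknown codes pass through unchanged
            }
            out.add(value);
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        // Hypothetical schema for one fileid; in the real job this would come from Schema_Mapper.
        List<ColumnDef> schema = List.of(
            new ColumnDef("Name", null),
            new ColumnDef("GenderId", "Gender"),
            new ColumnDef("LocationId", "Location"),
            new ColumnDef("Salary", null));

        // Hypothetical crosswalk contents; in the real job these would come from the lookup tables.
        Map<String, Map<String, String>> crosswalks = Map.of(
            "Gender", Map.of("M", "Male", "F", "Female"),
            "Location", Map.of("AL", "Alabama", "FL", "Florida"));

        for (String line : List.of("Jon,M,AL,10000", "Alice,F,FL,20000")) {
            System.out.println(resolveLine(line, schema, crosswalks));
        }
        // prints:
        // Jon,Male,Alabama,10000
        // Alice,Female,Florida,20000
    }
}
```

Because the crosswalks are looked up by name from the schema rows, the same loop handles any number of crosswalk columns per file type, which avoids having to know the number of lookups at design time. Codes that have no match are left unchanged here; rejecting such rows instead would be an equally reasonable choice.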
Best regards,
Sabrina