Loading Flat File With Dynamic Schemas

Five Stars


Hey all, I am fairly new to Talend Integration. This is my first post, and I am looking to gain some insight. We are migrating some of our current jobs from Business Objects.


So here is our scenario: we load flat files (tab delimited) from our customers. Each file has a header record containing the field names (first_name, last_name, etc.). We allow our customers to send only the fields that they require, in any order. I am a bit lost on which components I can use to read each file, dynamically figure out what the schema is, and load the data into a PostgreSQL table.


For instance, our first file might contain firstName, lastName, and email. 

The second file might contain email, uniqueId, lastName, firstName, and loginId.

The table we load this into contains the appropriate columns.  Each file could contain a different quantity of columns.  We have over 2000 files and can't build a custom mapping for each client.


I guess I am a bit confused about which components are the ideal way to make this work. I have tried using tJava, but my Java skills are not where they need to be. I have also tried using tSplitRow to create a data dictionary. Additionally, I have tried using tExtractDynamicFields, but couldn't get it to work. Any help in pointing me in the correct direction would be super appreciated!


Lastly, if I am looking for a course or book to help my Java skills, especially related to Talend, can anyone make a recommendation? I don't want to become a Java expert, but I would like to expand my knowledge from where it currently is (beginner).

Fourteen Stars

Re: Loading Flat File With Dynamic Schemas

If the target database structure is the same for all files, and you only need to handle a flexible CSV structure, the situation can be resolved.

Variant 1

  • Convert CSV to JSON for each file

This can be done with tJavaFlex (if you are familiar with Java) or with any of the available command-line tools, like:




  • Read the file row by row (document by document) and parse it with


In this case, if a column is missing from a file, it will contain NULL.
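As a rough illustration of the conversion step in Variant 1, here is a minimal plain-Java sketch that turns each data row of a tab-delimited file into one JSON document, using the header record for the keys. It is not Talend-generated code and not necessarily what the poster had in mind: the class and method names (`RowToJson`, `convert`, `toJson`, `escape`) are hypothetical, every value is written as a JSON string, and in a real job a JSON library would probably be a safer choice than hand-built strings.

```java
import java.util.ArrayList;
import java.util.List;

public class RowToJson {

    // Escape the characters that are not allowed raw inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build one JSON object from a data line, keyed by the header's field names.
    static String toJson(String[] header, String line) {
        String[] values = line.split("\t", -1); // -1 keeps trailing empty fields
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < header.length && i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(escape(header[i])).append("\":\"")
              .append(escape(values[i])).append('"');
        }
        return sb.append('}').toString();
    }

    // First line is the header record; every following non-empty line
    // becomes one JSON document.
    static List<String> convert(List<String> lines) {
        List<String> docs = new ArrayList<>();
        if (lines.isEmpty()) {
            return docs;
        }
        String[] header = lines.get(0).split("\t", -1);
        for (String line : lines.subList(1, lines.size())) {
            if (!line.isEmpty()) {
                docs.add(toJson(header, line));
            }
        }
        return docs;
    }
}
```

Because each document only carries the keys that were present in that file's header, a later parsing step that asks for a key the file did not send gets nothing back, which is where the NULL comes from.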


Variant 2

Check https://exchange.talend.com for the component tFileInputTextFlat by Jan Lolling.


Define the component's schema with all of the target columns. It then does the same as Variant 1: it extracts each column if it is present in the file, or returns NULL otherwise.

In both cases, the column names must be the same across all files, no doubt about it.
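The behaviour both variants rely on (take a column if the file has it, otherwise NULL) can be sketched in plain Java as below. This is only an illustration under assumptions, not how tFileInputTextFlat is implemented: the class name `TargetAligner`, the method `align`, and the list of target columns are hypothetical, and header names are matched exactly (after trimming), which is why the names must be identical across files.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TargetAligner {

    // Reorder one tab-delimited data line into the fixed target column order.
    // Columns that the file's header does not contain come back as null.
    static String[] align(List<String> targetColumns, String headerLine, String dataLine) {
        String[] header = headerLine.split("\t", -1);
        String[] values = dataLine.split("\t", -1);

        // Remember where each header name sits in this particular file.
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < header.length; i++) {
            position.put(header[i].trim(), i);
        }

        String[] row = new String[targetColumns.size()];
        for (int t = 0; t < row.length; t++) {
            Integer p = position.get(targetColumns.get(t));
            row[t] = (p != null && p < values.length) ? values[p] : null;
        }
        return row;
    }
}
```

With a hypothetical target list of firstName, lastName, email, uniqueId, loginId, a file whose header is only firstName, lastName, email yields rows whose last two positions are null, while a file that sends all five fields in a different order yields fully populated rows in the target order. The resulting array could then be bound to a single fixed INSERT statement for the PostgreSQL table.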


