How to process changing data structure?

 Overview

This article introduces the dynamic schema feature.

Dynamic schema allows you to design Jobs with an unknown column structure (unknown name and number of columns). If necessary, dynamic columns can be mapped directly to the target using Pass-through mode.

The main application of this functionality is a replication scenario or a simple one-to-one mapping of many columns. This feature makes designing these types of Jobs easy. For example, if you need to migrate a whole database with hundreds of tables, you can do so without explicitly including the table structure, using a single Job.

This article demonstrates the dynamic schema feature through two examples:

  • Read a table with dynamic schema 
  • Read a file with dynamic schema

Environment

This procedure is compatible with all versions of Talend subscription products.

List of components that support the dynamic schema feature

Not every component supports the dynamic schema feature. Refer to the Knowledge Base article: Which components provide the Dynamic Schema feature? to know which components provide the dynamic schema feature.

 

Procedure

Read a table with a dynamic schema 

This example will extract data from a table called person. The table has many columns and contains the following example data. The objective is to write this data to a file. 

id
name
gender
email
department
entry_date
1 Jeffe Boy jeffe@company.com R&D 2012-05-15
2 shong Boy shong@company.com Community 2011-08-07
3 Johnson Girl johnson@company.com HR 2014-04-16
4 Joyce Girl joyce@company.com Test 2012-09-10
5 Truman Boy truman@company.com Marketing 2013-02-27

In Talend Open Studio for Data Integration, the community version, you have to define each column in the schema to map the real table structure. If the table structure is unknown, it is impossible to read the table.

In Talend subscription products, there is a Dynamic type in the Type list, as seen in the following figure:

Using this type, you just need to define only one column with the Dynamic type instead of defining all columns.


Example Job

Create an example Job called ReadTableDataWithDynamicSchema. The Job contains two components:

tMysqlInput_1: read "person" data from database.

tFileOutputDelimited_1: write the data to a text file with header.

The detailed Job settings can be seen in the following figures:

tMysqlInput_1 settings:

tMysqlInput_1 schema:

tFileOutputDelimited_1 settings:

Execute the Job

Execute the Job and open the generated file (In this case, it is "D:/file/out.txt"). Verify that all data is written to the file with the correct column names. The data shows as follows:

id;name;gender;email;department;entry_date
1;Jeffe;Boy;jeffe@company.com;R&D;15-05-2012
2;shong;box;shong@company.com;Community;07-08-2011
3;Johnson;Girl;johnson@company.com;HR;16-04-2014
4;Joyce;Girl;joyce@company.com;Test;10-09-2012
5;Truman;Boy;Truman@company.com;Marketing;27-02-2013

Read a file with dynamic schema

Here is another example to explain how to read data from a file with the dynamic schema feature. In this example, you will read the file which was generated by the above scenario.

Icon

When reading data from a file with a dynamic schema, the first line of the file must be the header that will be read as the column names at runtime. In this example, the first line of the file is the header.

id;name;gender;email;department;entry_date

Example Job

Create an example Job called ReadFileDataWithDynamicSchema. The Job contains two components:

tFileInputDelimited_1: read data from the text file.

tMysqlOutput_1: create a table in database and insert the data.

 tFileInputDelimited_1 settings:

tMysqlOutput_1 settings:

Execute Job

Execute the Job. Verify that a table called person1 is created in the database and the data is inserted.

Related Files

 

  File Modified
  ZIP Archive DynamicSchemaDemoJobs.zip May 28, 2014 byShicong Hong
Version History
Revision #:
1 of 1
Last update:
‎04-17-2017 10:43 PM
Updated by:
 
Labels (1)
Contributors