A new version of Talend MDM has been released, and you wish to migrate from an older version of Talend MDM to this new version. Talend MDM is one of the few products in the Talend suite that contains both user-developed artifacts (such as models and roles) and actual business data. The other products where parallels can be drawn in terms of migration requirements are Data Preparation and Data Stewardship, but these are considerably simpler applications, and thus the migration process for these products is also simpler.
There has been considerable misunderstanding and misinformation on the topic of MDM migrations in the past. This document outlines in detail the Talend-supported approaches to MDM migration. MDM is part of a wider platform of products: Data Integration (DI), Data Quality (DQ), Enterprise Service Bus (ESB), and Big Data. You may have migration actions to perform for all of these platform components, but the scope of this article is limited to MDM.
An MDM migration will usually involve migrating your MDM objects: Models, Roles, Views, and so on. However, assuming that:
then deployment of these objects onto the new version of the MDM server is simply a case of performing the deployment using either the new version of Studio or CommandLine (for an automated deployment). You must always deploy from a Studio that is the same Talend release version as the MDM server. However, for environments where maintaining the state of the hub is important, you may also wish to migrate the contents of the hub, meaning not just the business data in your MDM model(s), but also things like:
To understand the problem further, some history is required. In v5.1 and earlier, this data was stored in an XML database—Qizx or eXist.
In v5.2, Talend switched the storage of the following MDM Containers to a relational database:
The key point is that the MDM models that you build and deploy now map to a physical, normalised, human-readable schema in the relational database, rather than to schema-less documents in a document database. How this is achieved is out of scope for this article.
The remaining internal data stores/containers, collectively known as the SYSTEM database, continued to be stored in an XML database in v5.2. In v5.3, the SYSTEM database was migrated to relational storage as well.
In v5.2 and v5.3, a mechanism was required for migrating from the XML database to the relational databases. This was provided in the form of the MDM DB migration tool. It is this tool, and the process of migrating from XML to relational storage, that is documented in the official migration guide in the MDM section; see Automatically migrating from Talend XML database to a relational database or between two relational .... However, the documented process can be considered incomplete at this time.
So, given that MDM uses a relational database for its storage, during an upgrade why can you not just point the new MDM server to the existing databases (or a clone of the existing databases)? This is absolutely not a supported approach for customers to take. On rare occasions, Talend Professional Services may use this approach with the prior approval of Talend R&D, but the circumstances where this is both possible and necessary are extremely limited. The reasons for this are as follows:
Given that Talend reserves the right to change the schema, migration must occur through the application layer, as opposed to directly within the physical storage. As the storage is a third normal form manifestation of the MDM model, it would be nearly impossible for the tool to alter the database 'on the fly'. Instead, you must go through the application layer, because the entity definition within the model (and therefore the XML representation of an instance of an entity) as used by the application layer remains identical between versions. The application layer has no concept of how the entity is physically stored; it understands only model definitions, XML documents, and relations between XML documents (foreign key relationships in the model).
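The reasoning above can be illustrated with a small, purely conceptual sketch. It is not Talend code and uses no Talend API: the two layouts, the table and column names, and the helper functions are all hypothetical, invented only to show why a record read as XML through one version's application layer can be written through another's, even when the physical schema differs.

```python
import xml.etree.ElementTree as ET

# Hypothetical physical layouts: the same logical fields stored under
# different column names in two server versions. All names are invented.
LAYOUT_OLD = {"table": "Product", "columns": {"Id": "x_id", "Name": "x_name"}}
LAYOUT_NEW = {"table": "PRODUCT", "columns": {"Id": "X_ID", "Name": "X_PRODUCT_NAME"}}


def xml_to_row(xml_text, layout):
    """Application-layer write: map an entity's XML onto a physical row."""
    root = ET.fromstring(xml_text)
    return {layout["columns"][child.tag]: child.text for child in root}


def row_to_xml(row, layout, entity):
    """Application-layer read: rebuild the entity's XML from a physical row."""
    root = ET.Element(entity)
    for field, column in layout["columns"].items():
        ET.SubElement(root, field).text = row[column]
    return ET.tostring(root, encoding="unicode")


def migrate(old_rows, old_layout, new_layout, entity):
    """Read each record as XML via the old layout, then write it via the new one."""
    return [
        xml_to_row(row_to_xml(row, old_layout, entity), new_layout)
        for row in old_rows
    ]


# The XML representation is the stable contract between versions.
record = "<Product><Id>1</Id><Name>Widget</Name></Product>"
old_row = xml_to_row(record, LAYOUT_OLD)
new_rows = migrate([old_row], LAYOUT_OLD, LAYOUT_NEW, "Product")
```

In this sketch the physical rows differ between the two layouts, yet reading the migrated row back through the new layout yields the original XML document. Copying `old_row` directly into the new storage would not work, because the column names no longer match.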
These instructions continue in MDM Version Upgrade Methodology Part 2, which defines the assumptions, prerequisites, and supported approaches to MDM migration.
 MDM does not currently have the concept of an MDM artifact (binary) that can be properly versioned in a Nexus repository in line with the other developed artifacts (such as Jobs and routes) that can be built using Talend. This is a feature request: https://jira.talendforge.org/browse/PMMDM-261, but not a critical one, because an MDM publish event is much less common than a DI or ESB publish event.
 This article uses the term database in the manner accepted by most databases, except in sections dealing specifically with Oracle, where schema will be used.