MDM Version Upgrade Methodology Part 2

Note: This article is a continuation of MDM Version Upgrade Methodology Part 1.

 

Assumptions

  1. The user of this methodology fully understands how to install and configure a Talend MDM server or cluster. This includes:

    1. Performing the install (only using the Talend installer)
    2. Configuring database connectivity using datasources.xml and mdm.conf, and especially how the settings in these files map the various containers to physical relational databases, whether pre-existing or created by MDM
    3. How MDM can create databases (non-Oracle) using the init section in datasources.xml
    4. Configuring memory parameters
    5. Configuring port bindings
    6. Optional: Configuring SSL
    7. Usage of the <schema-generation> modes on MASTER, STAGING and SYSTEM databases, especially CREATE mode to clear down these databases.
    8. Usage of the ${container} notation, plus prefixes and postfixes and the implications for the databases required. For example, if you have only one model deployed, called myCustomer, and you configure your datasources.xml file to create TMDM_${container}_MASTER_641 and TMDM_${container}_STAGING_641, then the physical databases created will include:

      1. TMDM_myCustomer_MASTER_641
      2. TMDM_myCustomer_STAGING_641
      3. TMDM_UpdateReport_MASTER_641
      4. TMDM_CrossReferencing_MASTER_641
      5. Plus whatever has been specified for the system database
    9. Cluster: Configuration of an external JMS queue
    10. Cluster: Configuration of Hazelcast for Auto Increment generation
    11. Cluster: Configuration of Tomcat
    12. Cluster: The provision of a suitable load balancer
  2. A suitable server is provided to install the new MDM server. This could be:

    1. The existing server, if enough resources such as memory are available to allow both instances of Talend MDM to run side by side (if data migration is required). In this case, choose a different port binding for the new server so that it does not conflict with the old server.
    2. A new server that can communicate with the existing server over a specified TCP/IP port—usually 8180 but this is fully configurable according to preferences (again, if data migration is required).
  3. Suitable databases are created (or DBA rights are given to create databases, which is unusual in a production environment). If you are not familiar with which databases to create, you should not continue with this procedure. Best practice is usually to include a version number in each database name, for example MDM_PARTY_MASTER_561 or MDM_SYSTEM_561, or to have a different database server instance per MDM server and version.
  4. The machine hosting the new server also has a significant amount of free memory to allow the DB migration tool to be run (if data migration is required).
  5. Suitable backups of all existing MDM databases have been made and can be restored if required. This includes a backup of the XML database if migrating from 5.2 or earlier.
  6. If data migration is required, no changes should have been applied to any existing MDM models that have not been properly manifested in the underlying physical storage on the old version of Talend MDM. Model changes are a complex topic in their own right, and version migration is not the right time to address these issues. Talend Professional Services can assist with this topic. Failure to observe this point is a common cause for MDM migration failures. Talend recommends that you deploy the existing model on a sandbox machine of the same version/patch level as the current MDM server, and compare the schema DDL to the existing environment. This will highlight if any model changes have been made that have not been properly manifested in the existing physical schema.
  7. For environments where a full migration (including data) is not required, for example development and test environments as opposed to Production or Production mirror, it is expected that there will be a Job or set of Jobs that load data into the MDM hub to get it back to a known state. If these Jobs do not exist, this is a sign of poor development/project practice.

IF YOU ARE UNSURE ABOUT ANY OF THESE ISSUES, ENGAGE WITH A SUITABLE EXPERT FROM TALEND PROFESSIONAL SERVICES BEFORE ATTEMPTING MIGRATION.
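
The ${container} expansion described in assumption 1.8 can be illustrated with a short Python sketch. It is not part of Talend MDM. The function name and the list of extra containers are assumptions taken from the example in this article, and the system database is deliberately left out because its name is whatever you specify separately.

```python
from string import Template

# Containers that get a MASTER database in addition to the user models,
# as listed in the example in this article (illustrative assumption).
EXTRA_MASTER_CONTAINERS = ["UpdateReport", "CrossReferencing"]


def expand_database_names(master_pattern, staging_pattern, models):
    """Return the physical database names implied by ${container} patterns."""
    names = []
    for model in models:
        # Each user model gets a MASTER and a STAGING database.
        names.append(Template(master_pattern).substitute(container=model))
        names.append(Template(staging_pattern).substitute(container=model))
    for container in EXTRA_MASTER_CONTAINERS:
        # These containers only get a MASTER database in the example.
        names.append(Template(master_pattern).substitute(container=container))
    return names


if __name__ == "__main__":
    for name in expand_database_names(
        "TMDM_${container}_MASTER_641",
        "TMDM_${container}_STAGING_641",
        ["myCustomer"],
    ):
        print(name)
```

Running the sketch with the patterns from assumption 1.8 prints the four database names listed there, which gives a quick checklist of the databases to request from a DBA.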

 

General migration prerequisites (all approaches)

  • Define the target 6.x architecture and ensure license compliance.
  • Ensure product prerequisites are met by target architecture—OS Versions, Database Versions, machine resources, and so on.
  • Identify scheduled ETL tasks that run as part of MDM process (related to TAC upgrade strategy). Be prepared to pause these tasks and arrange environment downtime.
  • Identify and disable any real-time services that interact with Talend MDM for the duration of the migration.
  • Identify the correct set of MDM artifacts (there are different versions/names/objects deployed to different MDM servers in the various environments).
  • Confirm that no ETL logic is performing a direct read of the MDM database. If it is, flag these jobs as potentially requiring rework.
  • Migrate relevant ETL Jobs to v6.x metadata if migrating from v5; see the MDM input/output component documentation. Recompiled Jobs require regression testing. This is an ideal opportunity to add Talend Test Cases to your Jobs/services/routes.

 

Simple Migration: Just the objects, no data, possibly users

This is the simplest form of migration. It does not require old and new servers to run at the same time. However, MASTER, STAGING and Journal data will be lost (unless you have DI Jobs to reload an empty hub).

  1. Ahead of the migration, export a copy of the MDM objects (such as model and views) from Studio, and test deployment on a sandbox environment (a new Talend version including any MDM patches you plan to deploy). This test will determine if the objects deploy successfully on the new version—this is not always the case, as each subsequent version often improves validation of the model to ensure that only correct models are deployed. Just because the objects deploy on one version does not mean they will deploy on a newer version, if correct practices have not been observed. If the objects do not deploy on the sandbox, they will need to be corrected before being deployed on the new environment. This could involve a ‘model change’ scenario, which is out of scope for this document.
  2. Plan the migration for the whole platform as usual, including required hardware, databases, networking/firewall rules, temp licenses, and so on.
  3. Perform the install. Ideally, this will be into a sandbox environment first, though this may not always be possible. Follow standard procedures for platform migration.
  4. If you need to keep the existing MDM users, export the contents of the PROVISIONING container from the existing server using Studio (Export content from MDM Server). Depending on your version, it may also be possible to do this export using the CommandLine command mExportDataContainer. [1]
  5. If you need to keep existing hierarchy definitions, export the contents of the SearchTemplate container, again using Studio or CommandLine.
  6. Use the client–server reconciliation feature in the old Studio, if available, to see what is actually deployed to the MDM server, and keep this list for later. It is usually best to use a 'clean' (empty) Studio project to do this. If following good practice, this list will already be well defined for any environment other than Development. It is important to have a good understanding of the release management and deployment strategy here, especially in relation to MDM model changes. For more information, see Assumptions.
  7. To begin the MDM migration, you will need:

    1. All new Talend software installed
    2. TAC configured
    3. Ability to log in to the migrated existing MDM project using the new version of Studio (usually Development environment only)
    4. For non-Development environments, availability of Studio or CommandLine, to be used for deployment to MDM
  8. Deploy all MDM objects that were previously deployed on the old version to the new MDM Server using the list you previously compiled. This can be the opportunity to clean up unused objects by not deploying them to the new environment. The complete list of object types that can be deployed to MDM is as follows:

    Name                    Deprecated on version 6.x?
    Custom Layout           No
    Data Container          No
    Data Model              No
    Process                 No
    Trigger                 No
    Jobs                    No (be careful to deploy only Jobs designed to execute on the MDM server, for example a Job that is called by a Trigger)
    Match Rule              No
    Menu                    No
    Resource                No, but rarely used
    Role                    No
    Service Configuration   Yes
    Stored Procedure        Yes
    View                    No
    Workflow                No
    Versions                Yes
    Sync Plans              Yes

  9. If the project contains any deprecated items, contact Talend Professional Services.

  10. If required, import the old contents of the PROVISIONING or SearchTemplate containers into the new server (using the Studio option Import Content to MDM server) to get the users and hierarchies back in the system.
  11. Run any DI load jobs to reload the hub.
  12. TEST!
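
Steps 8 and 9 amount to checking the reconciliation list against the object-type table. The following Python sketch shows one way to do that. The function name and the (object type, name) pair format are hypothetical; only the set of deprecated types comes from the table above.

```python
# Object types marked as deprecated on version 6.x in the table above.
DEPRECATED_ON_6X = {
    "Service Configuration",
    "Stored Procedure",
    "Versions",
    "Sync Plans",
}


def split_deployment_list(deployed_objects):
    """Split (object_type, name) pairs into deployable and deprecated lists."""
    deployable, deprecated = [], []
    for object_type, name in deployed_objects:
        # Anything of a deprecated type needs attention before migration.
        target = deprecated if object_type in DEPRECATED_ON_6X else deployable
        target.append((object_type, name))
    return deployable, deprecated


if __name__ == "__main__":
    # Hypothetical list compiled from the client-server reconciliation step.
    deployed = [
        ("Data Model", "myCustomer"),
        ("View", "Browse_items_Customer"),
        ("Stored Procedure", "CountCustomers"),
    ]
    deployable, deprecated = split_deployment_list(deployed)
    print("Deploy:", deployable)
    print("Raise with Talend Professional Services:", deprecated)
```

Keeping the deprecated types in a single set means the check stays aligned with the table if the list is ever revised.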

 

Full Migration: Migrate everything

Before you begin

  • For the migration to run successfully, you must be upgrading to v6.3.1 (with a patch available from Talend support) or v6.4.1 (or higher).
  • All changes to the source MDM system must be frozen. No users, Jobs, or services should be accessing either old or new servers during migration.
  • The migration will transfer records at approximately 500 records a second. The following formula will allow you to calculate downtime:

    (No. of records in all user containers to be migrated + No. of journal entries) / 500 = time in seconds predicted for migration downtime

    For example:

    (1,000,000 + 1,500,000) / 500 = 5000 seconds (approximately 83 minutes)

    If the amount of time taken is problematic due to volumes, Talend Professional Services can provide an alternative approach.

     

  • If you are migrating from v5.6.1 or 5.6.2, you are required to install a patch on the v5 server for the migration to be successful. See https://jira.talendforge.org/browse/TPS-970 for v5.6.1 or https://jira.talendforge.org/browse/TPS-971 for v5.6.2.
  • If migrating from v6.0.1, you may also need a patch for the migration to proceed—contact Talend Support.
  • To migrate from a version prior to 5.6.1, you may need to first migrate to v5.6.1. It is recommended that you contact Talend Professional Services for assistance with this. Direct upgrades from v5.5 may work (and have been tested to work in real customer scenarios) but the ability to do a direct upgrade is not guaranteed by Talend R&D.
  • Should a clustered environment be the target, the cluster should be configured and all nodes should be running. Only one server will be used as the migration target.
  • Ensure the old server is ‘healthy’ and that there are no errors in the MDM logs.
  • Plan the migration for the whole platform as usual, including required hardware, databases, networking rules, temp licenses, and so on.
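
The downtime formula given in this list can be scripted for quick what-if estimates. This is a minimal sketch: the function and parameter names are illustrative, and the default of 500 records per second is the approximate rate quoted above, so treat the result as a rough planning figure rather than a guarantee.

```python
def estimate_downtime_seconds(container_records, journal_entries,
                              records_per_second=500):
    """Estimate migration downtime in seconds from record volumes."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    # Total volume is user container records plus journal entries.
    total_records = container_records + journal_entries
    return total_records / records_per_second


if __name__ == "__main__":
    # The worked example from this article.
    seconds = estimate_downtime_seconds(1_000_000, 1_500_000)
    print(f"{seconds:.0f} seconds, about {seconds / 60:.0f} minutes")
```

If a test migration on a sandbox shows a different throughput, pass the measured rate as records_per_second to refine the estimate.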

 

Method

  1. As with the Simple Method, it is recommended that you deploy the MDM model and other objects to a sandbox before attempting to do the migration itself.
  2. On the Target server:

    1. Stop the MDM service, if running
    2. Edit mdm.conf and set subscription.engine.autostart=false
    3. Edit log4j.xml and add the following log category:

      <category name="com.amalto.core.server.routing">
       <priority value="FATAL" />
      </category>
    4. Restart the MDM server
  3. Start the migration. Ideally, this will be into a sandbox environment first, to test that the dbmigration tool is able to run successfully. However, this may not always be possible, so follow standard procedures for platform migration. Both MDM servers need to be running and able to communicate with each other. A v5 to v6 migration will communicate from the new server to the old server over the JNDI port (1199 by default). A v6 to v6 migration will communicate over the HTTP or HTTPS ports (8180 and 8443 respectively, by default).
  4. Bringing up the new MDM server, you can either:

    1. Configure it with the rights to create databases, or
    2. Create the correct set of databases in advance, before the server is started[2]. Oracle must have schemas created in advance[3].
  5. The target MDM databases (including system) should be empty, apart from the default constructs created on first start-up of the server. In other words, the new databases (if they exist yet) should not be clones of the old ones and no objects should have been deployed. If this is not the case, clean them up manually using database tooling or scripts—delete all objects from all MDM databases (including the system DB) while the server is not running.
  6. Use the DB migration tool to migrate your MDM configuration and data from the old version to the new version. Use the version of the tool installed with the new server, not the old one. To do this:

    1. Configure the dbmigration.properties file:

      In Talend_Root/mdm/tools/dbmigration, you will find a file called dbmigration.properties.template. Copy this file and call it dbmigration.properties.

    2. Edit the file and configure it as follows:

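      V5 to V6

      [Image: v5tov6.png, showing the dbmigration.properties settings for a V5 to V6 migration]

      V6 to V6

      [Image: v6tov6.png, showing the dbmigration.properties settings for a V6 to V6 migration]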

      You may need to change the users/passwords if they are not still the defaults. The v6 servers require a user with the administrator and System_Admin roles. The v5 server uses the special deployment user. The to.home parameter is the root MDM install folder.

    3. Run the dbmigration tool using the following commands:

      Windows: dbmigration.bat dbmigration.properties -i
      Linux: ./dbmigration.sh dbmigration.properties -i

      This will run the dbmigration tool in interactive mode.

    4. The tool will ask you if you wish to migrate the database. Select Y.
    5. Now it will ask you about each container in turn (mostly system containers). For specific containers, you will be asked about each entity in turn. Answer yes to all containers except:

      Name                                    Description
      amaltoOBJECTSFailedRoutingOrderV2       Failed events (adds no value post-migration and potentially slow to migrate)
      amaltoOBJECTSLicense                    The license
      amaltoOBJECTSCompletedRoutingOrderV2    Successful events (adds no value post-migration and potentially slow to migrate)
      MDMItemsTrash                           The recycle bin (optional; keep if project requirements dictate it be maintained)

    6. Exclude any user models that you do not wish to migrate.
    7. Check the process and the dbmigration.log file for any errors. Occasionally, roles may fail to migrate, but this can be addressed using the PROVISIONING container technique detailed earlier in this document.
    8. Check the MDM logs on source and target servers for errors.
    9. Check the counts of all records to ensure that the old and new servers reconcile.

    Certain containers may take time to migrate (see calculation above).

  7. On the Target server:

    1. Stop the MDM service if running
    2. Edit mdm.conf and set subscription.engine.autostart=true
    3. Edit log4j.xml and remove the following log category:

      <category name="com.amalto.core.server.routing">
       <priority value="FATAL" />
      </category>
    4. Restart the MDM server
    5. Optional: If running the two MDM servers on the same machine, you may need to shut down the old server and assign the ports from the old server to the new one.
  8. Deploy any Jobs that should be deployed to MDM.
  9. Deploy any workflows that should be deployed to MDM. Workflow instance data migration is out of scope for this document; see the Bonita documentation if you require this. Depending on the version you are upgrading from, changes may be required to make workflows function.
  10. If you will be storing pictures on your MDM server (which is not recommended), follow the steps detailed in Moving the pictures and web resources in the Talend migration guide.
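
The record count check in step 6.9 can be sketched in Python as follows. How the counts are gathered is left open: the sketch assumes you have already collected a count per container from each server (for example with SQL queries), and the function name and dictionary layout are hypothetical. Containers that were deliberately not migrated are ignored so they do not show up as false mismatches.

```python
# Containers this article suggests answering "no" to during the interactive run.
# MDMItemsTrash is optional, so it is not skipped by default here.
SKIPPED_CONTAINERS = frozenset({
    "amaltoOBJECTSFailedRoutingOrderV2",
    "amaltoOBJECTSLicense",
    "amaltoOBJECTSCompletedRoutingOrderV2",
})


def reconcile_counts(source_counts, target_counts, skipped=SKIPPED_CONTAINERS):
    """Return {container: (source, target)} for every count that differs."""
    mismatches = {}
    for container, source_count in source_counts.items():
        if container in skipped:
            continue  # Not migrated on purpose, so no comparison is needed.
        # A container missing on the target counts as zero records.
        target_count = target_counts.get(container, 0)
        if source_count != target_count:
            mismatches[container] = (source_count, target_count)
    return mismatches


if __name__ == "__main__":
    # Hypothetical counts collected from the old and new servers.
    old = {"myCustomer": 1_000_000, "UpdateReport": 1_500_000,
           "amaltoOBJECTSLicense": 1}
    new = {"myCustomer": 1_000_000, "UpdateReport": 1_499_990}
    print(reconcile_counts(old, new))
```

An empty result means the two servers reconcile for every migrated container. If the recycle bin was also excluded, pass a skipped set that includes MDMItemsTrash.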

 

Special Case (advanced)

Use this process if the model contains self-referencing entities. These are entities that refer to themselves using a foreign key relationship.

Before running the DB migration tool

  1. Deploy the model and views onto the new server (deploy the model, then the views, then the model again).
  2. Disable constraints in the MASTER database. For example, on SQL Server, run the command:

    EXEC sp_msforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT all'
  3. Run the DB migration tool using the steps above.
  4. Enable constraints in the MASTER database. For example, on SQL Server, run the command:

    EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT all'
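
To decide whether this special case applies, it can help to list the self-referencing entities first. The Python sketch below assumes a simplified, hypothetical view of the model: a dictionary mapping each entity name to the entities its foreign keys point at. It does not read a Talend data model directly.

```python
def find_self_referencing_entities(foreign_keys):
    """Return entity names with at least one foreign key pointing at themselves.

    foreign_keys maps an entity name to the list of entity names that its
    foreign keys target (a hypothetical, simplified view of the model).
    """
    return sorted(
        entity for entity, targets in foreign_keys.items() if entity in targets
    )


if __name__ == "__main__":
    # Hypothetical model: Employee has a manager who is also an Employee.
    model = {
        "Employee": ["Employee", "Department"],
        "Department": [],
        "Product": ["ProductFamily"],
        "ProductFamily": [],
    }
    print(find_self_referencing_entities(model))
```

If the function returns an empty list for your model, the standard Method should be sufficient.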

 

 

[1] Why shouldn't you use this technique to migrate all containers, including both user and system containers? Studio has a 10k record export limit, so this is usually not viable.

[2] On SQL Server it is a good idea to create the databases in advance. This allows you to run a script that will avoid deadlock issues:

ALTER DATABASE <database name>
   SET READ_COMMITTED_SNAPSHOT ON
   WITH ROLLBACK IMMEDIATE;

If you don’t create the databases in advance, the migration will generally still work, but this SQL should be run post-migration.
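
When several MDM databases are involved, the statement from footnote [2] has to be repeated for each one. The following Python sketch generates that script from a list of database names. The function name is illustrative, and the database names in the example follow the naming pattern suggested under Assumptions; the script only builds text and does not connect to SQL Server.

```python
def snapshot_isolation_script(database_names):
    """Build one ALTER DATABASE statement per database, as in footnote [2]."""
    statements = []
    for name in database_names:
        # Bracket-quote the identifier; a literal ] is escaped by doubling it.
        quoted = "[" + name.replace("]", "]]") + "]"
        statements.append(
            f"ALTER DATABASE {quoted}\n"
            "   SET READ_COMMITTED_SNAPSHOT ON\n"
            "   WITH ROLLBACK IMMEDIATE;"
        )
    return "\n\n".join(statements)


if __name__ == "__main__":
    print(snapshot_isolation_script(["MDM_PARTY_MASTER_561", "MDM_SYSTEM_561"]))
```

Review the generated script with your DBA before running it, since WITH ROLLBACK IMMEDIATE rolls back open transactions on each database.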

[3] On Oracle, you do not use the ${container} notation. You hard code the schemas in datasources.xml. Therefore, Oracle only supports one MDM model per MDM server.

Version history
Revision #: 14 of 14
Last update: 09-19-2017 04:41 PM