The Data Migration tool uses a collection of Python 3 files.

Install the Data Migration Tool

To install the Data Migration tool, download migration_tool.zip and unzip it.

This example shows the unzipped structure:

tool/
    main.py
    readme.txt
    requirements.txt
    src/
    tool.json

Set Up the python3 Environment

  1. To verify that python3 is installed on your system, enter:

    which python3

    If python3 is not found, then install python3 by entering: sudo yum install python36 -y

  2. To verify that the pip3 package manager is installed, enter:

    which pip3

    If the pip3 package manager is not found, see Installing Packages and enter: sudo python3 -m pip install --upgrade pip setuptools wheel

  3. To install the libraries that the migration Python script requires, enter this command within the Data Migration tool directory:

    pip3 install -r requirements.txt --user
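The two lookup steps above can also be scripted. The following is a minimal sketch, not part of the Data Migration tool: the function name find_tools is hypothetical, and it only reports whether python3 and pip3 are on the PATH, in the same way that which does.

```python
import shutil


def find_tools(names=("python3", "pip3")):
    """Map each executable name to its full path, or None if it is not on PATH."""
    # shutil.which performs the same PATH lookup as the `which` command.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path if path else 'not found'}")
```

If either entry is reported as not found, follow the corresponding install command from the steps above before installing the requirements.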

Configure the Data Migration Tool

Before you can run the migration script, you must configure it properly. All configuration is stored in the tool/tool.json file in JSON format. These sections provide guidance on how to configure each property.

Clusters

The clusters section defines a collection of Events Service clusters. Each cluster has the following properties:

Property          Description
api_url           URL pointing to one of the Events Service API nodes or its load balancer.
certificate_file  Path to the PEM file.
check_hostname    (Optional) Indicates whether to check the hostname when verifying the certificate. Default is true.
es_url            URL pointing to one of the Elasticsearch master nodes.
es_url_internal   Internal URL pointing to one of the Elasticsearch master nodes. Used by other clusters for remote re-indexing.
es_version        Elasticsearch version of the cluster, given as a number (for example, 2 or 6).
keys              Controller and OPS keys for the Events Service. These keys are located in the conf/events-service-api-store.properties file.

This example defines three Events Service clusters: es2, es6, and xpack_es6.

es6 has SSL enabled, while xpack_es6 has Elasticsearch X-Pack enabled.

{
  "clusters": {
    "es2": {
      "keys": {
        "CONTROLLER": "27410b11-296a-49e1-b2d2-d2371ab94d64",
        "OPS": "45c25bad-636c-432f-b0bd-f8ec428c8db4"
      },
      "api_url": "35.162.126.253:9080",
      "es_url": "35.162.126.253:9200",
      "es_url_internal": "172.31.12.185:9200",
      "es_version": 2
    },
    "es6": {
      "keys": {
        "CONTROLLER": "7db43bff-97d3-4d5e-828a-a2eacb693e07",
        "OPS": "ac7424d1-ae96-4e10-ad82-a2eca50db133"
      },
      "api_url": "https://34.209.245.68:9080",
      "certificate_file": "/Users/jun.zhai/es6.pem",
      "check_hostname": false,
      "es_url": "34.209.245.68:9200",
      "es_version": 6
    },
    "xpack_es6": {
      "keys": {
        "CONTROLLER": "07b055f8-a97b-4ccb-a239-2267d452c4ea",
        "OPS": "c582cc8a-f3bc-419f-a42a-7bb8adad05b8"
      },
      "api_url": "52.89.86.93:9080",
      "es_url": "http://elastic:1234@52.89.86.93:9200",
      "es_version": 6
    }
  }
}
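Before running the migration, it can help to check the clusters section for missing properties. The sketch below is an illustration only and is not part of the tool. It assumes that the properties present in all three example clusters (api_url, es_url, es_version, and keys with CONTROLLER and OPS) are required, and that the others are optional. The function name validate_clusters and the sample values are hypothetical.

```python
REQUIRED_PROPERTIES = ("api_url", "es_url", "es_version", "keys")  # assumption
REQUIRED_KEYS = ("CONTROLLER", "OPS")  # assumption


def validate_clusters(config):
    """Return a list of problems found in config['clusters'] (empty if none)."""
    problems = []
    clusters = config.get("clusters", {})
    if not clusters:
        problems.append("no clusters defined")
    for name, cluster in clusters.items():
        for prop in REQUIRED_PROPERTIES:
            if prop not in cluster:
                problems.append(f"{name}: missing {prop}")
        for key in REQUIRED_KEYS:
            if key not in cluster.get("keys", {}):
                problems.append(f"{name}: missing keys.{key}")
    return problems


if __name__ == "__main__":
    # Placeholder values; in practice, load tool/tool.json with json.load.
    sample = {
        "clusters": {
            "es2": {
                "keys": {"CONTROLLER": "controller-key", "OPS": "ops-key"},
                "api_url": "events.example.com:9080",
                "es_version": 2,
            }
        }
    }
    for problem in validate_clusters(sample):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets you fix every cluster entry in a single pass.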

Migration

This section describes the migration properties:

Property                          Description
accounts                          (Optional) Specifies which accounts to migrate. Defaults to everything in the source Events Service.
search_hits                       Maximum number of documents fetched in the Elasticsearch query. Default is 5000.
remote_reindex_concurrency        Maximum number of remote re-index tasks launched concurrently. Default is 4.
remote_reindex_scroll_batch_size  Batch size for remote re-index. Default is 8000. See Re-index API.
reindex_task_polling_interval     Interval, in seconds, between status checks of ongoing remote re-index tasks. Default is 60.
starting_max_fields_per_index     Maximum number of fields allowed when creating a new index. Default is 1000. The value should be the same as ad.es.event.index.startingMaxFieldsPerIndex in the conf/events-service-api-store.properties file.
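To see which values apply when a property is omitted, the documented defaults can be overlaid with the migration section of tool.json. This is a sketch under the assumption that an omitted property simply takes the default listed above. The tool's own handling may differ, and the names MIGRATION_DEFAULTS and effective_migration_settings are hypothetical.

```python
# Defaults as listed in the property descriptions above.
MIGRATION_DEFAULTS = {
    "search_hits": 5000,
    "remote_reindex_concurrency": 4,
    "remote_reindex_scroll_batch_size": 8000,
    "reindex_task_polling_interval": 60,
    "starting_max_fields_per_index": 1000,
}


def effective_migration_settings(config):
    """Overlay the user's migration section on the documented defaults."""
    settings = dict(MIGRATION_DEFAULTS)
    settings.update(config.get("migration", {}))
    return settings


if __name__ == "__main__":
    user_config = {"migration": {"remote_reindex_concurrency": 6}}
    for name, value in effective_migration_settings(user_config).items():
        print(f"{name} = {value}")
```

The accounts property has no entry in the defaults dictionary because omitting it means that everything in the source Events Service is migrated.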

Migration Properties Examples:

Example 1: Migrate everything from the source Events Service:

{
  "migration": {
    "search_hits": 5000,
    "remote_reindex_concurrency": 6,
    "remote_reindex_scroll_batch_size": 8000,
    "reindex_task_polling_interval": 60,
    "starting_max_fields_per_index": 1000
  }
}

Example 2: Migrate all event types in account customer15_9611293a-c56f-4c9a-aa11-9f6bffcb42ce, and only the log_v1 and custom_event event types in account customer1_229f6fbf-b42f-4d66-a56b-a2324d8b169d. This example does not migrate any other accounts.

{
  "migration": {
    "accounts": {
      "customer15_9611293a-c56f-4c9a-aa11-9f6bffcb42ce": [],
      "customer1_229f6fbf-b42f-4d66-a56b-a2324d8b169d": [
        "log_v1",
        "custom_event"
      ]
    },
    "search_hits": 5000,
    "remote_reindex_concurrency": 4,
    "remote_reindex_scroll_batch_size": 8000,
    "reindex_task_polling_interval": 60,
    "starting_max_fields_per_index": 1000
  }
}
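Because the configuration must be valid JSON, it is worth parsing the file before you run the migration. Python's json module is strict and rejects, for example, a trailing comma after the last item in an object. The sketch below is illustrative only: load_config and describe_selection are hypothetical helpers, and the reading of an empty list as "all event types" follows Example 2.

```python
import json


def load_config(text):
    """Parse tool.json content; raises json.JSONDecodeError on invalid JSON."""
    return json.loads(text)


def describe_selection(config):
    """Summarize which accounts and event types the accounts property selects."""
    accounts = config.get("migration", {}).get("accounts")
    if not accounts:
        # accounts omitted: everything in the source Events Service is migrated.
        return ["all accounts, all event types"]
    return [
        f"{account}: {', '.join(event_types) if event_types else 'all event types'}"
        for account, event_types in accounts.items()
    ]


if __name__ == "__main__":
    text = '{"migration": {"accounts": {"acct_a": [], "acct_b": ["log_v1"]}}}'
    try:
        for line in describe_selection(load_config(text)):
            print(line)
    except json.JSONDecodeError as error:
        print(f"Invalid JSON: {error}")
```

To check the real file, pass the contents of tool/tool.json to load_config instead of the inline sample string.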