Multi-Tenant Elasticsearch Index Names Support in DXP 7.2 FP8/SP3 and DXP 7.3+: Important Changes for Search Tunings and Workflow Metrics

Environment

  • Liferay DXP 7.3 and 7.4
  • Liferay DXP 7.2 Service Pack 3+/Fix Pack 8+ (SP3+/FP8+)
  • Elasticsearch 6.x/7.x
  • Liferay Connector to Elasticsearch 7 v3.1.0 (for DXP 7.2)

Who should read this article?

  • Administrators managing existing DXP 7.2 deployments who plan to move to the SP3+/FP8+ patch level or to upgrade to DXP 7.3/7.4, especially where any of the following features are in use: Search Tunings (Result Rankings, Synonym Sets) or Workflow Metrics
  • Solution architects designing new DXP 7.2 deployments on SP3+/FP8+ or on DXP 7.3/7.4

Introduction

The DXP search framework creates its Company Indexes (the indexes created for the Virtual Instance) with a prefix. The default prefix is liferay-, so normally you can find indexes in Elasticsearch with names like

liferay-0
liferay-20101

where 20101 is the companyId of the given Company in your database. It is displayed as "Instance ID" in the UI and identifies the Virtual Instance.

Since DXP 7.0 it is possible to customize the index name prefix using the "Index Name Prefix" (or indexNamePrefix if it's done through a .config file) property of the Elasticsearch 6/7 connector configuration in System Settings.

The Index Name Prefix configuration is important if you plan to build a multi-tenant Elasticsearch cluster to serve Liferay DXP deployments.

What is a multi-tenant DXP-Elasticsearch stack?

A DXP-Elasticsearch stack is multi-tenant if a single Elasticsearch cluster hosts the indexes of multiple DXP deployments. These deployments can be different production deployments or different environments of the same deployment.

Consider the following scenarios:
A.) One Liferay DXP deployment (single node or clustered) with multiple Virtual Instances
B.) Multiple DXP deployments (single or clustered), each with one or more Virtual Instances

No matter which architecture you have, you need a standalone Elasticsearch.

In the case of B.) you could set up a separate Elasticsearch cluster for each DXP deployment, but you may prefer to maintain a single Elasticsearch cluster. You need a way to differentiate the indexes and avoid mixing data from different deployments if the companyId of the multiple instances is the same.

If you have two DXP deployments, provide unique values to the "Index Name Prefix" Elasticsearch 6/7 connector configuration:

  • Public DXP instance: indexNamePrefix=liferay-dxp-public
  • Private DXP instance: indexNamePrefix=liferay-dxp-private

Now consider case A.): Liferay DXP ensures that the companyId is unique for each Virtual Instance within a deployment. However, as in case B.), you may want to connect your TEST, UAT, or DEV environments to the same Elasticsearch cluster that serves your production deployment (Liferay recommends against this). If the production database is copied into these internal environments, the companyId values will be the same. The index names then match, and data from different environments can get mixed.

The solution is to use the "Index Name Prefix" configuration:

  • indexNamePrefix=prod-
  • indexNamePrefix=uat-
  • indexNamePrefix=test-
  • indexNamePrefix=dev- (and so on for other environments)
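
To make the collision risk concrete, here is a minimal Python sketch. It assumes, as described in the Introduction, that a company index name is the prefix followed by the companyId. The environment labels and the helper names (company_index_name, find_collisions) are hypothetical and exist only for this illustration; nothing here talks to Elasticsearch.

```python
from collections import defaultdict


def company_index_name(prefix, company_id):
    """Build a company index name: <indexNamePrefix><companyId>."""
    return f"{prefix}{company_id}"


def find_collisions(environments):
    """Return index names that more than one environment would write to.

    `environments` maps a (hypothetical) environment label to a tuple of
    (indexNamePrefix, list of companyIds).
    """
    owners = defaultdict(list)
    for env, (prefix, company_ids) in environments.items():
        for company_id in company_ids:
            owners[company_index_name(prefix, company_id)].append(env)
    return {name: envs for name, envs in owners.items() if len(envs) > 1}


# UAT restored from a production database copy: same companyId, same prefix.
shared_prefix = {
    "prod": ("liferay-", [20101]),
    "uat": ("liferay-", [20101]),
}
# Same data, but each environment has its own prefix.
unique_prefixes = {
    "prod": ("prod-", [20101]),
    "uat": ("uat-", [20101]),
}

print(find_collisions(shared_prefix))    # {'liferay-20101': ['prod', 'uat']}
print(find_collisions(unique_prefixes))  # {}
```

With a shared prefix both environments resolve to liferay-20101; with unique prefixes the names no longer overlap.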

Multi-tenancy in DXP 7.0 and DXP 7.1

As mentioned earlier, all you have to do on these versions is to configure the desired prefix in your Elasticsearch 6/7 connector configuration through the indexNamePrefix property.

For a new deployment, the indexes will be created with the specified prefix at the first startup. For an existing deployment, you must perform a full reindex to drop and re-create the indexes with the new name.

Multi-tenancy in DXP 7.2 prior to SP3/FP8

In DXP 7.2 we introduced new APIs so developers can create indexes in Elasticsearch through Liferay DXP's Search API.

The first internal feature leveraging this was Workflow Metrics (available as of DXP 7.2 GA1). Later, Result Rankings and Synonym Sets were also built on these APIs (available as of DXP 7.2 SP1/FP2). That means there are now multiple application-specific indexes separate from the default Virtual Instance (aka. company) indexes. (See What are the out-of-the-box indexes created in Elasticsearch on DXP 7.2 SP2 and prior patch levels? for the list of out-of-the-box indexes on DXP 7.2 SP2/FP5.)

We call these indexes application-driven or application-contributed indexes. These indexes are managed differently than the company indexes: they are not dropped and re-created upon a full reindex and they operate with a different set of settings and mappings*. In addition, features like Result Rankings utilize the search index as a primary data storage: there are no database tables for the ranking entries.

Prior to the SP3/FP8 patch level, the previously described "Index Name Prefix" is not applied to the out-of-the-box app-driven indexes. In addition, some of these indexes are "multi-company", which means they store data from multiple Virtual Instances, and their documents use a companyId or index field to separate the data.

*: There is a Feature Request to allow configuration for app-driven indexes.

Multi-tenancy in DXP 7.2 SP3+/FP8+ and DXP 7.3+

We understand the importance of supporting multi-tenant deployments, so the Search and Workflow product teams collaborated to introduce the necessary changes in DXP 7.2 SP3/FP8, 7.3 and 7.4 to apply the prefix and to create per-company app-driven indexes.

What's changing in the app-driven indexes in DXP 7.2 SP3/FP8

The table below provides a summary of changes that affect all deployments where the default "Index Name Prefix" configuration is in use:

Old Index Name (SP2) | Index Name (SP3+/FP8+) | Requires Reindex | Notes
--- | --- | --- | ---
liferay-search-tuning-rankings | liferay-<companyId>-search-tuning-rankings | No | There will be a separate Result Rankings index created for each Virtual Instance. The old index is deleted automatically after the data has been migrated to the new per-company index(es) upon next startup.
liferay-search-tuning-synonyms-liferay-<companyId> (as of SP2/FP5) | liferay-<companyId>-search-tuning-synonyms | No | Just like the old one, the new index acts as a "helper" index to preserve the Synonym Sets across full reindex operations. Synonyms are primarily stored inside the index settings of the given Virtual Instance index in Elasticsearch. You can delete the old index after verifying that your Synonym Sets appear in the Search Tuning admin.
workflow-metrics-* | liferay-<companyId>-workflow-metrics-* | No | There will be a separate index created for each Workflow Metrics index type for each Virtual Instance. You can delete the old indexes after verifying the data.

If you are interested in how the existing data gets migrated to the new indexes, see How does the data get migrated from the old indexes to the new indexes? in the Questions & Troubleshooting section below. For the complete list of app-driven indexes in DXP 7.2 SP3/FP8, refer to What are the out-of-the-box search indexes created in Elasticsearch on DXP 7.2 SP3+/FP8+ and DXP 7.3+? in the same section.
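
The renaming summarized in the table can be sketched in Python as a lookup from old to new index names. This is only an illustration for the default liferay- prefix. The Workflow Metrics index types are taken from the SP2 index list in the Questions & Troubleshooting section, and the function name renamed_indexes is hypothetical.

```python
# Workflow Metrics index types on DXP 7.2 (from the SP2 index list).
WORKFLOW_METRICS_TYPES = (
    "instances", "nodes", "processes",
    "sla-process-results", "sla-task-results", "tokens",
)


def renamed_indexes(company_id):
    """Map each pre-SP3/FP8 index name to its per-company SP3+/FP8+ name.

    Covers only the default "liferay-" Index Name Prefix.
    """
    base = f"liferay-{company_id}"
    mapping = {
        "liferay-search-tuning-rankings": f"{base}-search-tuning-rankings",
        f"liferay-search-tuning-synonyms-{base}": f"{base}-search-tuning-synonyms",
    }
    for index_type in WORKFLOW_METRICS_TYPES:
        old_name = f"workflow-metrics-{index_type}"
        mapping[old_name] = f"{base}-{old_name}"
    return mapping


for old_name, new_name in renamed_indexes(20101).items():
    print(f"{old_name} -> {new_name}")
```

Because the old Result Rankings and Workflow Metrics indexes are multi-company, the same old name maps to a different new name for each companyId you pass in.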

Before moving to DXP 7.2 SP3+/FP8+ patch level

Prerequisites: DXP 7.2 SP3/FP8 requires installing a new version of the Liferay Connector to Elasticsearch 7 (v3.1.0).

If you were using any of the features mentioned above in Liferay DXP 7.2 on a patch level prior to SP3/FP8, we recommend creating a snapshot of the old indexes before moving to the SP3+/FP8+ patch level, so you can restore them later if necessary. You can use the Snapshot and Restore feature in Kibana 7.x. If you are on Elastic Stack 6.x, follow the related documentation for Elasticsearch 6.x.

Questions & Troubleshooting

Below we have collected some questions and answers to give you more information about this topic.

What are the out-of-the-box indexes created in Elasticsearch on DXP 7.2 SP2 and prior patch levels?

Assuming you have only one Virtual Instance (aka. company) and you are using the default indexNamePrefix Elasticsearch connector configuration, the index names look like this:

liferay-0 // system index
liferay-<companyId> // company index
workflow-metrics-instances // app-driven index, multi-company
workflow-metrics-nodes // app-driven index, multi-company
workflow-metrics-processes // app-driven index, multi-company
workflow-metrics-sla-process-results // app-driven index, multi-company
workflow-metrics-sla-task-results // app-driven index, multi-company
workflow-metrics-tokens // app-driven index, multi-company
liferay-search-tuning-rankings // app-driven index, multi-company
liferay-search-tuning-synonyms-liferay-<companyId> // app-driven index, as of DXP 7.2 SP2/FP5 (LPS-100272)

Which indexes are affected and how do I back them up before moving to DXP 7.2 SP3/FP8 or after upgrading to DXP 7.3? Is it safe to delete them?

If you are on DXP 7.2 SP3+/FP8+ patch level or recently upgraded to DXP 7.3 and you were using Liferay DXP 7.2 on previous patch levels, you will have the following old indexes left in Elasticsearch:

liferay-search-tuning-synonyms-liferay-<companyId>
workflow-metrics-instances
workflow-metrics-nodes
workflow-metrics-processes
workflow-metrics-sla-process-results
workflow-metrics-sla-task-results
workflow-metrics-tokens

What about liferay-search-tuning-rankings? This index is deleted automatically at startup after the data has been migrated to the new, per-company index.

We recommend creating a snapshot of these indexes so you can restore them later if necessary. You can use the Snapshot and Restore feature in Kibana 7.x. If you are on Elastic Stack 6.x, follow the related documentation for Elasticsearch 6.x.

After verifying that the affected features work properly, you can delete these indexes through the Delete Index REST API of Elasticsearch (DELETE /<index-name>, optionally after closing them with the _close API) or through the Index Management tool in Kibana.
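
As a rough sketch of the backup and cleanup steps, the Python below only builds the requests and does not send them. It assumes the default liferay- prefix and a snapshot repository that is already registered in Elasticsearch. The repository name dxp_backups, the snapshot name pre-cleanup, and all function names are hypothetical. The request shapes follow the Elasticsearch Create Snapshot API (PUT /_snapshot/<repository>/<snapshot>) and the Delete Index API (DELETE /<index>).

```python
import json

WORKFLOW_METRICS_TYPES = (
    "instances", "nodes", "processes",
    "sla-process-results", "sla-task-results", "tokens",
)


def leftover_indexes(company_ids):
    """Old index names that remain after the migration (default prefix)."""
    names = [
        f"liferay-search-tuning-synonyms-liferay-{company_id}"
        for company_id in company_ids
    ]
    names += [f"workflow-metrics-{t}" for t in WORKFLOW_METRICS_TYPES]
    return names


def snapshot_request(repository, snapshot, indexes):
    """Build a Create Snapshot call restricted to the given indexes."""
    path = f"/_snapshot/{repository}/{snapshot}?wait_for_completion=true"
    body = {"indices": ",".join(indexes), "include_global_state": False}
    return "PUT", path, body


def delete_requests(indexes):
    """Build one Delete Index call per old index."""
    return [("DELETE", f"/{name}") for name in indexes]


indexes = leftover_indexes([20101])
method, path, body = snapshot_request("dxp_backups", "pre-cleanup", indexes)
print(method, path)
print(json.dumps(body, indent=2))
for method, path in delete_requests(indexes):
    print(method, path)
```

The printed lines can be pasted into the Dev Tools Console in Kibana. Run the DELETE calls only after the snapshot has completed and the affected features have been verified.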

What are the out-of-the-box search indexes created in Elasticsearch on DXP 7.2 SP3+/FP8+ and DXP 7.3+?

If you have only one Virtual Instance (aka. company) and you are using the default indexNamePrefix Elasticsearch connector configuration, the index names look like this:

DXP 7.2 SP3+/FP8+

liferay-0
liferay-<companyId>
liferay-<companyId>-search-tuning-rankings
liferay-<companyId>-search-tuning-synonyms
liferay-<companyId>-workflow-metrics-instances
liferay-<companyId>-workflow-metrics-nodes
liferay-<companyId>-workflow-metrics-processes
liferay-<companyId>-workflow-metrics-sla-process-results
liferay-<companyId>-workflow-metrics-sla-task-results
liferay-<companyId>-workflow-metrics-tokens

DXP 7.3

liferay-0
liferay-<companyId>
liferay-<companyId>-search-tuning-rankings
liferay-<companyId>-search-tuning-synonyms
liferay-<companyId>-workflow-metrics-instances
liferay-<companyId>-workflow-metrics-nodes
liferay-<companyId>-workflow-metrics-processes
liferay-<companyId>-workflow-metrics-sla-instance-results
liferay-<companyId>-workflow-metrics-sla-task-results
liferay-<companyId>-workflow-metrics-tasks
liferay-<companyId>-workflow-metrics-transitions
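
To check a cluster against these lists, a small Python sketch like the following could compare the expected names with the names actually present (for example, as copied from the output of GET _cat/indices). The suffix lists are transcribed from the two listings above. The function names and the sample "actual" list are hypothetical, and applying a non-default prefix uniformly to every name is an assumption based on the dxp- example later in this article.

```python
# Suffixes shared by both versions, then the version-specific ones.
COMMON_SUFFIXES = [
    "search-tuning-rankings",
    "search-tuning-synonyms",
    "workflow-metrics-instances",
    "workflow-metrics-nodes",
    "workflow-metrics-processes",
    "workflow-metrics-sla-task-results",
]
VERSION_SUFFIXES = {
    "7.2": ["workflow-metrics-sla-process-results", "workflow-metrics-tokens"],
    "7.3": [
        "workflow-metrics-sla-instance-results",
        "workflow-metrics-tasks",
        "workflow-metrics-transitions",
    ],
}


def expected_indexes(version, company_id, prefix="liferay-"):
    """Expected index names for one Virtual Instance on the given version."""
    base = f"{prefix}{company_id}"
    suffixes = COMMON_SUFFIXES + VERSION_SUFFIXES[version]
    return {f"{prefix}0", base} | {f"{base}-{suffix}" for suffix in suffixes}


def missing_indexes(version, company_id, actual, prefix="liferay-"):
    """Expected names that do not appear in the `actual` list."""
    return sorted(expected_indexes(version, company_id, prefix) - set(actual))


# Hypothetical cluster state: the system and company indexes exist so far.
actual = ["liferay-0", "liferay-20101"]
for name in missing_indexes("7.3", 20101, actual):
    print("missing:", name)
```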

How does the data get migrated from the old indexes to the new indexes?

Result Rankings

There is a BackgroundTask (BT) component called RankingIndexCreationBackgroundTaskExecutor registered at startup. The BT framework executes the task once there is an available executor thread. If the search engine is Elasticsearch, the import is triggered and handled by SingleIndexToMultipleIndexImporterImpl. The import is done in two phases. First, it creates the new indexes in Elasticsearch, one for each Virtual Instance (if they don't exist yet). Then it loads the documents for the given Virtual Instance from the old index (liferay-search-tuning-rankings) and indexes them into the new index through a bulk request. At the end of the import process, if there were no errors, it deletes the old index.
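
The two phases can be pictured with a small in-memory Python simulation. This is not Liferay's implementation: it works on plain dictionaries, the function name migrate_rankings and the "name" field are hypothetical, and "no errors" is approximated as "every old document ended up in a new index". Only the "index" field, which holds the company index name, comes from this article.

```python
def migrate_rankings(old_docs, company_index_names):
    """Simulate the two-phase import on plain Python data."""
    # Phase 1: one (initially empty) rankings index per Virtual Instance.
    new_indexes = {
        f"{name}-search-tuning-rankings": [] for name in company_index_names
    }
    # Phase 2: copy each instance's documents, selected by the "index" field.
    for name in company_index_names:
        docs = [doc for doc in old_docs if doc.get("index") == name]
        new_indexes[f"{name}-search-tuning-rankings"].extend(docs)
    # Assumption: "no errors" means every old document found a new home.
    migrated = sum(len(docs) for docs in new_indexes.values())
    delete_old_index = migrated == len(old_docs)
    return new_indexes, delete_old_index


old_docs = [
    {"index": "liferay-20101", "name": "ranking A"},
    {"index": "liferay-20101", "name": "ranking B"},
    {"index": "liferay-35001", "name": "ranking C"},
]
new_indexes, delete_old_index = migrate_rankings(
    old_docs, ["liferay-20101", "liferay-35001"]
)
for index_name, docs in new_indexes.items():
    print(index_name, len(docs))
print("delete old index:", delete_old_index)
```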

Synonym Sets

There is a PortalInstanceLifecycleListener component called SynonymSetIndexCreationPortalInstanceLifecycleListener which is listening on the portalInstanceRegistered event fired for each Virtual Instance (company) at startup. If it doesn't exist yet, it creates the new index in Elasticsearch for the given Virtual Instance, then loads the synonym sets from the index settings of the given Virtual Instance index. Each set is added as a Document to the new index.

Workflow Metrics

There is a PortalInstanceLifecycleListener component called WorkflowMetricsPortalInstanceLifecycleListener which is listening on the portalInstanceRegistered event fired for each Virtual Instance (company) at startup. If it doesn't exist yet, it creates the new index in Elasticsearch for the given Virtual Instance through the BaseWorkflowMetricsIndexer and the actual indexers for each Workflow Metrics index type. Then it performs a reindex on these indexes to re-create the documents based on corresponding records from the database.

Can we revert to an old patch level (prior to SP3/FP8)?

Partially: there is no mechanism to move data from the "new" per-company indexes back to the old, multi-company index for Result Rankings. For Synonyms and Workflow Metrics, a full reindex is necessary.

If you created a snapshot of the old Result Rankings index, you can restore it and use Liferay DXP 7.2 on the old patch level, but keep in mind that any new ranking entries you may have created on the SP3+/FP8+ patch level will be missing from your snapshot. You need to re-add them through the UI or create the index documents manually through the Reindex API of Elasticsearch.

Do we need to perform a full reindex after moving to SP3/FP8 patch level?

As described above, these changes do not require a full reindex. However, SP3/FP8 may introduce other significant changes that do require reindexing all search indexes. Please consult the Highlights of SP3 and the Important Changes of FP8 for details.

How to verify in Elasticsearch that all documents have been migrated to the new indexes?

Use the _search REST API of Elasticsearch: invoke it through curl or the Dev Tools Console in Kibana to query the number of documents belonging to a certain Virtual Instance (company) from the old "multi-company" indexes (in the case of Result Rankings and the Workflow Metrics indexes).

Here is an example:

  1. Go to Control Panel - Configuration - Virtual Instances
  2. Obtain the companyId (referred to as "Instance ID" in the UI) for each of your instances. For example, you have one with the ID 20101.
  3. Run the following query in Elasticsearch:
GET workflow-metrics-instances/_search
{
  "query": {
    "term": {
      "companyId": {
        "value": "20101"
      }
    }
  },
  "size": 0
}

Or as a curl request:

curl -XGET "http://localhost:9200/workflow-metrics-instances/_search" -H 'Content-Type: application/json' -d'{  "query": {    "term": {      "companyId": {        "value": "20101"      }    }  },  "size": 0}'

The result looks something like this:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

The response shows that there are 2 documents for this company inside the workflow-metrics-instances index. You can do the same with the other workflow-metrics-* indexes.

For Result Rankings, the name of the field to search in is index and the value is liferay-<companyId> (liferay-20101 in this example):

GET liferay-search-tuning-rankings/_search
{
  "query": {
    "term": {
      "index": {
        "value": "liferay-20101"
      }
    }
  },
  "size": 0
}

Now that you have the old document counts, you can query the new, per-company indexes for Result Rankings (liferay-20101-search-tuning-rankings) and Workflow Metrics (liferay-20101-workflow-metrics-*) the same way. The document counts should match.
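
If you prefer to script the comparison, a minimal Python sketch could look like the following. It builds the same term query as above and reads the count from a _search response. The function names are hypothetical and the sample responses are hand-written in the shape shown earlier. Note that Elasticsearch 7.x returns hits.total as an object with value and relation, while 6.x returns a plain number. On 7.x, a relation other than "eq" means the value is only a lower bound, so the sketch refuses to treat such counts as a match.

```python
def count_query(field, value):
    """Term query with size 0: only the hit count is returned."""
    return {"query": {"term": {field: {"value": value}}}, "size": 0}


def total_hits(response):
    """Return (count, exact) from a _search response."""
    total = response["hits"]["total"]
    if isinstance(total, dict):  # Elasticsearch 7.x response shape
        return total["value"], total["relation"] == "eq"
    return total, True  # Elasticsearch 6.x returns a plain number


def migrated_completely(old_response, new_response):
    """True only if both counts are exact and equal."""
    old_count, old_exact = total_hits(old_response)
    new_count, new_exact = total_hits(new_response)
    return old_exact and new_exact and old_count == new_count


# Hand-written sample responses in the shape shown in this article.
old = {"hits": {"total": {"value": 2, "relation": "eq"}, "max_score": None, "hits": []}}
new = {"hits": {"total": {"value": 2, "relation": "eq"}, "max_score": None, "hits": []}}

print(count_query("companyId", "20101"))
print(migrated_completely(old, new))  # True
```

For indexes with more than 10,000 matching documents on Elasticsearch 7.x, adding "track_total_hits": true to the request body makes the reported total exact.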

Note: If you were on the SP2/FP5 patch level already, the synonym sets are already stored in a per-company index. Prior to that patch level, there is no index for synonyms.

How to change the Index Name Prefix configuration on existing deployments?

  • Shut DXP down, then edit the index name prefix via a .config file.

  • Perform a full reindex after the next portal startup. This creates the indexes with the new prefix and reindexes the system index, the Virtual Instance indexes, and the Workflow Metrics indexes. Old indexes are not deleted.

  • Manually migrate existing data (documents) from the Result Rankings and Synonyms indexes to the new indexes, and update the index field in each of the existing Result Rankings documents to point to the corresponding index (with the new name) for the given Virtual Instance (for example, "index": "liferay-20101" -> "index": "dxp-20101"). Without these extra steps, existing synonym sets and ranking entries will be missing from the new indexes and therefore won't be applied to search queries.

To migrate the data and update the index field for Result Rankings, you can use the Reindex API of Elasticsearch. If you change the indexNamePrefix property from liferay- to dxp-, the API call looks like this:

POST _reindex/
{
  "dest": {
    "index": "dxp-20101-search-tuning-rankings"
  },
  "source": {
    "index": "liferay-20101-search-tuning-rankings"
  },
  "script": {
    "lang": "painless",
    "source": """
            ctx._source.index = 'dxp-20101';
          """
  }
}

The example response from Elasticsearch below shows that it created 1 document in the new index:

{
  "took" : 13,
  "timed_out" : false,
  "total" : 1,
  "updated" : 0,
  "created" : 1,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

To do the same for Synonyms, replace rankings with synonyms in the source and dest index names, remove the script block and execute the API call again.
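
When several Virtual Instances are involved, generating the request bodies can save some typing. The Python sketch below builds the _reindex body for one companyId, for either Result Rankings or Synonyms, following the pattern shown above. The function name reindex_body and its parameters are hypothetical, and the sketch only prints the JSON rather than calling Elasticsearch.

```python
import json


def reindex_body(company_id, old_prefix, new_prefix, kind):
    """Build the _reindex request body for one Virtual Instance.

    kind is "rankings" or "synonyms". Only rankings documents carry an
    "index" field that has to be rewritten to the new company index name.
    """
    if kind not in ("rankings", "synonyms"):
        raise ValueError(f"unsupported kind: {kind}")
    body = {
        "source": {"index": f"{old_prefix}{company_id}-search-tuning-{kind}"},
        "dest": {"index": f"{new_prefix}{company_id}-search-tuning-{kind}"},
    }
    if kind == "rankings":
        body["script"] = {
            "lang": "painless",
            "source": f"ctx._source.index = '{new_prefix}{company_id}';",
        }
    return body


for kind in ("rankings", "synonyms"):
    print("POST _reindex")
    print(json.dumps(reindex_body(20101, "liferay-", "dxp-", kind), indent=2))
```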

How does it affect custom code and projects?

If you have custom code that relies on searching in the specified old indexes, you should update your modules to query the new indexes.

How does it affect DXP 7.3+?

The necessary changes are included in DXP 7.3 GA1. After upgrading from DXP 7.2 SP2 or prior patch level to DXP 7.3 or 7.4, the new indexes will be created automatically (just like on DXP 7.2 SP3+) and you will need to perform a full reindex, which is a standard action after upgrades. The old indexes can be deleted after the upgrade is successfully completed.

Does it affect DXP 7.1 and 7.0?

No. There are no application-driven indexes out-of-the-box in DXP 7.1 and DXP 7.0.

Can these changes be delivered in a Hotfix?

No.

Does it affect Solr?

No. Search Tunings and Workflow Metrics are not supported on Solr and the underlying Search APIs for index creation and management are not implemented in the Solr search engine connectors.

  • LPS-117702: Multi-Tenant Elasticsearch Index Names
  • LPS-117703: Synonyms index: follow standard naming pattern
  • LPS-117704: Result Rankings index: follow standard naming pattern
  • LPS-117706: As a Portal Admin, I want Workflow Metrics to create an index per company 7.2
  • LPS-118286: As a Portal Admin, I want Workflow Metrics to follow the Multi-Tenant index naming convention
  • LPS-101291: Results Rankings index should be per virtual instance, not shared
  • LPS-100272: Reindexing search indices deletes all Synonym Sets