Sites built on Liferay DXP often feature lots of content split over lots of asset types. Web content articles, documents and media files, and blog entries are just a few examples. Most content types in Liferay DXP are assets. Under the hood, assets use the Asset API and have an Indexer class. Any content that has these features can be searched in the Liferay DXP Search application.
Searching the Index, not the Database
Liferay DXP stores its data in a database. You might incorrectly assume that you’re directly searching the database when you use Liferay DXP’s Search application. Liferay DXP instead leverages a search engine for its search capabilities. Using a search engine like Elasticsearch lets you convert searchable entities into documents. Documents are created and added to the index at the same time they’re added to the database. They’re also updated whenever the database is updated, and deleted from the index when the backing entity is deleted from the database. When you enter a search term in Liferay DXP, nothing happens in the database, because the documents are already indexed on the Elasticsearch server, and that’s where the search is executed.
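To make the relationship between the database and the index concrete, here is a minimal sketch in Python. It is not Liferay DXP or Elasticsearch code: the `IndexedStore` class and its methods are hypothetical names invented for illustration, and the "index" is a plain in-memory dictionary. It only shows the idea that every write updates both stores, while a search reads the index alone.

```python
class IndexedStore:
    """Toy store that keeps a search index in step with a 'database'.

    Hypothetical illustration only; not a Liferay DXP or Elasticsearch API.
    """

    def __init__(self):
        self.database = {}  # entity id -> text (stands in for the database)
        self.index = {}     # term -> set of entity ids (stands in for the index)

    def _unindex(self, entity_id):
        # Remove every trace of the entity from the index.
        for ids in self.index.values():
            ids.discard(entity_id)

    def save(self, entity_id, text):
        # Add or update: write to the database and reindex at the same time.
        self._unindex(entity_id)
        self.database[entity_id] = text
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(entity_id)

    def delete(self, entity_id):
        # Deleting the backing entity also deletes its indexed document.
        self.database.pop(entity_id, None)
        self._unindex(entity_id)

    def search(self, term):
        # Only the index is consulted; the database is never read here.
        return sorted(self.index.get(term.lower(), set()))


store = IndexedStore()
store.save(1, "Search in Liferay")
store.save(2, "Liferay blogs entry")
print(store.search("liferay"))  # [1, 2]

store.save(2, "Blog about cats")  # update: entity 2 no longer mentions Liferay
store.delete(1)                   # delete: entity 1 leaves the index too
print(store.search("liferay"))  # []
```

A real search engine does far more (analysis, scoring, distribution across nodes), but the division of labor is the same: writes touch both stores, searches touch only the index.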
It’s worth adding the complexity of a search engine, rather than searching the database directly, for performance reasons and for some of the features that search engines provide, like algorithms that give you the ability to use relevancy scores. For more technical details, see the Introduction to Search developer article.
Leveraging Elasticsearch in Liferay DXP
The default search engine used by Liferay DXP is Elasticsearch, which is backed by the Lucene search library. There’s an Elasticsearch server embedded in Liferay DXP bundles, which is handy for testing and development purposes. In production environments, you must install a separate, remote Elasticsearch server (or even better, a cluster of servers). For information on how to set up Elasticsearch for Liferay DXP in production, read the deployment guide.
Liferay DXP Search Features
Searching is simple and straightforward. Find a search portlet (there’s one embedded in every page by default), enter a term in its search bar, and press Enter.
A results page is displayed. If there are hits to search engine documents, you’ll see them as search results in the right-hand column. In the left-hand column you’ll see search facets.
The search bar, search results, and search facets make up three powerful features in Liferay DXP’s search UI.
Search Bar
The search bar is simple: it’s where you enter search terms. Search terms are the text you send to the search engine to match against the documents in the index. The documents that are returned are where this gets interesting.
Search Results and Relevance
The search term is processed by an algorithm in the search engine, and search results are returned to users in order of relevance. Relevance is determined by a document’s score, generated against the search query. The higher the score, the more relevant a document is considered. The particular relevance algorithm depends on the search engine in use (Elasticsearch by default).
In short, answers to questions like those below determine the relevance score of a hit (matching document):
- How many times does the search term appear in a document’s field?
- How many times does the search term appear in the same field of all the other documents in the index?
- How long is the field where the term appears?
If the search term appears with greater frequency in the field of one document than is the case for the same field in other documents, the score will be higher. However, if it’s a long field (like a content field for a Blogs Entry document) then the presence of the search term is discounted. Its presence in a shorter field (like a title field) produces a higher relevance score. See the Search Results article for a longer discussion of relevance.
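The three questions above can be sketched as a toy scoring function. This is not the formula Elasticsearch uses. It is a simplified TF-IDF-style calculation with field-length normalization, and the `score` function and the sample titles are assumptions made up for illustration. It only demonstrates the direction of each effect: more occurrences raise the score, terms common across the index lower it, and longer fields lower it.

```python
import math


def score(term, field_tokens, all_fields):
    """Toy relevance score for one term against one document field.

    Hypothetical sketch; not Elasticsearch's actual similarity formula.
    """
    # How many times does the term appear in this document's field?
    tf = field_tokens.count(term)
    if tf == 0:
        return 0.0
    # How many documents have the term in the same field?
    docs_with_term = sum(1 for tokens in all_fields if term in tokens)
    idf = 1.0 + math.log(len(all_fields) / docs_with_term)
    # How long is the field? Longer fields discount each match.
    length_norm = 1.0 / math.sqrt(len(field_tokens))
    return math.sqrt(tf) * idf * length_norm


# The same field (a title) from three made-up documents, already tokenized.
titles = [
    ["liferay", "search"],
    ["liferay", "search", "features", "and", "facets", "in", "depth", "explained"],
    ["blogs", "entry"],
]

short_title = score("search", titles[0], titles)
long_title = score("search", titles[1], titles)
no_match = score("search", titles[2], titles)

# Hits are returned in order of relevance: highest score first.
ranked = sorted(range(len(titles)), key=lambda i: score("search", titles[i], titles), reverse=True)
print(ranked)  # [0, 1, 2]
```

Here the short title outranks the long one even though both contain the term once, because the match makes up a larger share of the shorter field.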
Search Facets
Facets are a core feature of the Liferay DXP Search application.
Facets allow users of the Search application to filter search results. Think of facets as buckets that hold similar search results. You might want to see the results in all the buckets, but after scanning the results, you might decide that the results of just one bucket better represent what you’re looking for. So what facets are included in Liferay DXP by default?
- Site
- Asset type
- Asset tag
- Asset category
- Folder
- User
- Modified date
You’ve probably used something similar on any number of sites, especially with online commerce. You search for an item, are presented with a list of results, and a list of buckets you can click to further refine the search results, without entering additional search terms. Search facets work the same way in Liferay DXP. Facets are, of course, configurable.
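The bucket idea can be illustrated with a short Python sketch. The result records, the field names (`asset_type`, `site`), and the two helper functions are hypothetical and do not reflect how Liferay DXP stores or queries facets. The sketch only shows that a facet groups an existing result set by a field value, and that choosing a bucket narrows the results without a new search term.

```python
from collections import Counter

# Made-up search results; the field names are assumptions for illustration.
results = [
    {"title": "Welcome post", "asset_type": "Blogs Entry", "site": "Guest"},
    {"title": "Release notes", "asset_type": "Web Content Article", "site": "Guest"},
    {"title": "Team update", "asset_type": "Blogs Entry", "site": "Intranet"},
    {"title": "Handbook.pdf", "asset_type": "Document", "site": "Intranet"},
]


def facet_counts(results, field):
    """Count how many results fall into each bucket of the given facet."""
    return Counter(result[field] for result in results)


def apply_facet(results, field, value):
    """Keep only the results in the chosen bucket."""
    return [result for result in results if result[field] == value]


print(facet_counts(results, "asset_type"))
# Counter({'Blogs Entry': 2, 'Web Content Article': 1, 'Document': 1})

blogs_only = apply_facet(results, "asset_type", "Blogs Entry")
print([result["title"] for result in blogs_only])
# ['Welcome post', 'Team update']
```

Facets can also be combined: applying the hypothetical `site` facet to `blogs_only` would narrow the list further, which mirrors clicking a second bucket in the search UI.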