Augmented search FAQs
Augmented Search UI
Styling
How can I customize the styling of the Augmented Search UI?
Instant Search
How can I change the number of letters that trigger instant search?
How can I deactivate instant search?
Facets
Can I use hierarchical facets with categories?
Are facets conjunctive or disjunctive?
Can I index custom properties of type categories?
Search box
Can I use quotes to search for exact matches (for example, “my exact search”)?
Can I use boolean operators in the search box, for example AND, OR, or NOT?
Is autocomplete supported on terms that are most searched?
Highlighting
Elasticsearch and indexing
Infrastructure
Can I use the same Elasticsearch cluster for different Jahia platforms?
Indexation
Can I index content that is not displayed in full page?
Can I boost a custom property of a content type in the definitions.cnd file?
Can I exclude some content from being indexed, like the Google no-index option?
Can I add a negative boost on some content?
Can I set a negative boost for some content types?
Can I define boosts by website?
What analyzer does Augmented Search use?
Can I use different analyzers?
Which language is used for lemmatization?
How can I display the facet for a custom field?
How can I display categories facets?
Can I add facets on custom properties? Or are they added automatically?
Can I customize Elasticsearch settings and mappings?
How is the fuzzy match configured?
How is content indexed in augmented search?
Definitions.cnd and Elasticsearch configurations
Legacy components
Does the default behavior display all matching results or can I set a default limit?
Upgrading from JCR search
Can I have one site running with JCR search and another one with augmented search?
External Data Provider (EDP)
How can I index content from the External Data Provider?
Answers
How can I customize the styling of the Augmented Search UI?
You can customize the Augmented Search UI module by applying CSS from another template that defines the styling that you want to use. You can also fork the Augmented Search UI module.
How can I change the number of letters that trigger instant search?
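In the SearchView.jsx file of the Augmented Search UI module, modify the value of the debounceLength parameter on the SearchBox component. In the search-ui library, debounceLength is the delay in milliseconds that the search box waits after the last keystroke before sending a request, rather than a number of letters. To set the minimum number of characters that return results, see the min_gram property under "Can I customize Elasticsearch settings and mappings?".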
<SearchBox
searchAsYouType
debounceLength={100}
/>
See search-ui SearchBox documentation for more information on the SearchBox component and parameters.
How can I deactivate instant search?
Remove the searchAsYouType attribute from the SearchBox component in the SearchView.jsx file.
<SearchBox/>
See search-ui SearchBox documentation for more information on the SearchBox component and parameters.
Can I use hierarchical facets with categories?
In Augmented Search 1.0, hierarchical facets are not supported.
Are facets conjunctive or disjunctive?
Facets are disjunctive by default. Disjunctive facets let users select one or more facet values to filter search results, whereas conjunctive facets let users select only one.
Can I index custom properties of type categories?
No, only j:defaultCategory properties from jmix:categorized are indexed as such.
Can I use quotes to search for exact matches (for example, “my exact search”)?
No, using quotes to get an exact match is not supported by augmented search. However, if a search result matches the exact order of the searched words, it will be boosted automatically.
Can I use boolean operators in the search box, for example AND, OR, or NOT?
No, boolean operators are not supported by augmented search.
Is autocomplete supported on terms that are most searched?
No, autocomplete is usually performed on terms that are the most relevant, rather than terms that are most used.
How does the highlighting work? Is there a match between the language used for lemmatization and the highlighting?
Yes.
Can I use the same Elasticsearch cluster for different Jahia platforms?
Yes, you can add a prefix for each index, so one prefix per platform.
Should I use content-based search or page-based search? Can I combine page-based search with content-based search?
Yes. Depending on your site and business requirements, you can configure one part of your website with page-based search by using a filter on the path, and index the rest of the website using content-based search.
Can I index content that is not displayed in full page?
Yes. Content is indexed by node type and sub type.
Can I boost a custom property of a content type in the definitions.cnd file?
No, it’s not possible to boost custom fields.
Can I exclude some content from being indexed, like the Google no-index option?
This is not possible with Augmented Search 1.0.
Can I add a negative boost on some content?
This is not possible with Augmented Search 1.0.
Can I set a negative boost for some content types?
This is not possible with Augmented Search 1.0.
Can I define boosts by website?
No, indexation is done at the platform level and all sites are affected.
Can I configure synonyms?
Yes. You can configure synonyms using standard Elasticsearch configuration.
What analyzer does Augmented Search use?
Each language uses its own index and dedicated analyzer.
Can I use different analyzers?
Yes. You can configure analyzers and stemmers by modifying the OSGi properties of the Augmented Search module. You can do this in the configuration file, in the Karaf console, or in Jahia Tools.
Which language is used for lemmatization?
The indexation process does not use lemmatization by default, as Elasticsearch and Lucene only provide stemming out-of-the-box.
How can I display the facet for a custom field?
In jgql:nodes.audiences.keywords, add your field to the Elasticsearch mapping. If your field has a namespace, surround the namespace and field in quotes.
How can I display categories facets?
Add the jgql:categories.keyword field to your query. This creates a facet based on the category's title in the index language.
{
  jcr(workspace: LIVE) {
    searches(siteKey: "digitall", language: "en", workspace: LIVE) {
      search(q: "alternative",
        facets: {
          term: {field: "jgql:categories.keyword", minDocCount: 1}
        }) {
        facets {
          data {
            ... on TermValue {
              count
              value
            }
          }
        }
        hits {
          displayableName
          id
        }
      }
    }
  }
}
The search will return something like this.
"search": {
  "facets": [
    {
      "data": [
        {
          "count": 2,
          "value": "Categories"
        },
        {
          "count": 1,
          "value": "Annual Filings"
        },
        {
          "count": 1,
          "value": "Companies"
        },
        {
          "count": 1,
          "value": "Goods"
        }
      ]
    }
  ],
  "hits": [
    {
      "displayableName": "Home",
      "path": "/sites/digitall/home"
    },
    {
      "displayableName": "Press Releases Entry",
      "path": "/sites/digitall/home/investors/press-releases-entry"
    }
  ]
}
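As a rough illustration of how a client might consume this response, the following Python sketch flattens the facets array into a value-to-count mapping. The function name facet_counts is hypothetical, and the sample payload is assumed to have exactly the shape shown above (wrapped in braces so that it parses as JSON on its own); a real response may carry additional fields.

```python
import json

# Sample payload copied from the response above, wrapped in braces so that it
# is valid JSON by itself. The hits are omitted for brevity.
SAMPLE_RESPONSE = """
{
  "search": {
    "facets": [
      {
        "data": [
          {"count": 2, "value": "Categories"},
          {"count": 1, "value": "Annual Filings"},
          {"count": 1, "value": "Companies"},
          {"count": 1, "value": "Goods"}
        ]
      }
    ]
  }
}
"""


def facet_counts(search_result: dict) -> dict:
    """Map each facet value to its document count (hypothetical helper)."""
    counts = {}
    for facet in search_result.get("facets", []):
        for entry in facet.get("data", []):
            counts[entry["value"]] = entry["count"]
    return counts


if __name__ == "__main__":
    result = json.loads(SAMPLE_RESPONSE)["search"]
    for value, count in facet_counts(result).items():
        print(f"{value}: {count}")
```

A mapping like this is convenient for rendering a facet list with counts next to each category title.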
Can I add facets on custom properties? Or are they added automatically?
Yes, indexed custom properties are automatically indexed as text or keyword, enabling you to use them for facets.
To use a property as a facet:
- Modify the augmented search configuration by adding the definition types you want to map and index.
  - In Jahia Tools, navigate to Administration and Guidance>OSGi console.
  - Select OSGi>Configuration and edit values for the org.jahia.modules.augmentedsearch module.
  - Edit the org.jahia.modules.augmentedsearch.content.indexedMainResourceTypes and org.jahia.modules.augmentedsearch.content.mappedNodeTypes properties, for example to add the jacademix:document definition type.
- In Jahia, reindex your data.
  - In Administration, select Configuration>Augmented search management.
  - Click Index the content in the main window. Then click Save.
Now your data can be used in your queries. The following example shows a query that searches the jacademix:document mixin as a main resource and uses the mapped author property for facets.
{
  jcr {
    searches(siteKey: "academy", language: "en", workspace: LIVE) {
      search(q: "cluster", limit: 20, offset: 0,
        filter: {nodeType: {type: "jacademix:document"}},
        facets: {
          term: [
            {field: "author.keyword", minDocCount: 1}
          ]
        }) {
        totalHits
        took
        facets {
          field
          type
          data {
            ... on TermValue {
              count
              value
            }
          }
        }
        hits {
          id
          link
          displayableName
          excerpt
          score
          lastModified
          lastModifiedBy
          createdBy
          created
        }
      }
    }
  }
}
Can I customize Elasticsearch settings and mappings?
Yes. To customize Elasticsearch settings and mappings:
- Copy the embedded files from the augmented-search module. Copy the mapping.json and settings.json files from META-INF/configurations to a location where your Jahia instance can reference them.
- Update the configuration file to reflect the new paths to the files. There is one property for the settings and one for the mapping. Each property can be specified for both content and files, which gives the following four properties.
org.jahia.modules.augmentedsearch.content.settingsFileLocation
org.jahia.modules.augmentedsearch.file.settingsFileLocation
org.jahia.modules.augmentedsearch.content.mappingFileLocation
org.jahia.modules.augmentedsearch.file.mappingFileLocation
# Example:
# org.jahia.modules.augmentedsearch.content.settingsFileLocation = /opt/jahia/elasticsearch/settings.json
First, copy the embedded configuration files as described above. Once you have copied the JSON files, edit the settings.json file and locate the tokenizer definition at the end of the file.
"tokenizer": {
  ...
  "main_tokenizer": {
    "type": "edge_ngram",
    "min_gram": 1,
    "max_gram": 12,
    "token_chars": [
      "letter",
      "digit"
    ]
  },
  "metadata_tokenizer": {
    "type": "edge_ngram",
    "min_gram": 1,
    "max_gram": 12,
    "token_chars": [
      "letter",
      "digit"
    ]
  }
}
Here you can tune the min_gram and max_gram properties.
- min_gram
Specifies when “instant search” applies to searches that your users perform. A value of 1 means that users get results on the first keystroke. A value of 3 means that results display once they have typed at least 3 characters.
- max_gram
Determines the maximum length of the groups of letters, by default up to 12 letters. This value depends on your dataset, the complexity of your vocabulary, and the languages you are going to index. For example, some languages like German tend to compound words together.
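To make the effect of these two properties concrete, here is a minimal Python sketch of the tokens an edge n-gram tokenizer emits for a single word. It is a simplified illustration, not the Elasticsearch implementation: the function name edge_ngrams is hypothetical, and the token_chars handling (splitting the text on characters that are neither letters nor digits) is left out.

```python
def edge_ngrams(word: str, min_gram: int = 1, max_gram: int = 12) -> list:
    """Return the prefixes of `word` whose length lies between min_gram and max_gram."""
    longest = min(len(word), max_gram)
    return [word[:size] for size in range(min_gram, longest + 1)]


if __name__ == "__main__":
    # Default settings from settings.json: every prefix is indexed.
    print(edge_ngrams("wolf"))
    # min_gram=3: prefixes shorter than three characters are not indexed.
    print(edge_ngrams("wolf", min_gram=3))
    # max_gram=5: prefixes longer than five characters are not indexed.
    print(edge_ngrams("elasticsearch", max_gram=5))
```

With min_gram set to 3, no token shorter than three characters exists in this subfield, which is why a one- or two-letter query would not match through it.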
Boost settings are applied by default to the jgql:main, jgql:metadata, and jgql:content fields.
#
# Boost settings for fields: jgql:main, jgql:metadata and jgql:content
#
org.jahia.modules.augmentedsearch.field.main.boost = 2.0
org.jahia.modules.augmentedsearch.field.metadata.boost = 1.5
org.jahia.modules.augmentedsearch.field.content.boost = 1.5
How is the fuzzy match configured?
By default, fuzzy matching starts at the 4th character. It can also permute one letter, starting at the 3rd character. The first 2 letters must match exactly.
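The rule above can be sketched in Python as follows. This is one possible reading of the description, not the actual Elasticsearch implementation: it assumes that fuzziness only applies to search terms of at least 4 characters, that the first 2 characters must match exactly, and that at most one edit (an insertion, deletion, substitution, or swap of two adjacent letters) is tolerated in the rest of the word. The function names and default values are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance that counts a swap of two adjacent letters as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent letters (for example "ai" <-> "ia").
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_match(query: str, term: str, prefix_length: int = 2,
                min_length: int = 4, max_edits: int = 1) -> bool:
    """Return True if `query` matches `term` under the assumed fuzzy rule."""
    if query == term:
        return True
    if len(query) < min_length:
        return False  # assumed: short terms must match exactly
    if query[:prefix_length] != term[:prefix_length]:
        return False  # the first letters must be exact
    return osa_distance(query[prefix_length:], term[prefix_length:]) <= max_edits


if __name__ == "__main__":
    for typed in ("jahai", "serch", "ajhia", "cat"):
        target = {"jahai": "jahia", "serch": "search", "ajhia": "jahia", "cat": "car"}[typed]
        print(typed, "->", target, fuzzy_match(typed, target))
```

Under these assumptions, a typo inside the word is tolerated while a typo in the first two letters is not.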
How is content indexed in augmented search?
All content is split into 3 fields:
- Main
Indexes the displayable name of the content, usually the title or, for rich text, the first 128 characters. By default, the weight = 2.
- Metadata
Indexes the categories, tags, and keywords that are set on each content item. By default, the weight = 1.
- Content
Aggregates all full-text properties into one field to provide an efficient full-text search. By default, the weight = 1.
Each of these fields is analyzed and stored in the following subfields to provide the best search relevance out of the box:
- Stemming
Takes the searched term and tries to match it against the stem (for example, developer > develop). This subfield applies to all words in your searched term.
- Ngram
Edge Ngram analyzes each word and emits a token for each group of letters within the defined limit (1-10) (for example, wolf -> [w, wo, wol, wolf]). This subfield is mainly used when the visitor starts typing words.
- Phrase
Matches the searched terms against the indexed content. If the searched terms have a match with the indexed content, then the order of the words has an impact.
- Exact match
Checks the exact match between the searched term and the indexed content. Exact match has a lot of weight.
What parameters can I set in the definitions.cnd file to modify the augmented search indexation (for example, nofulltext, indexed=no, boost, analyzer=keyword)?
The query uses the main, content, and metadata fields, which do not take boost or analyzer settings into account. Properties that are not indexable (indexed=no) are not indexed. Properties that are not full text are not copied into the content field and are not part of the search query, but they can be used for filtering or faceting.
What are the legacy Jahia search-related components that continue to work in an Augmented Search setup (for example, glossary or pager)? If some are not working anymore what are the alternative ones?
No, legacy Jahia search components do not continue to work. The Augmented Search UI component uses only the Search UI library from Elasticsearch. See the Elasticsearch documentation for the components that you can use in your search application.
- SearchBox
- Results
- Result
- ResultsPerPage
- Facet
- Sorting
- Paging
- PagingInfo
- ErrorBoundary
- Search results
Does the default behavior display all matching results or can I set a default limit?
The default limit is 10 results if nothing is specified in the GraphQL query.
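To request more than the default number of results, or to page through them, you can pass limit and offset explicitly, as in the jacademix:document query earlier in this FAQ. The following Python sketch builds such a query string. The helper name build_search_query is hypothetical, its default of 10 simply mirrors the default stated above, and the field names are taken from the examples in this FAQ rather than from the full schema.

```python
import json


def build_search_query(site_key: str, language: str, text: str,
                       limit: int = 10, offset: int = 0) -> str:
    """Build a search query with an explicit limit and offset (hypothetical helper)."""
    # json.dumps yields a double-quoted, escaped literal, which is also a
    # valid GraphQL string for ordinary text.
    return (
        "{\n"
        "  jcr {\n"
        f"    searches(siteKey: {json.dumps(site_key)}, "
        f"language: {json.dumps(language)}, workspace: LIVE) {{\n"
        f"      search(q: {json.dumps(text)}, limit: {limit}, offset: {offset}) {{\n"
        "        totalHits\n"
        "        hits {\n"
        "          id\n"
        "          displayableName\n"
        "        }\n"
        "      }\n"
        "    }\n"
        "  }\n"
        "}"
    )


if __name__ == "__main__":
    # Second page of 20 results for the "digitall" site.
    print(build_search_query("digitall", "en", "alternative", limit=20, offset=20))
```

Comparing totalHits with offset + limit tells a client whether another page remains.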
Can I have one site running with JCR search and another one with augmented search?
Yes. Augmented search is not based on a search provider, so the JCR search is still available. You can add the Augmented Search UI on one site and not on another.
How can I index content from the External Data Provider?
You can use the event API to index content from the External Data Provider. For more information, see Sending events to Jahia.
Is it possible to have search results including contents coming from the External Data Provider and from the JCR?
Yes, it is possible. The search results are mixed, as if they came from the same content source (unlike the current JCR search, where the JCR results are displayed before the EDP results).
Related links
- About augmented search
Get an overview of using augmented search and customizing the search experience
- Augmented search overview and architecture
Find out about search modules, architecture, and how content is indexed
- Installing and configuring augmented search
Learn how to install augmented search and use Elasticsearch to index and search content in your sites
- Using the Augmented Search GraphQL API
Find out how to set up and use Jahia’s Augmented Search GraphQL API