Using facets
Facets (also called aggregations) build a summarized view over a dataset. For example, you can see how many documents were created by each user, see how many documents fall in a given creation date range, or get the highest value in a set of documents.
Augmented Search supports the following facets:
- termFacet: Builds an aggregation based on a provided field
- treeFacet: Similar to a term aggregation but introduces the notion of tree (or hierarchy) based on categories
- rangeFacet: Counts the number of documents within set ranges
- numberRange: Provides min and max values for a set field
termFacet
The termFacet aggregates documents for each unique value of a specified field. For example, the query below returns the number of documents per author.
query {
  search(q: "jahia") {
    termFacet(field: "jgql:createdBy") {
      data {
        count
        value
      }
    }
  }
}
Disjunctive facets
When allowing visitors to explore a dataset, you often need to enable disjunctive facets, which build an aggregation excluding any existing filters on the facet's current field.
For example, if your dataset is about movies, you would probably want a "genre" facet:
- If your facet is conjunctive, a visitor clicking on "Comedy" will see results filtered on that particular genre, but will not be able to select another genre in the facet.
- If your facet is disjunctive, a visitor clicking on "Comedy" will see results filtered on that particular genre and also how many documents match other genres, such as "Horror". This makes it possible to build user interfaces allowing queries such as "movie genre is Comedy or Horror".
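The difference between the two modes can be illustrated with a minimal Python sketch. It is not how Augmented Search computes facets (that happens server-side); the `MOVIES` dataset and the `facet_counts` helper are hypothetical and exist only to show which filters are applied when counting.

```python
from collections import Counter

# Hypothetical in-memory dataset, used only for illustration.
MOVIES = [
    {"title": "A", "genre": "Comedy", "year": 2020},
    {"title": "B", "genre": "Comedy", "year": 2021},
    {"title": "C", "genre": "Horror", "year": 2020},
    {"title": "D", "genre": "Drama", "year": 2021},
]


def facet_counts(docs, field, filters, disjunctive):
    """Count documents per value of `field`.

    `filters` maps a field name to the set of selected values.
    A disjunctive facet ignores the filter on its own field when counting,
    but still applies the filters set on every other field.
    """
    active = {
        name: values
        for name, values in filters.items()
        if not (disjunctive and name == field)
    }
    matching = [
        doc for doc in docs
        if all(doc[name] in values for name, values in active.items())
    ]
    return dict(Counter(doc[field] for doc in matching))


selected = {"genre": {"Comedy"}}
conjunctive = facet_counts(MOVIES, "genre", selected, disjunctive=False)
disjunctive = facet_counts(MOVIES, "genre", selected, disjunctive=True)
print(conjunctive)  # {'Comedy': 2}
print(disjunctive)  # {'Comedy': 2, 'Horror': 1, 'Drama': 1}
```

With the conjunctive facet, only "Comedy" remains selectable. With the disjunctive facet, the other genres keep their counts, so the interface can offer "Horror" as an additional choice.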
minDocCount
An aggregation can sometimes generate many results that each contain only a few documents, while your users are only interested in the results with the most documents. Specifying the minDocCount parameter returns only results containing at least the specified number of documents. Note that prior to version 3.0, the default value for minDocCount was 5, compared to 1 in newer versions.
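As a rough illustration of the effect of minDocCount, the Python sketch below applies the same threshold to a list of facet results on the client side. The `apply_min_doc_count` helper and the sample values are hypothetical; in practice the filtering is done by the search backend.

```python
def apply_min_doc_count(buckets, min_doc_count=1):
    """Keep only facet results holding at least `min_doc_count` documents."""
    return [bucket for bucket in buckets if bucket["count"] >= min_doc_count]


# Hypothetical termFacet results (same shape as the `data` field above).
buckets = [
    {"value": "alice", "count": 42},
    {"value": "bob", "count": 3},
    {"value": "carol", "count": 1},
]

frequent = apply_min_doc_count(buckets, min_doc_count=3)
print([bucket["value"] for bucket in frequent])  # ['alice', 'bob']
```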
missingValue
In some cases, the aggregated field might not have a value, but the absence of a value is still a relevant piece of information. In such a case, you can specify a missing value to return when there is no value in the dataset.
For example, a movie database might have a field detailing the primary country in which the movie takes place. This field would be empty for movies taking place in space. The resulting query would look like this:
query {
  search(q: "brad pitt") {
    termFacet(field: "movieCountry", missingValue: "space") {
      data {
        count
        value
      }
    }
  }
}
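The intended effect of missingValue can be sketched in Python as follows. The `term_facet` helper and the sample documents are hypothetical, and the sketch assumes that documents without a value are simply left out of the aggregation when no missing value is provided.

```python
from collections import Counter


def term_facet(docs, field, missing_value=None):
    """Count documents per value of `field`.

    Documents lacking the field are grouped under `missing_value` when one
    is given; otherwise they are skipped (an assumption of this sketch).
    """
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            if missing_value is None:
                continue
            value = missing_value
        counts[value] += 1
    return dict(counts)


# Hypothetical documents; the second one has no "movieCountry" value.
movies = [
    {"title": "Movie 1", "movieCountry": "USA"},
    {"title": "Movie 2"},
    {"title": "Movie 3", "movieCountry": "USA"},
]

print(term_facet(movies, "movieCountry"))                         # {'USA': 2}
print(term_facet(movies, "movieCountry", missing_value="space"))  # {'USA': 2, 'space': 1}
```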
treeFacet
The treeFacet node provides a feature very similar to termFacet but supports hierarchical use cases, for example, building a hierarchy such as Continent > Country > Region > City and returning the number of documents matching each of these elements. You can do this by providing the rootPath parameter.
Each aggregation then contains a hasChildren field to indicate whether the value under rootPath contains children.
{
  search(q: "") {
    treeFacet(field: "jgql:categories_path.facet", rootPath: "/music") {
      data {
        count
        value
        hasChildren
        rootPath
      }
      field
      rootPath
    }
  }
}
This query would return the following payload:
{
  "data": {
    "search": {
      "treeFacet": {
        "data": [
          {
            "count": 2,
            "value": "classical",
            "hasChildren": false,
            "rootPath": "/music/classical"
          },
          {
            "count": 2,
            "value": "electronic",
            "hasChildren": true,
            "rootPath": "/music/electronic"
          },
          {
            "count": 1,
            "value": "blues",
            "hasChildren": true,
            "rootPath": "/music/blues"
          }
        ],
        "field": "jgql:categories_path.facet",
        "rootPath": "/music"
      }
    }
  }
}
In the above payload, notice that both "electronic" and "blues" have children. You could therefore run the query again, passing "/music/electronic" as rootPath to fetch all elements under "electronic".
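The drill-down step can be sketched in Python as shown below. The `expandable_paths` and `tree_facet_query` helpers are hypothetical names introduced for this example; the payload is an abbreviated copy of the one above, and sending the query to your GraphQL endpoint is left out.

```python
import json


def expandable_paths(payload):
    """Return the rootPath of every treeFacet entry that has children."""
    entries = payload["data"]["search"]["treeFacet"]["data"]
    return [entry["rootPath"] for entry in entries if entry["hasChildren"]]


def tree_facet_query(root_path, field="jgql:categories_path.facet"):
    """Build the follow-up query text for one level of the hierarchy.

    json.dumps is used to quote the string arguments.
    """
    return (
        '{ search(q: "") { treeFacet(field: %s, rootPath: %s) '
        "{ data { count value hasChildren rootPath } } } }"
        % (json.dumps(field), json.dumps(root_path))
    )


# Abbreviated copy of the payload shown above.
payload = {
    "data": {"search": {"treeFacet": {"data": [
        {"count": 2, "value": "classical", "hasChildren": False,
         "rootPath": "/music/classical"},
        {"count": 2, "value": "electronic", "hasChildren": True,
         "rootPath": "/music/electronic"},
        {"count": 1, "value": "blues", "hasChildren": True,
         "rootPath": "/music/blues"},
    ]}}}
}

next_paths = expandable_paths(payload)
print(next_paths)  # ['/music/electronic', '/music/blues']
for path in next_paths:
    print(tree_facet_query(path))
```

A user interface would typically run the follow-up query only when a visitor expands a node, rather than fetching the whole tree up front.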
rangeFacet
The rangeFacet creates data buckets based on user-specified date or number ranges. Support for date math expressions allows you to create buckets based on fixed dates or on dates computed automatically from the current date.
The query below is an example mixing both fixed dates and date math expressions (although such a query has limited practical use).
query {
  search(q: "") {
    created: rangeFacet(
      field: "jcr:created"
      ranges: [
        { from: "2015-01-01", to: "2015-12-31", name: "2015" }
        { from: "2016-01-01", to: "2016-12-31", name: "2016" }
        { from: "now-1y", to: "now", name: "Last year" }
      ]
    ) {
      data {
        name
        count
      }
    }
  }
}
This will produce the following result:
{
  "data": {
    "search": {
      "created": {
        "data": [
          {
            "name": "2015",
            "count": 0
          },
          {
            "name": "2016",
            "count": 137
          },
          {
            "name": "Last year",
            "count": 7
          }
        ]
      }
    }
  }
}
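Note that the result is returned under the alias given in the query ("created") rather than under "rangeFacet". As a small sketch of how a client might consume this result, the Python snippet below turns the buckets into a name-to-count lookup; the `response_text` variable holds a copy of the payload above and stands in for the body returned by your GraphQL endpoint.

```python
import json

# Copy of the payload shown above.
response_text = """
{
  "data": {
    "search": {
      "created": {
        "data": [
          {"name": "2015", "count": 0},
          {"name": "2016", "count": 137},
          {"name": "Last year", "count": 7}
        ]
      }
    }
  }
}
"""

response = json.loads(response_text)

# The facet is read from the alias used in the query ("created").
buckets = response["data"]["search"]["created"]["data"]
counts = {bucket["name"]: bucket["count"] for bucket in buckets}
print(counts)  # {'2015': 0, '2016': 137, 'Last year': 7}
```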
numberRange
The numberRange facet, given a numerical field, provides access to the max and min values for that field across the entire dataset.