Using facets

November 14, 2023

Facets (also called aggregations) build a particular view over a given dataset. For example, you can see how many documents were created by each user, how many documents fall in a given creation date range, or what the highest value of a field is across a set of documents.

Augmented Search supports the following facets:

  • termFacet
    Builds an aggregation based on a provided field
  • treeFacet
    Similar to a term aggregation, but introduces the notion of a tree (or hierarchy) based on categories
  • rangeFacet
    Counts the number of documents within set ranges
  • numberRange
    Provides min and max values for a given field

termFacet

The termFacet aggregates documents for each unique value of a specified field. For example, the query below returns the number of documents per author.

query {
  search(q: "jahia") {
    termFacet(field: "jgql:createdBy") {
      data {
        count
        value
      }
    }
  }
}

Disjunctive facets

When letting visitors explore a dataset, you often need disjunctive facets. A disjunctive facet builds its aggregation while ignoring any filter already applied on the facet's own field.

In other words, if your dataset is about movies, you would probably want a "genre" facet:

  • If your facet is conjunctive, a visitor clicking on "Comedy" will see results filtered on that particular genre, but will not be able to select another genre in the facet.
  • If your facet is disjunctive, a visitor clicking on "Comedy" will see results filtered on that particular genre, as well as how many documents match other genres, such as "Horror". This makes it possible to build user interfaces allowing queries such as "movie genre is Comedy or Horror".
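
The difference between the two behaviors can be sketched outside of Augmented Search with a small in-memory dataset. The snippet below is only an illustration in Python: the MOVIES list, the facet_counts function and its arguments are hypothetical and are not part of the Augmented Search API.

```python
from collections import Counter

# Hypothetical in-memory dataset standing in for indexed documents.
MOVIES = [
    {"title": "A", "genre": "Comedy", "year": 2019},
    {"title": "B", "genre": "Comedy", "year": 2020},
    {"title": "C", "genre": "Horror", "year": 2020},
    {"title": "D", "genre": "Drama", "year": 2020},
]


def facet_counts(docs, filters, facet_field, disjunctive):
    """Count documents per value of facet_field.

    filters maps a field name to the set of values selected by the visitor.
    A disjunctive facet ignores the filter on its own field when counting,
    while still honouring the filters set on every other field.
    """
    counts = Counter()
    for doc in docs:
        matches = all(
            doc[field] in selected
            for field, selected in filters.items()
            if not (disjunctive and field == facet_field)
        )
        if matches:
            counts[doc[facet_field]] += 1
    return dict(counts)


# The visitor has selected the "Comedy" genre and the year 2020.
selected = {"genre": {"Comedy"}, "year": {2020}}
conjunctive = facet_counts(MOVIES, selected, "genre", disjunctive=False)
disjunctive = facet_counts(MOVIES, selected, "genre", disjunctive=True)
print(conjunctive)
print(disjunctive)
```

With the conjunctive facet, only "Comedy" is left to choose from. With the disjunctive facet, "Horror" and "Drama" remain visible with their counts for 2020, so the visitor can add them to the selection.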

minDocCount

An aggregation can sometimes return many values that each match only a few documents, while your users are only interested in the values matching the most documents. When you specify the minDocCount parameter, only values matching at least the specified number of documents are returned. Note that prior to version 3.0, the default value for minDocCount was 5, compared to 1 in newer versions.

missingValue

In some cases, the aggregated field might not have a value, but the absence of a value is still a relevant piece of information. In such a case, you can specify a missing value to return when there is no value in the dataset.

For example, a movie database might have a field detailing the primary country in which the movie is taking place. This field would be empty for movies taking place in space. The resulting query would look like this:


query {
  search(q: "brad pitt") {
    termFacet(field: "movieCountry", missingValue: "space") {
      data {
        count
        value
      }
    }
  }
}
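
To make the effect of minDocCount and missingValue more concrete, here is a minimal Python sketch of a term aggregation over a list of dictionaries. It is not how Augmented Search is implemented: the term_facet function, its min_doc_count and missing_value arguments, and the sample documents are all hypothetical, and the ordering of the buckets is only illustrative.

```python
from collections import Counter


def term_facet(docs, field, min_doc_count=1, missing_value=None):
    """Return [{"value": ..., "count": ...}] buckets for one field.

    Documents without a value are skipped unless missing_value is given,
    in which case they are counted under that placeholder value.
    Buckets with fewer than min_doc_count documents are dropped.
    """
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            if missing_value is None:
                continue
            value = missing_value
        counts[value] += 1
    return [
        {"value": value, "count": count}
        for value, count in counts.most_common()
        if count >= min_doc_count
    ]


# Hypothetical documents; the last one has no country (it takes place in space).
movies = [
    {"title": "A", "movieCountry": "USA"},
    {"title": "B", "movieCountry": "USA"},
    {"title": "C", "movieCountry": "France"},
    {"title": "D"},
]

with_missing = term_facet(movies, "movieCountry", missing_value="space")
frequent_only = term_facet(movies, "movieCountry", min_doc_count=2)
```

Here with_missing contains a "space" bucket for the movie without a country, while frequent_only keeps only the values matched by at least two documents.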

treeFacet

The treeFacet node provides a feature very similar to termFacet, but supports hierarchical use cases. For example, you can build a hierarchy such as Continent>Country>Region>City and return the number of documents matching each of these elements. You can do this by providing the rootPath parameter.

Each aggregation then contains a hasChildren field to indicate whether the value under rootPath contains children.
 


{
  search(q: "") {
    treeFacet(field: "jgql:categories_path.facet", rootPath: "/music") {
      data {
        count
        value
        hasChildren
        rootPath
      }
      field
      rootPath
    }
  }
}

This query would return the following payload:

{
  "data": {
    "search": {
      "treeFacet": {
        "data": [
          {
            "count": 2,
            "value": "classical",
            "hasChildren": false,
            "rootPath": "/music/classical"
          },
          {
            "count": 2,
            "value": "electronic",
            "hasChildren": true,
            "rootPath": "/music/electronic"
          },
          {
            "count": 1,
            "value": "blues",
            "hasChildren": true,
            "rootPath": "/music/blues"
          }
        ],
        "field": "jgql:categories_path.facet",
        "rootPath": "/music"
      }
    }
  }
}

In the above payload, notice that both "electronic" and "blues" have children. You could therefore run the query again, passing "/music/electronic" as rootPath to fetch all elements under "electronic".
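
Repeating the query level by level lends itself to a small recursive helper. The Python sketch below assumes a hypothetical fetch_children callable that runs the treeFacet query for a given rootPath and returns the list found under "data" in the response. It is stubbed here with a dictionary so the example runs on its own; the "/music" level reuses part of the payload above, and the deeper level is invented for illustration.

```python
def walk_tree(fetch_children, root_path):
    """Yield (path, count) for every node found below root_path.

    fetch_children is a hypothetical callable that runs the treeFacet query
    for a given rootPath and returns the list found under "data" in the
    response.
    """
    for node in fetch_children(root_path):
        yield node["rootPath"], node["count"]
        if node["hasChildren"]:
            # Query one level deeper only when the server reports children.
            yield from walk_tree(fetch_children, node["rootPath"])


# Stub standing in for the GraphQL call.
FAKE_INDEX = {
    "/music": [
        {"count": 2, "value": "classical", "hasChildren": False,
         "rootPath": "/music/classical"},
        {"count": 2, "value": "electronic", "hasChildren": True,
         "rootPath": "/music/electronic"},
    ],
    # Invented child level, for illustration only.
    "/music/electronic": [
        {"count": 1, "value": "techno", "hasChildren": False,
         "rootPath": "/music/electronic/techno"},
    ],
}

paths = list(walk_tree(lambda path: FAKE_INDEX.get(path, []), "/music"))
```

Because hasChildren is checked before each recursive call, leaf nodes such as "classical" do not trigger an extra query.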

rangeFacet

The rangeFacet creates data buckets based on user-specified date or number ranges. Support for date math expressions allows you to create buckets based either on fixed dates or on dates computed relative to the current date.

The query below is an example mixing both fixed dates and date math expressions (although such a query has limited practical use).


query {
  search(q: "") {
    created: rangeFacet(
      field: "jcr:created"
      ranges: [
        { from: "2015-01-01", to: "2015-12-31", name: "2015" }
        { from: "2016-01-01", to: "2016-12-31", name: "2016" }
        { from: "now-1y", to: "now", name: "Last year" }
      ]
    ) {
      data {
        name
        count
      }
    }
  }
}


This will produce the following result:


{
  "data": {
    "search": {
      "created": {
        "data": [
          {
            "name": "2015",
            "count": 0
          },
          {
            "name": "2016",
            "count": 137
          },
          {
            "name": "Last year",
            "count": 7
          }
        ]
      }
    }
  }
}
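
Each range is counted independently, so a document can land in more than one bucket when ranges overlap, as "2016" and "Last year" may do above. The Python sketch below illustrates this with in-memory dates. The range_facet function and the sample data are hypothetical; treating "from" as inclusive and "to" as exclusive is an assumption of this sketch, and the "now-1y" expression is approximated by subtracting 365 days from a fixed date.

```python
from datetime import datetime, timedelta


def range_facet(dates, ranges):
    """Count how many dates fall in each named range.

    Ranges are evaluated independently, so a date can be counted in more
    than one bucket when ranges overlap. Treating "from" as inclusive and
    "to" as exclusive is an assumption made for this sketch.
    """
    return [
        {
            "name": r["name"],
            "count": sum(1 for d in dates if r["from"] <= d < r["to"]),
        }
        for r in ranges
    ]


now = datetime(2017, 1, 15)  # fixed "now" so the example is reproducible
ranges = [
    {"name": "2015", "from": datetime(2015, 1, 1), "to": datetime(2016, 1, 1)},
    {"name": "2016", "from": datetime(2016, 1, 1), "to": datetime(2017, 1, 1)},
    # Rough stand-in for the "now-1y" date math expression.
    {"name": "Last year", "from": now - timedelta(days=365), "to": now},
]
created = [datetime(2016, 3, 1), datetime(2016, 12, 20), datetime(2017, 1, 10)]
buckets = range_facet(created, ranges)
```

In this sample, the two documents created in 2016 are counted both in the "2016" bucket and in the "Last year" bucket.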

numberRange

The numberRange facet, given a numerical field, provides access to max and min values for that field across the entire dataset.