The Server Availability Manager (SAM) module supports monitoring in complex containerized environments. SAM extends Jahia's GraphQL API and provides server monitoring, server availability, and maintenance operation functionality. Learn more below about how to use and extend Jahia APIs with complex containerized environments.
Server Availability Manager improves on and replaces the Healthcheck module and adds more server availability functionality. As with Healthcheck, SAM lets you develop new probes (it ships with five by default).
When migrating from Healthcheck to SAM, remember that the data object has been removed from the health probes. These data elements (such as system load, node list, and URLs) are now provided by dedicated nodes of the API rather than the health probe. SAM actually includes nodes dedicated to system monitoring, but those nodes are located outside of the probes.
The SAM module provides insights into your platform's health and triggers alerts when key components need particular attention. The module is available through GraphQL or REST, and you can query it at will with minimal impact on the platform load. You can also develop additional probes to provide more information to the monitoring systems.
Probes are categorized by severity (from LOW to CRITICAL) and report a health status (GREEN, YELLOW, or RED).
The query below fetches the status of all of the probes with severity LOW or above:
query {
    admin {
        jahia {
            healthCheck(severity: LOW) {  # Minimum severity to return
                status {                  # Highest reported status across all probes
                    health                # GREEN, YELLOW or RED
                    message               # Explanation for the health level
                }
                probes {
                    name                  # Name of the probe
                    status {              # Status reported by the probe
                        health            # GREEN, YELLOW or RED
                        message           # Explanation for the health level
                    }
                    severity              # Severity of the probe (LOW to CRITICAL)
                    description           # Description specified by the developer of the probe
                }
            }
        }
    }
}
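As a rough illustration of how a monitoring script might consume this response, the Python sketch below ranks the three health levels and lists the probes that are not GREEN. The sample response is invented for illustration (the probe names are hypothetical), and the ordering GREEN, YELLOW, RED from best to worst is an assumption based on the comments in the query above.

```python
# Assumed ordering of health levels, from best to worst.
HEALTH_RANK = {"GREEN": 0, "YELLOW": 1, "RED": 2}


def worst_health(probes):
    """Return the most severe health level among the given probes.

    Returns "GREEN" for an empty list (an assumption of this sketch).
    """
    worst = "GREEN"
    for probe in probes:
        health = probe["status"]["health"]
        if HEALTH_RANK[health] > HEALTH_RANK[worst]:
            worst = health
    return worst


def failing_probes(response):
    """List (name, health, message) for every probe that is not GREEN."""
    probes = response["data"]["admin"]["jahia"]["healthCheck"]["probes"]
    return [
        (p["name"], p["status"]["health"], p["status"]["message"])
        for p in probes
        if p["status"]["health"] != "GREEN"
    ]


# Hypothetical response shaped like the query above; probe names are made up.
SAMPLE_RESPONSE = {
    "data": {"admin": {"jahia": {"healthCheck": {
        "status": {"health": "YELLOW", "message": "example message"},
        "probes": [
            {"name": "ProbeA", "severity": "HIGH",
             "status": {"health": "GREEN", "message": "ok"}},
            {"name": "ProbeB", "severity": "LOW",
             "status": {"health": "YELLOW", "message": "example message"}},
        ],
    }}}}
}
```

A script built this way could alert only on the probes returned by failing_probes, rather than on the overall status alone.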
As mentioned in About monitoring, this module replaces the previous Healthcheck module.
Server Availability Manager ships by default with five probes that you can configure by editing the org.jahia.modules.sam.healthcheck.ProbesRegistry.cfg file.
[
    {
        "karafCommand": "bundle:start-level article 90"
    }
]
In the example above, the article module is given a start level of 90. To only check the current start level, run the same command without the level value at the end (that is, without 90). You can also send bundle:start-level --help for more information. Note that the results appear in the Tomcat console.
The module also contains a servlet providing REST (GET) capabilities to Jahia to help you transition from the Healthcheck module to the Server Availability Manager module. The servlet is accessible at https://[YOUR_JAHIA_HOST]/modules/healthcheck?severity=low, and a GET request to this URL returns the list of probes and their values.
You can further customize the configuration by editing the org.jahia.modules.sam.healthcheck.HealthCheckServlet.cfg file.
# Default severity level used when "?severity=LEVEL" is not provided
severity.default=MEDIUM
# Threshold above which an HTTP error code will be returned
status.threshold=RED
# Error code to be returned if above threshold
status.code=503
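The following Python sketch shows how a client might build the servlet URL and interpret the HTTP response code. It assumes the default configuration shown above (status.code=503). The host name used in the usage note is a placeholder, and treating every other HTTP status as "threshold not reached" is a simplification of this sketch, not documented servlet behavior.

```python
from urllib.parse import urlencode, urlunsplit

# Value of status.code in the configuration shown above.
DEFAULT_ERROR_CODE = 503


def healthcheck_url(host, severity=None):
    """Build the servlet URL; omit severity to rely on severity.default."""
    query = urlencode({"severity": severity}) if severity else ""
    return urlunsplit(("https", host, "/modules/healthcheck", query, ""))


def threshold_reached(http_status, error_code=DEFAULT_ERROR_CODE):
    """True when the servlet answered with the configured error code."""
    return http_status == error_code
```

For example, healthcheck_url("jahia.example.com", "low") produces a URL of the same form as the one given earlier, and a load balancer check could take a node out of rotation whenever threshold_reached returns True.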
You can develop your own probes in much the same way as with the Healthcheck module, taking inspiration from the existing probes available here on GitHub.
During its regular lifecycle, Jahia performs actions that shouldn't be interrupted by maintenance activities, such as server shutdown and database maintenance. By exposing such tasks, Jahia makes third-party platforms (or individuals) aware of when to avoid such interruptions.
The following query returns the list of critical tasks currently running on the server. The query returns the tasks running at the time the query was made. The Server Availability Manager does not keep a log of previously running tasks.
query {
    admin {
        jahia {
            tasks {
                service  # Name of the service holding the task
                name     # Name of the task that should not be interrupted
                started  # Datetime at which the task started (if available)
            }
        }
    }
}
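A maintenance script could use this query as a gate before interrupting the server. The Python sketch below is one possible way to do that, assuming the response follows the usual GraphQL shape with a top-level data object. The sample task reuses the service and name from the createTask example in this document, and the fallback text for a missing start time is a choice made for this sketch.

```python
def running_tasks(response):
    """Extract the task list from a response to the tasks query above."""
    return response["data"]["admin"]["jahia"]["tasks"] or []


def safe_to_interrupt(response):
    """True when no critical task was running at the time of the query."""
    return len(running_tasks(response)) == 0


def describe(task):
    """Format one task for a log line; started may be missing or null."""
    started = task.get("started") or "unknown start time"
    return f'{task["service"]}: {task["name"]} (started: {started})'


# Hypothetical responses used to exercise the helpers.
IDLE = {"data": {"admin": {"jahia": {"tasks": []}}}}
BUSY = {"data": {"admin": {"jahia": {"tasks": [
    {"service": "DevOps Team",
     "name": "Network maintenance on Core VPC",
     "started": None},
]}}}}
```

Because the module keeps no history of tasks, a script like this should run the query immediately before the maintenance action rather than rely on an earlier result.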
To provide backward compatibility with older versions of Jahia, the module implements two approaches for identifying running tasks:
The following tasks are monitored under the core service:
The module has a registry of running tasks that allows modules outside the Jahia default distribution to declare their own tasks. External services that need to prevent a server from being restarted can also use this registry.
To register long-running tasks, we use the Jahia FrameworkService and the TaskRegisterEventHandler of Server Availability Manager.
To register tasks from a Java module:
FrameworkService.sendEvent(UNREGISTER_EVENT, constructTaskDetailsEvent(workspace, WORKSPACE_INDEXATION), true);
The call above unregisters a task: UNREGISTER_EVENT represents the event topic org/jahia/modules/sam/TaskRegistryService/UNREGISTER, and constructTaskDetailsEvent creates a map with three event properties. An example is available in the ESService.java class. To register a task, use REGISTER instead of UNREGISTER.
You can use the GraphQL API to create and delete tasks when external services need to inform a server that it should not be restarted. This example shows how to use createTask. Using deleteTask with the same parameters deletes that particular task.
mutation {
    admin {
        jahia {
            createTask(service: "DevOps Team" name: "Network maintenance on Core VPC")
        }
    }
}
The registry is shared between GraphQL and Java modules, so you can create a task in a Java module and delete it using the GraphQL API.
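Since createTask and deleteTask take the same parameters, an external service can generate both mutations from one helper. The Python sketch below is a minimal illustration. The helper names are hypothetical, and it assumes that a JSON-escaped string is also acceptable as a GraphQL string literal, which holds for simple values such as the ones in this document.

```python
import json


def task_mutation(operation, service, name):
    """Build a createTask or deleteTask mutation with inline arguments."""
    if operation not in ("createTask", "deleteTask"):
        raise ValueError(f"unsupported operation: {operation}")
    # json.dumps quotes and escapes the values; this sketch assumes the
    # result is also a valid GraphQL string literal.
    args = f"service: {json.dumps(service)} name: {json.dumps(name)}"
    return f"mutation {{ admin {{ jahia {{ {operation}({args}) }} }} }}"


def request_body(query):
    """Wrap a GraphQL document in the usual JSON POST body."""
    return json.dumps({"query": query})
```

Keeping the service and name in one place helps guarantee that the deleteTask call matches the task that was created earlier.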
Jahia also supports a set of additional operations associated with monitoring.
Working jointly with the tasks registry, a shutdown service is exposed via the GraphQL API. This API node should be used with care as it shuts down the Jahia server.
This query is provided as an example. Don’t use timeout and force together because force triggers an immediate shutdown without considering the timeout value.
mutation {
    admin {
        jahia {
            shutdown(
                # When dryRun is provided, the server will not be shut down but
                # still return the expected API response (true or false)
                dryRun: true,
                timeout: 25,  # In seconds, maximum time to wait for server to be ready (empty list of tasks) to shutdown
                force: true   # Force immediate shutdown even if tasks are running
            )
        }
    }
}
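To avoid sending timeout and force together by accident, a client can validate the arguments before building the mutation. The Python sketch below is one way to do this. The function name is hypothetical, defaulting dryRun to true is a safety choice of this sketch, and it assumes that dryRun: false is accepted by the API.

```python
def shutdown_mutation(dry_run=True, timeout=None, force=False):
    """Build a shutdown mutation, rejecting timeout combined with force."""
    if force and timeout is not None:
        # force triggers an immediate shutdown, so timeout would be ignored.
        raise ValueError("force ignores timeout; pass only one of them")
    args = [f"dryRun: {'true' if dry_run else 'false'}"]
    if timeout is not None:
        args.append(f"timeout: {int(timeout)}")  # seconds
    if force:
        args.append("force: true")
    return f"mutation {{ admin {{ jahia {{ shutdown({', '.join(args)}) }} }} }}"
```

Calling shutdown_mutation(timeout=25) yields a dry run that waits up to 25 seconds for the task list to empty, while shutdown_mutation(timeout=25, force=True) raises an error instead of producing an ambiguous request.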
Load metrics are also available through the GraphQL API, as per the query below:
query {
    admin {
        jahia {
            load {
                requests {
                    count
                    # Interval can be ONE, FIVE, or FIFTEEN minutes
                    average(interval: ONE)
                }
                sessions {
                    count
                    average(interval: FIVE)
                }
            }
        }
    }
}
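A monitoring job might flatten this response and compare it against its own limits. The Python sketch below illustrates the idea. The sample numbers and the limits are invented, the key names in the summary are choices made for this sketch, and the response is assumed to follow the usual GraphQL shape with a top-level data object.

```python
def load_summary(response):
    """Flatten the response to the load query above into a dictionary."""
    load = response["data"]["admin"]["jahia"]["load"]
    return {
        "requests_count": load["requests"]["count"],
        "requests_avg_1m": load["requests"]["average"],   # interval: ONE
        "sessions_count": load["sessions"]["count"],
        "sessions_avg_5m": load["sessions"]["average"],   # interval: FIVE
    }


def over_limits(summary, limits):
    """Return the sorted metric names whose value exceeds its limit."""
    return sorted(k for k, limit in limits.items() if summary[k] > limit)


# Hypothetical response and limits, for illustration only.
SAMPLE_LOAD = {"data": {"admin": {"jahia": {"load": {
    "requests": {"count": 12, "average": 8.5},
    "sessions": {"count": 40, "average": 35.0},
}}}}}
SAMPLE_LIMITS = {"requests_avg_1m": 5.0, "sessions_avg_5m": 100.0}
```

With the sample values, only the one-minute request average exceeds its limit, so that is the only metric a job built this way would report.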