Audit trails

November 14, 2023

Introduction

Note that the Jahia Audit Trails modules are currently in beta and are therefore not meant to be used in a production environment. This document explains the Jahia Audit Trails features and their current limitations.

Tracking operations performed on contents can be useful in many ways: such a history can help troubleshoot issues and provide reports and statistics about your site contents. As needs regarding this information differ from one project to another, the storage location (file system, database, etc.) is a key element.

The Jahia audit trail collection of modules makes it possible to store data about events/actions performed on Jahia contents in any external system, by developing the corresponding consumer module. This will allow you to keep records of the operations performed on your contents where you want, in the appropriate format.

These modules are currently available in a beta version.

In this initial version, we only keep records of operations performed at a low level (JCR events), meaning that we do not store information on “how” the operations were performed (for instance, we do not distinguish between a cut/paste and a move by drag-n-drop, as these operations are the same at JCR level). Our modular approach will allow us to collect other types of events in the future (e.g. UI level events). The modules generating events are called “collectors”.
Once these events have been generated, they are made available to “consumer” modules, which can then store them in different systems and in different formats. Jahia provides two consumer modules:

  • one to log the events (so they can be stored on a file system)
  • one to store these events in an elasticsearch server: this will allow us to query the events and provide useful information on the content histories
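The collector/consumer split described above can be pictured with a small in-memory sketch. This is not the Jahia implementation: the AuditBus class and its subscribe/publish methods are hypothetical names used for illustration only, and the "/*" wildcard convention is an assumption. Only the topic name and the event properties are taken from the sample entries shown later in this document.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
Consumer = Callable[[str, Event], None]


class AuditBus:
    """Toy stand-in for the channel between collectors and consumers (hypothetical)."""

    def __init__(self) -> None:
        self._consumers: List[Tuple[str, Consumer]] = []

    def subscribe(self, topic_pattern: str, consumer: Consumer) -> None:
        # A pattern is either an exact topic or a prefix ending in "/*" (assumed convention).
        self._consumers.append((topic_pattern, consumer))

    def publish(self, topic: str, event: Event) -> None:
        # Deliver the event to every consumer whose pattern matches the topic.
        for pattern, consumer in self._consumers:
            if self._matches(pattern, topic):
                consumer(topic, event)

    @staticmethod
    def _matches(pattern: str, topic: str) -> bool:
        if pattern.endswith("/*"):
            return topic.startswith(pattern[:-1])
        return pattern == topic


# Two consumers: one "logs" every audit event, one "stores" only creations.
logged: List[Tuple[str, object]] = []
stored: List[Event] = []

bus = AuditBus()
bus.subscribe("org/jahia/dx/audit/*", lambda topic, event: logged.append((topic, event["nodePath"])))
bus.subscribe("org/jahia/dx/audit/jcr/created", lambda topic, event: stored.append(event))

# A collector would publish one event per JCR operation.
bus.publish(
    "org/jahia/dx/audit/jcr/created",
    {"nodePath": "/sites/digitall/home/test", "userId": "root"},
)
```

The point of the sketch is that a new storage target only requires a new consumer; the collector side does not change.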

This is a system level feature, meaning that you cannot select on which sites you want to enable it. These modules will generate events for all the sites and users.

The list of tracked events is available here. The currently known limitations are described below.

We value your opinion on this topic. Please use the feedback form at the end of the page to share your comments and expectations.

Installation

System requirements for the beta version

  • DX 7.3.0.0+ or 7.2.1.1+ running with:
    • Tomcat or Websphere 8.5
      • JBoss will be supported in a future release
    • JDK 8
  • Elasticsearch 5.5.0 (provided in the beta package)

Package download

Please download the audit-trail-modules beta package. It contains all the modules, in a beta version, mentioned in the following installation procedure, as well as a zip of an elasticsearch server.

Required modules

In order to use the audit trail modules, you need to deploy the following modules in Jahia:

  • audit-trail-service-1.0.0-SNAPSHOT.jar
  • audit-trail-collector-jcr-1.0.0-SNAPSHOT.jar

These modules are responsible for collecting JCR events and sharing them with consumer modules.

Logger

You need to deploy the module audit-trail-consumer-log-1.0.0-SNAPSHOT.jar in your Jahia server in order to log the events in the jahia.log file.

You can configure a new log file, dedicated to the storage of these events, by adding the following configuration to the WEB-INF/etc/config/log4j.xml file (this requires a server restart):

<appender name="AuditTrails" class="org.apache.log4j.DailyRollingFileAppender">
        <param name="File" value="${jahia.log.dir}audit-trails.log"/>
        <param name="Threshold" value="debug"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d: %-5p [%t] %c: %m%n"/>
        </layout>
</appender>

<logger name="org.jahia.modules.audittrail.consumer.log.LogAuditTrailConsumer">
        <level value="debug"/>
        <appender-ref ref="AuditTrails"/>
</logger>

Elasticsearch

Elasticsearch configuration

You can either use the Elasticsearch server contained in the audit-trails beta package, which is already configured for testing purposes, or follow these instructions to set up your own environment:

  1. Download Elasticsearch 5.5.0 from https://www.elastic.co/downloads/past-releases/elasticsearch-5-5-0
  2. Decompress the archive and open the config/elasticsearch.yml file
  3. Set the cluster name in that file: cluster.name: dx-audit-trail
  4. Start Elasticsearch: bin/elasticsearch

Note: In order to be able to use elasticsearch-head, you may need to add the following lines at the end of config/elasticsearch.yml:

http.cors.allow-origin: "*"
http.cors.enabled: true
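Once the server is started, it can be useful to confirm that it runs with the expected cluster name before configuring Jahia. The following is a minimal sketch, assuming the Elasticsearch HTTP port is left at its default (9200) on localhost; the function names are illustrative. It relies on the root endpoint of Elasticsearch, which returns a JSON document containing a cluster_name field.

```python
import json
from urllib.request import urlopen

EXPECTED_CLUSTER = "dx-audit-trail"


def cluster_name_from_root(payload: str) -> str:
    """Extract cluster_name from the JSON body returned by GET / on Elasticsearch."""
    return json.loads(payload)["cluster_name"]


def check_cluster(base_url: str = "http://localhost:9200") -> bool:
    """Return True when the running server reports the expected cluster name.

    The base URL is an assumption (default HTTP port on the local machine).
    """
    with urlopen(base_url, timeout=5) as response:
        body = response.read().decode("utf-8")
    return cluster_name_from_root(body) == EXPECTED_CLUSTER
```

Calling check_cluster() against a running server should return True if step 3 above was applied. Note that port 9200 is the HTTP port, while the connection configured later in Jahia uses port 9300.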

Elasticsearch Jahia modules

In order to save the collected events to an Elasticsearch server, you need to deploy the following modules in your Jahia server.

These two modules are used to create a connection to the elasticsearch server:

  • database-connector-2.0-SNAPSHOT.jar
  • elasticsearch-connector-1.0-SNAPSHOT.jar

Note that these modules are not specific to the audit trail modules: they are, or will be, used by any Jahia product that connects to an Elasticsearch server.

The following two modules are required to store the events in elasticsearch:

  • audit-trail-consumer-persistence-1.0.0-SNAPSHOT.jar
  • audit-trail-persistence-elasticsearch-1.0.0-SNAPSHOT.jar

The following module provides a new site setting, so it needs to be enabled on each site where you want to use it. It provides an interface to search for operations performed on contents.

  • audit-trail-search-panel-1.0.0-SNAPSHOT.jar

Setting up the Elasticsearch connection

Once all the modules are deployed, and the elasticsearch server started, you can configure the elasticsearch connection with the following steps:

Database connector

In the Jahia administration UI, go to Configuration -> Database Connector

  • Create a new connection by clicking on the New connection button
  • Select the elastic database type
  • Create a new Elasticsearch connector with the following settings:
    • Host: localhost
    • Port: 9300
    • Id: esConnection
    • Cluster Name: dx-audit-trail
  • Click Create

The connection is then created.

Audit Trail ES Setting

In the Jahia administration UI, go into Configuration -> Audit Trail ES Setting, and select the “esConnection” connection ID and click on Save.

Test the configuration

It is now time to test the configuration. Open one of your sites in edit mode and create a new page.

Log consumer

You should see an entry similar to the following one in your log files:

2017-09-22 14:55:07,016: INFO  [EventAdminThread #19] org.jahia.modules.audittrail.consumer.log.LogAuditTrailConsumer: Received audit event: org.osgi.service.event.Event@7c2ac6fe[
  topic=org/jahia/dx/audit/jcr/created
  properties={propertiesModified=[j:templateName, jcr:title], userProvider=default, workspace=default, siteKey=digitall, nodeTitle=test, nodePath=/sites/digitall/home/test, sessionLanguage=en, nodeUuid=253d2e19-d17c-4f71-9e1f-ad7c673771ff, nodeType=jnt:page, userId=root, timestamp=1506106507015}
]
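If you need to post-process such entries (for instance to build statistics from the log file), they can be parsed with a few lines of code. The following is a minimal sketch, assuming the entry format shown above; the function names are illustrative, and values that themselves contain commas (for example a node title with a comma) are not handled.

```python
import re
from typing import Dict, List, Union

Value = Union[str, List[str]]

SAMPLE_ENTRY = """2017-09-22 14:55:07,016: INFO  [EventAdminThread #19] org.jahia.modules.audittrail.consumer.log.LogAuditTrailConsumer: Received audit event: org.osgi.service.event.Event@7c2ac6fe[
  topic=org/jahia/dx/audit/jcr/created
  properties={propertiesModified=[j:templateName, jcr:title], userProvider=default, workspace=default, siteKey=digitall, nodeTitle=test, nodePath=/sites/digitall/home/test, sessionLanguage=en, nodeUuid=253d2e19-d17c-4f71-9e1f-ad7c673771ff, nodeType=jnt:page, userId=root, timestamp=1506106507015}
]"""


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not nested inside square brackets."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for char in text:
        if char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
        if char == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    if current:
        parts.append("".join(current).strip())
    return parts


def parse_audit_entry(entry: str) -> Dict[str, Value]:
    """Return the topic and the properties of one logged audit event."""
    topic = re.search(r"topic=(\S+)", entry)
    props = re.search(r"properties=\{(.*)\}", entry, re.DOTALL)
    if topic is None or props is None:
        raise ValueError("not an audit trail log entry")
    result: Dict[str, Value] = {"topic": topic.group(1)}
    for pair in split_top_level(props.group(1)):
        key, _, value = pair.partition("=")
        if value.startswith("[") and value.endswith("]"):
            # Bracketed values (e.g. propertiesModified) become lists.
            result[key] = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        else:
            result[key] = value
    return result


parsed = parse_audit_entry(SAMPLE_ENTRY)
```

All values are kept as strings (including the timestamp), since the log format carries no type information.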

Elasticsearch consumer

  • If you have elasticsearch-head, you can see a new entry:
{
    "_index": "dx-audit-trail-20170922",
    "_type": "event",
    "_id": "AV6q6AT4ufjQc09vc2Ji",
    "_version": 1,
    "_score": 1,
    "_source": {
        "propertiesModified": [
            "j:templateName",
            "jcr:title"
        ],
        "userProvider": "default",
        "workspace": "default",
        "siteKey": "digitall",
        "nodeTitle": "test",
        "topic": "org/jahia/dx/audit/jcr/created",
        "nodePath": "/sites/digitall/home/test",
        "sessionLanguage": "en",
        "nodeUuid": "ae9d7cc1-ceac-4602-90f0-09741d6d3df5",
        "nodeType": "jnt:page",
        "userId": "root",
        "timestamp": 1506105885941
    }
}
  • If you have enabled the audit trail search setting on your site, you can open it by going to the site settings and selecting “Audit trail search”, then directly click on the “Search” button:

audit-trails-6.png
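In the Elasticsearch entry shown above, the timestamp is a number of milliseconds since the Unix epoch, and the index name ends with a date. The following sketch converts such a timestamp to a readable date and derives the matching index name. The daily index pattern is inferred from this single example and is therefore an assumption, as is the use of UTC to compute the date; the function names are illustrative.

```python
from datetime import datetime, timezone

# Prefix taken from the "_index" value of the sample entry.
INDEX_PREFIX = "dx-audit-trail-"


def event_datetime(timestamp_ms: int) -> datetime:
    """Convert an event timestamp (milliseconds since the epoch) to a UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def daily_index_name(timestamp_ms: int) -> str:
    """Guess the index holding an event, assuming one index per UTC day."""
    return INDEX_PREFIX + event_datetime(timestamp_ms).strftime("%Y%m%d")


sample_timestamp = 1506105885941
sample_index = daily_index_name(sample_timestamp)
```

For the sample entry, this yields the index name shown in its "_index" field.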

Audit trail search panel

The audit trail search panel offers the possibility to search for operations that were saved in an elasticsearch server. This will allow you to retrieve the history of operations performed on your site contents, by narrowing the results using different filters (date range, user and path).

Search form

In the beta version of the search panel, it is possible to limit the operations to display by using several filters:

  • Date range: use this option to only display the operations performed within the selected dates
  • User: use this option to only display the operations performed by a specific user. To do so, please provide the exact username
  • Site section restriction: use this option to only display the operations performed under a specific section of the site

audit-trails-7.png
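The three filters of the search form map naturally to an Elasticsearch query. The following sketch builds a comparable search body; it is not the query used by the search panel. The field names come from the sample entry shown earlier, while the ".keyword" sub-fields assume the default dynamic mapping of Elasticsearch 5 and may not match the mapping created by the audit trail modules. The function name and its parameters are illustrative.

```python
from typing import Any, Dict, List, Optional


def build_search_body(
    start_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
    user_id: Optional[str] = None,
    path_prefix: Optional[str] = None,
    size: int = 50,
) -> Dict[str, Any]:
    """Build a search body restricted by date range, user and site section."""
    filters: List[Dict[str, Any]] = []

    # Date range filter on the event timestamp (milliseconds since the epoch).
    if start_ms is not None or end_ms is not None:
        bounds: Dict[str, int] = {}
        if start_ms is not None:
            bounds["gte"] = start_ms
        if end_ms is not None:
            bounds["lte"] = end_ms
        filters.append({"range": {"timestamp": bounds}})

    # Exact match on the username (assumed keyword sub-field).
    if user_id:
        filters.append({"term": {"userId.keyword": user_id}})

    # Site section restriction: every node whose path starts with the prefix.
    if path_prefix:
        filters.append({"prefix": {"nodePath.keyword": path_prefix}})

    query: Dict[str, Any] = {"bool": {"filter": filters}} if filters else {"match_all": {}}
    return {
        "query": query,
        # Newest operations first.
        "sort": [{"timestamp": {"order": "desc"}}],
        "size": size,
    }


body = build_search_body(
    start_ms=1506038400000,
    user_id="root",
    path_prefix="/sites/digitall/home",
)
```

Such a body could be sent to the _search endpoint of the audit trail indices, for instance from elasticsearch-head.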

In version 1, we intend to add more filter options:

  • Search by content UUID
  • Search by content path
  • Restrict results by operation type

Result table

In the beta version, the result table shows the following information:

  • Date of the operation
  • Operation type
    • jcr/created
    • jcr/moved
    • jcr/removed
    • jcr/updated
    • publication/published
  • Type of the content related to the operation
  • Displayable name: for display purpose only, we store the displayable name in the site default language at the time of the operation (as it can change during the lifetime of a content)
  • Path: path of the content when the operation happened. In case of a move operation, the path displayed is the new one
  • Identifier: UUID of the content
  • User: username of the user who performed the operation
  • User provider: provider of the user who performed the operation

The result table is by default sorted by date, from the newest operation to the oldest one.

audit-trails-8.png

The following improvements to the result table are planned for version 1:

  • Add a “Details” column in order to provide more information on some operations (including the modified properties, the publication language and the source path of a move operation)

Known limitations

Cluster

The beta version of the audit trails modules does not work in a cluster. We plan to fix this point in version 1.

Missing events

During the development of the beta version, we have seen that some operations do not generate any event:

  • Marking a content for deletion does not always generate an event, depending on the content type (it does for a page or a news list, but not for a simple text). This should be covered by tracking mixin updates
  • Removing a visibility condition
  • Break/Restore ACL inheritance

Mixin updates

We currently do not generate any event when a mixin is set on or removed from a content, unless the mixin update also adds or removes a property. For instance, selecting “Hide this page” in the edit engine of a page will not generate any event. We plan to fix this point in version 1.

Unpublication

No events are currently generated when a content is unpublished. We plan to fix this point in version 1.

Deletion by publication

In the beta version, publishing a content marked for deletion generates a publication/published operation. It is therefore not possible to know when a content has been deleted by a publication process. We plan to fix this point in version 1.

Deletion

The deletion of a content only generates an event for that content. This means that its child contents (e.g. subpages) do not have corresponding deletion events. This is a current technical limitation: generating events for all the deleted children could severely impact performance and could be unreliable.
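Until child deletions are tracked, the contents removed along with a parent can be approximated from the events recorded before the deletion. The following sketch keeps the last known path of each node and lists those located under the deleted path. It is an approximation under several assumptions: the full topic names are inferred from the operation types listed in the result table, the function name is illustrative, and descendants whose path changed silently (for instance because one of their ancestors was moved) will be missed or reported under a stale path.

```python
from typing import Dict, Iterable, List


def infer_removed_descendants(events: Iterable[Dict[str, str]], removed_path: str) -> List[str]:
    """List the last known paths located under removed_path.

    events must be in chronological order and stop before the deletion event.
    """
    latest_path: Dict[str, str] = {}
    for event in events:
        topic = event["topic"]
        if topic.endswith("/removed"):
            # The node no longer exists: forget it.
            latest_path.pop(event["nodeUuid"], None)
        elif topic.endswith(("/created", "/moved", "/updated")):
            # Remember where the node was last seen.
            latest_path[event["nodeUuid"]] = event["nodePath"]
    prefix = removed_path.rstrip("/") + "/"
    return sorted(path for path in latest_path.values() if path.startswith(prefix))


history = [
    {"topic": "org/jahia/dx/audit/jcr/created", "nodeUuid": "u1", "nodePath": "/sites/digitall/home/a"},
    {"topic": "org/jahia/dx/audit/jcr/created", "nodeUuid": "u2", "nodePath": "/sites/digitall/home/a/b"},
    {"topic": "org/jahia/dx/audit/jcr/created", "nodeUuid": "u3", "nodePath": "/sites/digitall/home/ab"},
    {"topic": "org/jahia/dx/audit/jcr/created", "nodeUuid": "u4", "nodePath": "/sites/digitall/home/a/c"},
    {"topic": "org/jahia/dx/audit/jcr/removed", "nodeUuid": "u4", "nodePath": "/sites/digitall/home/a/c"},
]

implicitly_removed = infer_removed_descendants(history, "/sites/digitall/home/a")
```

In this example, only /sites/digitall/home/a/b is reported: the "ab" page is a sibling, and the "c" subpage was already deleted explicitly.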

Modified properties

In the beta version, we only record which properties were modified. For internationalized properties, the language of the modified value is not recorded.

Renaming

Renaming a content, which means to change its system name, appears as a jcr/moved operation.
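Telling a rename apart from a real move requires the source path, which the beta version does not store (only the new path is recorded). As a sketch of how the distinction could be made once both paths are known, for instance by looking up the previous path recorded for the same UUID, the hypothetical function below compares the parent paths and the system names.

```python
import posixpath


def classify_move(source_path: str, target_path: str) -> str:
    """Classify a jcr/moved operation from its source and target paths."""
    source_parent, source_name = posixpath.split(source_path)
    target_parent, target_name = posixpath.split(target_path)
    if source_parent == target_parent:
        # Same parent: only the system name can have changed.
        return "rename" if source_name != target_name else "unchanged"
    # Different parent: a move, possibly combined with a new system name.
    return "move" if source_name == target_name else "move+rename"


renamed = classify_move("/sites/digitall/home/test", "/sites/digitall/home/test2")
moved = classify_move("/sites/digitall/home/test", "/sites/digitall/about/test")
```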

Modifications of ACL, vanity URLs and visibility conditions

In the JCR, ACLs, vanity URLs and visibility conditions are stored as subnodes of a content node. In the current version of the jcr-collector module, when the ACLs, vanity URLs or visibility conditions of a content are created or modified, the stored event refers to the subnode rather than to the content node itself. This means that the operation is not directly linked to the modified content in the report. For instance, if you edit the text of a simple text and, in the same operation, add a new vanity URL to it, you will have two entries in the report instead of one:

audit-trails-9.png
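When processing the stored events outside of Jahia, such entries can be linked back to the content they belong to by truncating the path at the technical subnode. The following sketch does this with a hypothetical helper. The subnode names used here (j:acl, vanityUrlMapping, j:conditionalVisibility) are assumptions about how Jahia names these nodes and should be checked against your repository.

```python
# Assumed names of the technical subnodes holding ACLs, vanity URLs
# and visibility conditions; verify them in your own JCR.
TECHNICAL_SUBNODES = ("j:acl", "vanityUrlMapping", "j:conditionalVisibility")


def owning_content_path(node_path: str) -> str:
    """Return the path of the content node that owns a technical subnode."""
    segments = node_path.split("/")
    for index, segment in enumerate(segments):
        if segment in TECHNICAL_SUBNODES:
            # Keep everything before the technical subnode.
            return "/".join(segments[:index]) or "/"
    # Regular content node: the path already points to the content.
    return node_path


vanity_owner = owning_content_path("/sites/digitall/home/test/vanityUrlMapping/my-url")
acl_owner = owning_content_path("/sites/digitall/home/test/j:acl/GRANT_u_bob")
```

Both calls resolve to /sites/digitall/home/test, so the two report entries of the example above can be grouped under the same content.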

Considered evolutions

The scope of features related to these audit trail modules can easily grow. The list below is given as an example of the evolutions considered by Jahia on this topic, that could potentially be found in future versions.

Publication history

We can imagine saving in elasticsearch all the information related to a publication: the publisher, the date, the list of involved contents (new, modified, deleted), the language. We could then have global reports about all the publications, and the details of a publication to see the list of contents that were part of it.

Content history report

This report could be pretty close to what can currently be achieved by specifying the UUID of the content and without filtering anything else. This content history report could be opened from different locations (Search form, publication history, etc.)

UI generated events

The events generated by the audit trail modules do not currently include the IP address of the user. In order to provide such information, we need to catch events at UI level. This would also allow us to distinguish between a move and a rename operation, to make it clear when a content was copied/cut and pasted, etc.

Server administration reports

It might be interesting to have reports on administrative operations such as module deployment/start/stop, operations on users (creation, update, etc.), server configuration, EDP operations (configuration/start/stop), etc.

Kibana integration

The way the events are saved in elasticsearch should allow the creation of Kibana reports. We would need to provide guidelines/instructions on how to include Kibana reports in a site setting panel.

Content archiving

The current design of the audit-trail-collector-jcr module should make it possible to store the JCR node(s) in Elasticsearch at content creation and update, without requiring major refactoring. This would open up a number of new possibilities: retrieving/restoring previous versions of a content, tracking exact modifications, searching on deleted contents, etc.