Audit trails
Introduction
Tracking operations performed on contents can be useful in many ways: such a history can help troubleshoot issues and provide reports and statistics about your site contents. Because needs regarding such information vary, the storage location is a key element: file system, database, etc.
The Jahia audit trail collection of modules makes it possible to store data about events/actions performed on Jahia contents in any external system, by developing the corresponding consumer module. This will allow you to keep records of the operations performed on your contents where you want, in the appropriate format.
These modules are currently available in a beta version.
In this initial version, we only keep records of operations performed at a low level (JCR events), meaning that we do not store information on “how” the operations were performed (for instance, we do not distinguish between a cut/paste and a move by drag-n-drop, as these operations are the same at JCR level). Our modular approach will allow us in the future to collect other types of events (e.g. UI level events). The modules generating events are called “collectors”.
Once these events have been generated, they are made available to “consumer” modules, which can then store them in different systems and in different formats. Jahia provides two consumer modules:
- one to log the events (so they can be stored on a file system)
- one to store these events in an elasticsearch server: this will allow us to query the events and provide useful information on the content histories
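The collector/consumer split described above can be pictured as a topic-based publish/subscribe channel. The following is a minimal Python sketch of that idea only: the `AuditEventBus` class and its methods are hypothetical names used for illustration, not the Jahia API, and the trailing `/*` wildcard rule is an assumption made for the example.

```python
from typing import Callable, Dict, List, Tuple

AuditEvent = Dict[str, object]
Consumer = Callable[[AuditEvent], None]


class AuditEventBus:
    """Hypothetical in-memory stand-in for the channel between collectors and consumers."""

    def __init__(self) -> None:
        self._consumers: List[Tuple[str, Consumer]] = []

    def subscribe(self, topic_pattern: str, consumer: Consumer) -> None:
        self._consumers.append((topic_pattern, consumer))

    @staticmethod
    def _matches(pattern: str, topic: str) -> bool:
        # Assumption: a trailing "/*" matches every sub-topic; anything else must match exactly.
        if pattern.endswith("/*"):
            return topic.startswith(pattern[:-1])
        return pattern == topic

    def publish(self, event: AuditEvent) -> None:
        # Every consumer whose pattern matches the event topic receives the event.
        for pattern, consumer in self._consumers:
            if self._matches(pattern, str(event["topic"])):
                consumer(event)


# Two consumers, mirroring the "log" and "store" roles described above.
log_lines: List[str] = []
stored: List[AuditEvent] = []

bus = AuditEventBus()
bus.subscribe("org/jahia/dx/audit/*", lambda e: log_lines.append(f"Received audit event: {e}"))
bus.subscribe("org/jahia/dx/audit/jcr/*", stored.append)

# A collector publishes one event; both consumers receive it.
bus.publish({"topic": "org/jahia/dx/audit/jcr/created", "nodePath": "/sites/digitall/home/test"})
```

The point of the design is that a collector does not need to know which consumers exist: adding a new storage target only requires a new subscriber.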
This is a system level feature, meaning that you cannot select on which sites you want to enable it. These modules will generate events for all the sites and users.
The list of tracked events is available here. The currently known limitations are described below.
We value your opinion on this topic. Please use the feedback form at the end of the page to share your comments and expectations.
Installation
System requirements for the beta version
- DX 7.3.0.0+ or 7.2.1.1+ running with:
  - Tomcat or WebSphere 8.5
    - JBoss will be supported in a future release
  - JDK 8
- Elasticsearch 5.5.0 (provided in the beta package)
Package download
Please download the audit-trail-modules beta package. It contains all the modules, in a beta version, mentioned in the following installation procedure, as well as a zip of an elasticsearch server.
Required modules
In order to use the audit trail modules, you need to deploy the following modules in Jahia:
- audit-trail-service-1.0.0-SNAPSHOT.jar
- audit-trail-collector-jcr-1.0.0-SNAPSHOT.jar
These modules are responsible for collecting JCR events and sharing them with consumer modules.
Logger
You need to deploy the module audit-trail-consumer-log-1.0.0-SNAPSHOT.jar in your Jahia server in order to log the events in the jahia.log file.
You can configure a new log file, dedicated for the storage of these events, by adding the following configuration to the WEB-INF/etc/config/log4j.xml file (it requires a server restart):
<appender name="AuditTrails" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="File" value="${jahia.log.dir}audit-trails.log"/>
    <param name="Threshold" value="debug"/>
    <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d: %-5p [%t] %c: %m%n"/>
    </layout>
</appender>
<logger name="org.jahia.modules.audittrail.consumer.log.LogAuditTrailConsumer">
    <level value="debug"/>
    <appender-ref ref="AuditTrails"/>
</logger>
Elasticsearch
Elasticsearch configuration
You can either use the elasticsearch server contained in the audit-trails beta package, which is already configured for testing purposes, or set up your own environment as follows:
- Download Elasticsearch 5.5.0 from https://www.elastic.co/downloads/past-releases/elasticsearch-5-5-0
- Decompress the archive and edit the following file:
config/elasticsearch.yml
- Set the following settings to these values:
cluster.name: dx-audit-trail
http.cors.enabled: true
http.cors.allow-origin: "*"
- Start Elasticsearch:
bin/elasticsearch
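Before moving on to the Jahia modules, it can be useful to confirm that the running node reports the expected cluster name. The sketch below assumes the Elasticsearch HTTP API is reachable on its default port (9200) at `localhost`; the function names are illustrative only.

```python
import json
from urllib.request import urlopen

EXPECTED_CLUSTER = "dx-audit-trail"


def cluster_matches(root_response: dict, expected: str = EXPECTED_CLUSTER) -> bool:
    """Return True when the root endpoint document reports the expected cluster name."""
    return root_response.get("cluster_name") == expected


def fetch_root(base_url: str = "http://localhost:9200") -> dict:
    """Fetch the JSON document served at the root of the Elasticsearch HTTP API.

    Assumption: the node listens on the default HTTP port and needs no authentication.
    """
    with urlopen(base_url, timeout=5) as response:
        return json.load(response)
```

With the server started, `cluster_matches(fetch_root())` is expected to be true when `cluster.name` was set as described above.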
Elasticsearch Jahia modules
In order to save the collected events to an Elasticsearch server, you need to deploy the following modules in your Jahia server.
These two modules are used to create a connection to the elasticsearch server:
- database-connector-2.0-SNAPSHOT.jar
- elasticsearch-connector-1.0-SNAPSHOT.jar
The following two modules are required to store the events in elasticsearch:
- audit-trail-consumer-persistence-1.0.0-SNAPSHOT.jar
- audit-trail-persistence-elasticsearch-1.0.0-SNAPSHOT.jar
The following module provides a new site setting with an interface to search for operations performed on contents. It therefore needs to be enabled on each site where you want to use it.
- audit-trail-search-panel-1.0.0-SNAPSHOT.jar
Setting up the Elasticsearch connection
Once all the modules are deployed, and the elasticsearch server started, you can configure the elasticsearch connection with the following steps:
Database connector
In the Jahia administration UI, go to Configuration -> Database Connector.
- Create a new connection by clicking on the New connection button
- Select the elastic database type
- Create a new ElasticSearch connector with the following settings:
Host: localhost
Port: 9300
Id: esConnection
Cluster Name: dx-audit-trail
- Click Create
The connection is then created.
Audit Trail ES Setting
In the Jahia administration UI, go to Configuration -> Audit Trail ES Setting, select the “esConnection” connection ID, and click on Save.
Test the configuration
It is now time to test the configuration. Go to edit mode for one of your sites, and create a new page.
Log consumer
You should see an entry similar to the following one in your log files:
2017-09-22 14:55:07,016: INFO [EventAdminThread #19] org.jahia.modules.audittrail.consumer.log.LogAuditTrailConsumer: Received audit event: org.osgi.service.event.Event@7c2ac6fe[ topic=org/jahia/dx/audit/jcr/created properties={propertiesModified=[j:templateName, jcr:title], userProvider=default, workspace=default, siteKey=digitall, nodeTitle=test, nodePath=/sites/digitall/home/test, sessionLanguage=en, nodeUuid=253d2e19-d17c-4f71-9e1f-ad7c673771ff, nodeType=jnt:page, userId=root, timestamp=1506106507015} ]
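If the log file is to feed reports, each entry has to be turned back into structured data. The following sketch parses the sample entry above into a dictionary. It assumes the `topic=... properties={...}` layout shown in that sample stays stable, and it will mis-split values that themselves contain commas (for example a title with a comma), so treat it as a starting point rather than a robust parser.

```python
import re
from typing import Dict, List, Union

Value = Union[str, List[str]]
EVENT_RE = re.compile(r"topic=(?P<topic>\S+)\s+properties=\{(?P<props>.*)\}")

SAMPLE = (
    "2017-09-22 14:55:07,016: INFO [EventAdminThread #19] "
    "org.jahia.modules.audittrail.consumer.log.LogAuditTrailConsumer: "
    "Received audit event: org.osgi.service.event.Event@7c2ac6fe[ "
    "topic=org/jahia/dx/audit/jcr/created "
    "properties={propertiesModified=[j:templateName, jcr:title], "
    "userProvider=default, workspace=default, siteKey=digitall, nodeTitle=test, "
    "nodePath=/sites/digitall/home/test, sessionLanguage=en, "
    "nodeUuid=253d2e19-d17c-4f71-9e1f-ad7c673771ff, nodeType=jnt:page, "
    "userId=root, timestamp=1506106507015} ]"
)


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not nested inside [...] lists."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for char in text:
        if char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
        if char == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    if current:
        parts.append("".join(current).strip())
    return parts


def parse_audit_line(line: str) -> Dict[str, Value]:
    """Extract the topic and the properties of one logged audit event."""
    match = EVENT_RE.search(line)
    if match is None:
        raise ValueError("not an audit event line")
    event: Dict[str, Value] = {"topic": match.group("topic")}
    for item in split_top_level(match.group("props")):
        key, _, value = item.partition("=")
        if value.startswith("[") and value.endswith("]"):
            # List values such as propertiesModified=[a, b]
            event[key] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            event[key] = value
    return event


event = parse_audit_line(SAMPLE)
```

All scalar values are kept as strings, including `timestamp`; convert them as needed by the report.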
Elasticsearch consumer
- If you have elasticsearch-head, you can see a new entry:
{
  "_index": "dx-audit-trail-20170922",
  "_type": "event",
  "_id": "AV6q6AT4ufjQc09vc2Ji",
  "_version": 1,
  "_score": 1,
  "_source": {
    "propertiesModified": [
      "j:templateName",
      "jcr:title"
    ],
    "userProvider": "default",
    "workspace": "default",
    "siteKey": "digitall",
    "nodeTitle": "test",
    "topic": "org/jahia/dx/audit/jcr/created",
    "nodePath": "/sites/digitall/home/test",
    "sessionLanguage": "en",
    "nodeUuid": "ae9d7cc1-ceac-4602-90f0-09741d6d3df5",
    "nodeType": "jnt:page",
    "userId": "root",
    "timestamp": 1506105885941
  }
}
- If you have enabled the audit trail search setting on your site, you can open it by going to the site settings and selecting “Audit trail search”, then clicking directly on the “Search” button.
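The sample document shown for elasticsearch-head is stored in an index named `dx-audit-trail-20170922`, which suggests one index per day. When looking for a given event by hand, the index name can be derived from the event timestamp. The sketch below assumes the day is computed in UTC; the module may use the server time zone instead, in which case events close to midnight could land in the neighbouring index.

```python
from datetime import datetime, timezone

INDEX_PREFIX = "dx-audit-trail-"


def daily_index_name(timestamp_ms: int) -> str:
    """Build the assumed daily index name for an event timestamp given in milliseconds."""
    day = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return INDEX_PREFIX + day.strftime("%Y%m%d")


# Timestamp of the sample document above.
index_name = daily_index_name(1506105885941)
```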
Audit trail search panel
The audit trail search panel offers the possibility to search for operations that were saved in an elasticsearch server. This will allow you to retrieve the history of operations performed on your site contents, by narrowing the results using different filters (date range, user and path).
Search form
In the beta version of the search panel, it is possible to limit the operations to display by using several filters:
- Date range: use this option to limit the displayed operations to the selected dates
- User: use this option to show only the operations performed by a specific user. The exact username must be provided
- Site section restriction: use this option to only display operations performed under a specific section of the site
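The three filters above map naturally onto an Elasticsearch query. The following sketch builds such a request body with the field names seen in the stored sample document (`timestamp`, `userId`, `nodePath`). It is not the query used by the search panel: the `.keyword` sub-fields assume that the index relies on the default dynamic string mapping of Elasticsearch 5.x, so check the actual mapping first.

```python
from typing import Any, Dict, List, Optional


def build_audit_query(
    start_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
    user_id: Optional[str] = None,
    path_prefix: Optional[str] = None,
    size: int = 50,
) -> Dict[str, Any]:
    """Build a search request body combining the date, user and site section filters."""
    filters: List[Dict[str, Any]] = []
    if start_ms is not None or end_ms is not None:
        bounds: Dict[str, int] = {}
        if start_ms is not None:
            bounds["gte"] = start_ms
        if end_ms is not None:
            bounds["lte"] = end_ms
        filters.append({"range": {"timestamp": bounds}})
    if user_id:
        # Exact match, as the search form expects the exact username.
        filters.append({"term": {"userId.keyword": user_id}})
    if path_prefix:
        # The trailing slash keeps /sites/a/home from matching /sites/a/homepage.
        # It also means the section node itself is not matched, only its descendants.
        filters.append({"prefix": {"nodePath.keyword": path_prefix.rstrip("/") + "/"}})
    return {
        "size": size,
        "query": {"bool": {"filter": filters}},
        "sort": [{"timestamp": {"order": "desc"}}],
    }


query = build_audit_query(user_id="root", path_prefix="/sites/digitall/home")
```

The resulting dictionary can be serialized to JSON and sent to the `_search` endpoint of the audit trail indices.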
In V1, we intend to add more filter options:
- Search by content UUID
- Search by content path
- Restrict results by operation type
Result table
In the beta version, the result table shows the following information:
- Date of the operation
- Operation type:
  - jcr/created
  - jcr/moved
  - jcr/removed
  - jcr/updated
  - publication/published
- Type of the content related to the operation
- Displayable name: for display purpose only, we store the displayable name in the site default language at the time of the operation (as it can change during the lifetime of a content)
- Path: path of the content when the operation happened. In case of a move operation, the path displayed is the new one
- Identifier: UUID of the content
- User: username of the user who performed the operation
- User provider: provider of the user who performed the operation
The result table is by default sorted by date, from the newest operation to the oldest one.
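To make the relation between a stored event and a table row concrete, the following sketch converts `_source` documents like the sample shown earlier into rows with the columns listed above, newest first. The column keys and the idea that the operation type is the topic without its `org/jahia/dx/audit/` prefix are assumptions drawn from the sample data, not a description of the panel's code. Dates are rendered in UTC.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

TOPIC_PREFIX = "org/jahia/dx/audit/"


def to_row(source: Dict[str, Any]) -> Dict[str, str]:
    """Map one stored event to the columns of the result table."""
    date = datetime.fromtimestamp(source["timestamp"] / 1000, tz=timezone.utc)
    topic = source["topic"]
    operation = topic[len(TOPIC_PREFIX):] if topic.startswith(TOPIC_PREFIX) else topic
    return {
        "date": date.strftime("%Y-%m-%d %H:%M:%S"),
        "operation": operation,
        "type": source["nodeType"],
        "name": source.get("nodeTitle", ""),
        "path": source["nodePath"],
        "identifier": source["nodeUuid"],
        "user": source["userId"],
        "user_provider": source["userProvider"],
    }


def to_table(sources: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Return the rows sorted from the newest operation to the oldest one."""
    newest_first = sorted(sources, key=lambda s: s["timestamp"], reverse=True)
    return [to_row(s) for s in newest_first]


sample_source = {
    "userProvider": "default",
    "nodeTitle": "test",
    "topic": "org/jahia/dx/audit/jcr/created",
    "nodePath": "/sites/digitall/home/test",
    "nodeUuid": "ae9d7cc1-ceac-4602-90f0-09741d6d3df5",
    "nodeType": "jnt:page",
    "userId": "root",
    "timestamp": 1506105885941,
}
rows = to_table([sample_source])
```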
The following improvements to the result table are planned for V1:
- Add a “Details” column in order to provide more information on some operations (including the modified properties, the publication language and the source path of a move operation)
Known limitations
Cluster
The beta version of the audit trails module does not work in a cluster. We plan to fix this in version 1.
Missing events
During the development of the beta version, we have seen that some operations do not always generate an event:
- Marking a content for deletion generates an event only for some content types (it does for a page, list news; it does not for simple texts): it should be covered by tracking mixin updates
- Removing a visibility condition
- Break/Restore ACL inheritance
Mixin updates
We currently do not generate any event when a mixin is set on or removed from a content, unless the mixin update also adds or removes a property. For instance, selecting “Hide this page” in the edit engine of a page will not generate any event. We plan to fix this in version 1.
Unpublication
No events are currently generated when a content is unpublished. We plan to fix this in version 1.
Deletion by publication
In the beta version, publishing a content marked for deletion will generate a publication/published operation. It is not possible to know when a content has been deleted by a publication process. We plan to fix this in version 1.
Deletion
The deletion of a content only generates an event for that given content. It means that the child contents (e.g. subpages) do not have corresponding deletion events. This is a current technical limitation: generating events for all the deleted children could severely degrade performance and could be unreliable.
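A report that needs to account for the deleted children can approximate them from the paths seen in earlier events. The sketch below is a hypothetical helper, not part of the modules: given the path of the removed content and the paths already recorded, it returns those located underneath. The result is only an approximation, since contents created before the audit trail was enabled, or moved without the report knowing, will be missed.

```python
from typing import Iterable, List


def implicitly_removed(removed_path: str, known_paths: Iterable[str]) -> List[str]:
    """Return the known paths located under the removed content, sorted and deduplicated."""
    prefix = removed_path.rstrip("/") + "/"
    return sorted({path for path in known_paths if path.startswith(prefix)})


known = [
    "/sites/digitall/home/test",
    "/sites/digitall/home/test/sub-a",
    "/sites/digitall/home/test/sub-a/text",
    "/sites/digitall/home/testing",
]
children = implicitly_removed("/sites/digitall/home/test", known)
```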
Modified properties
In the beta version, we only know which properties were modified. We do not have the language information in case of internationalized properties.
Renaming
Renaming a content, that is, changing its system name, appears as a jcr/moved operation.
Modifications of ACL, vanity URLs and visibility conditions
In the JCR, ACLs, vanity URLs and visibility conditions are stored as subnodes of a content node. In the current version of the jcr-collector module, when the ACLs, vanity URLs or visibility conditions of a content are created or modified, the stored event refers to the subnode rather than to the content node. This means that the operation is not directly linked to the modified content in the report. For instance, if you edit the text of a simple text and, in the same operation, add a new vanity URL to it, you will get two entries in the report instead of one.
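Until such events are linked to their content, a report can regroup them by walking up the event path. The following sketch does this with an assumed set of subnode names (`j:acl`, `vanityUrlMapping`, `j:conditionalVisibility`); these names and the example paths are assumptions to verify against your repository, and the helper itself is hypothetical.

```python
from typing import Tuple

# Assumed names of the technical subnodes; verify them against your repository.
TECHNICAL_SUBNODES = {
    "j:acl": "ACL",
    "vanityUrlMapping": "vanity URL",
    "j:conditionalVisibility": "visibility condition",
}


def owning_content(node_path: str) -> Tuple[str, str]:
    """Return (content_path, aspect) for the path carried by an event.

    aspect is "content" when the path does not go through a technical subnode.
    """
    segments = node_path.strip("/").split("/")
    for index, segment in enumerate(segments):
        if index > 0 and segment in TECHNICAL_SUBNODES:
            return "/" + "/".join(segments[:index]), TECHNICAL_SUBNODES[segment]
    return node_path, "content"


text_event = owning_content("/sites/digitall/home/test/area/simple-text")
vanity_event = owning_content("/sites/digitall/home/test/area/simple-text/vanityUrlMapping/my-url")
```

Grouping report entries by the returned content path would show the text edit and the vanity URL creation under the same content.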
Considered evolutions
The scope of features related to these audit trail modules can easily grow. The list below gives examples of the evolutions considered by Jahia on this topic, which could appear in future versions.
Publication history
We can imagine saving in elasticsearch all the information related to a publication: the publisher, the date, the list of involved contents (new, modified, deleted), the language. We could then have global reports about all the publications, and the details of a publication to see the list of contents that were part of it.
Content history report
This report could be pretty close to what can currently be achieved by specifying the UUID of the content and without filtering anything else. This content history report could be opened from different locations (Search form, publication history, etc.)
UI generated events
The events generated by the audit trail modules do not currently include the IP address of the user. In order to provide such information, we need to catch events at UI level. This would also allow us to distinguish between a move and a rename operation, to make it clear when a content was copied/cut and pasted, etc.
Server administration reports
It might be interesting to have reports regarding administrative operations such as module deployment/start/stop/etc., operations on users (creation/update/etc.), server configuration, EDP operations (configuration/start/stop), etc.
Kibana integration
The way the events are saved in elasticsearch should allow the creation of Kibana reports. We would need to provide guidelines/instructions on how to include Kibana reports in a site setting panel.
Content archiving
The current design of the audit-trail-collector-jcr module should make it possible to store the JCR node(s) in elasticsearch at content creation and update, without requiring a big refactoring. This would open a number of new possibilities: retrieving/restoring previous versions of a content, tracking exact modifications, searching on deleted contents, etc.