jCustomer logs, backups & background jobs
Some logging is configured in the centralized configuration file
<jcustomer-install-dir>/etc/unomi.custom.system.properties, using the properties starting with
org.apache.unomi.logs.*. If you need more fine-grained configuration changes, you can make them in the
<jcustomer-install-dir>/etc/org.ops4j.pax.logging.cfg file. By default, logging is routed to the
<jcustomer-install-dir>/data/log/karaf.log file. More details on how to tune logging settings, as well as on the log-related console commands, are given here: https://karaf.apache.org/manual/latest/#_log. One of the most useful console commands (especially in development) is log:tail, which continuously displays the log entries in the console.
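As a sketch of a fine-grained change in org.ops4j.pax.logging.cfg, assuming a Karaf version that uses the Log4j2-based pax-logging syntax (the logger id `unomi` is an arbitrary label chosen here):

```properties
# <jcustomer-install-dir>/etc/org.ops4j.pax.logging.cfg
# Raise the Apache Unomi loggers to DEBUG without touching the root logger
log4j2.logger.unomi.name = org.apache.unomi
log4j2.logger.unomi.level = DEBUG
```

Changes to this file are picked up at runtime, so no restart is required to adjust log levels.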
How to back up jCustomer?
Backing up your system minimizes the risk of losing your data, and a backup is a mandatory step in any upgrade or migration process. By default, jCustomer is configured to write its runtime data directly to the Elasticsearch server, which itself stores its data in the
<elasticsearch-install-dir>/data directory. There are several backup types, which serve different purposes:
- Elasticsearch snapshots
Elasticsearch offers a built-in backup mechanism known as snapshots. You can find more information about it here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html.
- Full jCustomer and Elasticsearch file system backup
Done by archiving the whole
<jcustomer-install-dir> and <elasticsearch-install-dir> folders, with the jCustomer and Elasticsearch processes stopped.
- Configuration backup
Done by archiving the configuration folders, such as
<jcustomer-install-dir>/etc and <elasticsearch-install-dir>/conf. If you have modified any bin/setenv files, back those up as well. This type of backup is usually done before and after planned configuration updates.
- Runtime data file system backup
Performed by archiving the
<elasticsearch-install-dir>/data folder. Useful for incremental (nightly) backups, and allows rolling back to a previous stable/consistent state in case of data corruption or loss. This procedure is however not recommended, because transient data will not be consistent; Elasticsearch snapshots should be preferred instead.
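As a sketch of the snapshot mechanism, assuming a single-node Elasticsearch reachable on localhost:9200 and a backup path already declared under path.repo in elasticsearch.yml (the repository name and location below are arbitrary examples):

```shell
# Register a shared file system snapshot repository
curl -X PUT "localhost:9200/_snapshot/jcustomer_backup" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/backups/jcustomer"}}'

# Take a snapshot of all indices and wait for it to complete
curl -X PUT "localhost:9200/_snapshot/jcustomer_backup/snapshot_1?wait_for_completion=true"

# List the snapshots stored in the repository
curl -X GET "localhost:9200/_snapshot/jcustomer_backup/_all"
```

On a multi-node cluster, the repository location must be a shared file system (for example an NFS mount) visible to every data and master node.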
The recommended way of backing up jCustomer is therefore the following:
- Set up and execute Elasticsearch snapshots. This is the only way to properly back up an Elasticsearch cluster with full data integrity guaranteed. It is recommended to automate the snapshot process, for example by using cron jobs that execute curl requests against the Elasticsearch snapshot API.
- Make a full configuration backup for both jCustomer and Elasticsearch.
- Back up any customizations you made (such as installed plugins) to jCustomer and Elasticsearch.
- (Optional) Full file system backup of jCustomer and Elasticsearch. This step is optional because, if you have properly performed steps 1, 2 and 3, you should be able to restore everything you need from those backups. However, to be on the safe side, a full file system backup is a good idea and doesn't require much work.
- Test your backup procedure to verify that everything is backed up properly and can effectively be restored. Remember that an untested backup procedure is the same as having no backup procedure.
Note that this backup procedure can also be used to copy an environment to a new cluster, even one of a smaller size (for example from 3 ES nodes down to 1 for staging/development purposes). This is one reason snapshots are used: they make this type of migration easier. Another way of doing this would be to temporarily set the number of replicas equal to the number of nodes, but this method only works for small data sets and is not recommended for large ones.
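The cron-based automation of step 1 could look like the following sketch, assuming a snapshot repository named jcustomer_backup has already been registered and Elasticsearch is reachable on localhost:9200 (the file name is hypothetical):

```shell
# /etc/cron.d/jcustomer-snapshots (hypothetical file name)
# Take a nightly Elasticsearch snapshot at 02:30. The snapshot name uses
# Elasticsearch date math, URL-encoded from <nightly-{now/d}>, so each run
# produces a date-stamped snapshot such as nightly-2024.01.31.
30 2 * * * root curl -s -X PUT "localhost:9200/_snapshot/jcustomer_backup/%3Cnightly-%7Bnow%2Fd%7D%3E"
```

Remember to also schedule cleanup of old snapshots (via DELETE requests or a snapshot lifecycle policy, where your Elasticsearch version supports it) so the repository does not grow unbounded.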
This section lists the background jobs that may be executed either by the jExperience Jahia modules or by jCustomer.
| Job | Frequency |
| --- | --- |
| Retrieve cluster information from jCustomer (nodes, hosts, load) in order to distribute load across all jCustomer nodes | Every hour, at 10 minutes past the hour |
| Cancel (unschedule) and remove orphaned jExperience action jobs when the corresponding content node is no longer present | |
| Ask jCustomer whether max hits are reached for optimization tests | |
The execution frequency of these jobs can be configured at your convenience. Be careful, though: disabling them may cause some features to stop working properly.
| Job | Frequency | Description |
| --- | --- | --- |
| Refresh all rules | Every 1 second | Reloads all rules into memory, to avoid a database read every time rule matching has to be evaluated for an event. |
| Refresh rules statistics | Every 10 seconds | Rule statistics are kept in memory. By default, every 10 seconds they are pushed and persisted to the database, to avoid too many save operations. Rules are executed and manipulated very often, which is why these timers are used. |
| Reload property types available for profiles | Every 10 seconds | May be disabled if you are not deploying new property types on existing instances, but note that jExperience may deploy new ones. |
| Purge profiles and sessions | Every 1 day | Queries all profiles and sessions that should be purged, based on the configured conditions: the max inactive time for a profile before it is considered for purge in the next execution (based on the profile property "last visit", properties.lastVisit); the max existing time for a profile (based on the profile property "first visit", properties.firstVisit); and the max existing time for the monthly indices (events and sessions) before they are considered for purge. |
| Reload condition types | Every 10 seconds | Reloads condition types and action types. Like the property types job, this is unnecessary on an existing instance that already has its condition types and action types configured; if you do not intend to introduce new condition or action types, this polling is useless. |
| Update profiles for past event conditions | Every 1 day (can't be modified) | Updates profiles for past event conditions. |
| Load all segments and scorings | Every 1 second | Loads all segments and scorings from the database and stores them in memory. Again, this is done to avoid direct database reads when manipulating segments and scorings: the in-memory data is manipulated directly, and segments and scorings are loaded asynchronously. |
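The refresh intervals above are typically tuned through the centralized configuration file mentioned earlier. As an illustrative sketch only, the property names below are assumptions based on common Apache Unomi naming and may differ between versions; check the reference unomi.custom.system.properties shipped with your installation for the exact keys:

```properties
# <jcustomer-install-dir>/etc/unomi.custom.system.properties
# Assumed property names -- verify against your version's reference file.
# Rules reload interval, in milliseconds (default: every second)
org.apache.unomi.rules.refresh.interval=1000
# Rule statistics persistence interval, in milliseconds
org.apache.unomi.rules.statistics.refresh.interval=10000
# Segments and scorings reload interval, in milliseconds
org.apache.unomi.segment.refresh.interval=1000
```

Lowering these intervals increases database and CPU load; raising them delays how quickly newly deployed rules, segments, and scorings take effect on all nodes.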