Setting up a cluster
Discovery
Overview
Discovery must be configured on every layer of the cluster architecture. This topic only covers the discovery configuration for the jExperience components, that is to say jCustomer and Elasticsearch. For the Jahia cluster discovery setup, refer to Installing, configuring and monitoring Jahia.
jCustomer relies on Apache Karaf Cellar, which in turn uses Hazelcast to discover and configure its cluster, so you simply need to install multiple jCustomer nodes on the same network. You can control most of the configuration through the centralized properties in <jcustomer-install-dir>/etc/unomi.custom.system.properties and, if needed, adjust the Hazelcast configuration in <jcustomer-install-dir>/etc/hazelcast.xml. Be aware that any changes to hazelcast.xml will have to be reapplied manually when updating jCustomer, so limiting your changes to unomi.custom.system.properties is the recommended approach.
All nodes on the same network that share the same cluster name will be part of the same cluster. You can find more information about how to configure Hazelcast here: http://docs.hazelcast.org/docs/3.4/manual/html/networkconfiguration.html
The Elasticsearch discovery configuration, however, must be done in the following file:
<elasticsearch-install-dir>/config/elasticsearch.yml
The documentation for the various discovery options and how they may be configured is available here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html
Depending on the cluster size, you will want to adjust the following parameters to make sure your setup is optimal in terms of performance and safety. Here is an example of a typical Elasticsearch cluster configuration:
script.engine.groovy.inline.update: on
# Protect against accidental close/delete operations
# on all indices. You can still close/delete individual
# indices
#action.disable_close_all_indices: true
#action.disable_delete_all_indices: true
#action.disable_shutdown: true
discovery.zen.ping.unicast.hosts: ["es1-unomi.apache.org","es2-unomi.apache.org","es3-unomi.apache.org", "es4-unomi.apache.org"]
network.host: es1-unomi.apache.org
transport.tcp.compress: true
cluster.name: contextElasticSearchExample
http.cors.enabled: true
http.cors.allow-origin: "*"
Multicast
Multicast configuration is not supported for jCustomer or Elasticsearch, so please avoid using it. Use the Unicast configuration instead (see below).
Unicast
jCustomer
The configuration for Unicast must be done in the <jcustomer-install-dir>/etc/unomi.custom.system.properties file on all nodes:
org.apache.unomi.hazelcast.tcp-ip.members=${env:UNOMI_HAZELCAST_TCPIP_MEMBERS:-127.0.0.1}
org.apache.unomi.hazelcast.tcp-ip.interface=${env:UNOMI_HAZELCAST_TCPIP_INTERFACE:-127.0.0.1}
org.apache.unomi.hazelcast.network.port=${env:UNOMI_HAZELCAST_NETWORK_PORT:-5701}
The members property accepts a comma-separated list of IP addresses (or hostnames) of the initial members of the Hazelcast cluster.
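As an illustration, here is a minimal sketch of what these properties could look like on the first node of a three-node cluster (the addresses are purely hypothetical; the interface value should be the node's own internal address):
# Hypothetical internal addresses of the three jCustomer nodes
org.apache.unomi.hazelcast.tcp-ip.members=10.0.1.11,10.0.1.12,10.0.1.13
# Address of this particular node
org.apache.unomi.hazelcast.tcp-ip.interface=10.0.1.11
org.apache.unomi.hazelcast.network.port=5701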
Elasticsearch
In production, you must use unicast zen discovery. This works by providing Elasticsearch a list of nodes that it should try to contact. Once the node contacts a member of the unicast list, it will receive a full cluster state that lists all nodes in the cluster. It will then proceed to contact the master and join. The unicast protocol is activated by providing a list of hosts (IP + port number) to contact. The following change has to be applied to the <elasticsearch-install-dir>/config/elasticsearch.yml file:
discovery.zen.ping.unicast.hosts: ["es1-unomi.apache.org:9300", "es2-unomi.apache.org:9300"]
jCustomer key and event security
As documented in jCustomer key and event security in Installing Elasticsearch, jCustomer, and jExperience, in a cluster the event security IP address list must be updated to include the IPs of all the Jahia servers that will be talking to jCustomer. For example, you must create or modify the unomi.custom.system.properties file to look something like this:
org.apache.unomi.thirdparty.provider1.key=<generated-unomi-key>
org.apache.unomi.thirdparty.provider1.ipAddresses=127.0.0.1,::1,127.0.0.2,127.0.0.3
Startup sequence
Although jCustomer will wait for the Elasticsearch instance to be available before actually starting, there are situations in which multiple jCustomer nodes might compete for resource creation if all nodes are started in parallel.
The very first time an environment is created, it is therefore recommended to launch a single jCustomer node, let it start fully (on first startup it creates various resources in Elasticsearch), and only then start the other jCustomer nodes.
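As a rough sketch (assuming the standard Karaf start script and the default karaf/karaf credentials and port 9443 mentioned later on this page), the sequence on the first node could look like this:
# On the first jCustomer node only:
<jcustomer-install-dir>/bin/start
# Wait until the REST API answers before starting the other nodes
until curl -k -s -f -o /dev/null -u karaf:karaf https://localhost:9443/cxs/cluster; do
  sleep 5
done
# Once this loop exits, start jCustomer on the remaining nodes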
Recommended configurations
This section provides examples of jCustomer settings for Elasticsearch for different cluster sizes. All of these must be configured in the following file (which must be created if it doesn't exist):
unomi.custom.system.properties
These settings mostly concern replication, which makes the setup more resilient to failures. For more information about Elasticsearch replication, please consult the following resource: https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html
The recommended size for an Elasticsearch cluster is three nodes, which strikes a good balance between reliability, scalability, and performance. It is possible to run smaller clusters (even a single node), but those will require downtime should anything happen to a node. Also, since replicas are only recommended with three nodes or more, no redundancy will exist in the system and only backups (using Elasticsearch snapshots) will protect against failures.
Three node cluster
This configuration defines one replica, since three nodes are enough to use replicas without affecting performance too much. On all three nodes the configuration should look like this:
org.apache.unomi.elasticsearch.monthlyIndex.nbShards=5
org.apache.unomi.elasticsearch.monthlyIndex.nbReplicas=1
org.apache.unomi.elasticsearch.defaultIndex.nbShards=5
org.apache.unomi.elasticsearch.defaultIndex.nbReplicas=1
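For a single-node setup (where, as noted above, replicas cannot provide any redundancy), a sketch of the equivalent settings could be:
# Single node: no replicas are possible, rely on Elasticsearch snapshots for recovery
org.apache.unomi.elasticsearch.monthlyIndex.nbReplicas=0
org.apache.unomi.elasticsearch.defaultIndex.nbReplicas=0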
Clustering on Amazon Web Service
One critical thing when setting up Elasticsearch clusters on EC2 is the value of network.host in <elasticsearch-install-dir>/config/elasticsearch.yml. The trick is to use the VPN/internal IP as the network host (for instance, network.host: "_eth0:ipv4_" - see https://www.elastic.co/guide/en/elasticsearch/reference/1.6/modules-network.html) to be sure to get the right IP if it is dynamic, and to use the public IP in the unicast discovery list. The default value "192.168.0.1" doesn't work on AWS.
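A minimal elasticsearch.yml sketch for an EC2 node could therefore look like this (the host names are purely illustrative):
# Bind to the instance's internal interface, even if its address is dynamic
network.host: "_eth0:ipv4_"
# Addresses reachable by the other nodes, used for unicast discovery
discovery.zen.ping.unicast.hosts: ["es1-unomi.apache.org:9300", "es2-unomi.apache.org:9300", "es3-unomi.apache.org:9300"]
cluster.name: contextElasticSearchExample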
Validating the cluster installation
Elasticsearch cluster validation
First, check the logs of all the Elasticsearch cluster nodes and make sure there are no errors, especially errors or warnings about node-to-node communication. If you see any messages about trouble finding the master (see the troubleshooting section below), there is probably a problem with the installation.
You can then check the status of the Elasticsearch cluster by accessing the following URL:
http://ES_NODE_IP_ADDRESS:9200/_cat/health?v
You should see a green status. If not, check your installation because something is not set up correctly.
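For example, using curl (the address is a placeholder for one of your Elasticsearch nodes; the output below is only indicative of what a healthy three-node cluster might report):
curl "http://ES_NODE_IP_ADDRESS:9200/_cat/health?v"
# epoch      timestamp cluster                     status node.total node.data shards ...
# 1538473918 10:31:58  contextElasticSearchExample green  3          3         10     ...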
jCustomer cluster validation
First, check the logs in the data/logs directory and make sure there are no errors. If you find any errors, you should check your setup and fix any problems before going any further.
You can then perform the following request to test the status of the jCustomer cluster:
https://UNOMI_NODE_IP_ADDRESS:9443/cxs/cluster
If you get a warning about the site certificate, that is normal: by default jCustomer ships with a self-signed certificate (which you should replace with your own before going into production). You will be prompted for a user name and password (karaf/karaf by default, which you should also change for any permanent installation). If everything went well, you should get back a JSON structure listing the active nodes in the cluster. Make sure that everything looks right before continuing the cluster installation.
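The same check can be scripted, for example with curl (the -k flag skips certificate validation for the default self-signed certificate; the credentials and host are the defaults and placeholder mentioned above):
curl -k -u karaf:karaf "https://UNOMI_NODE_IP_ADDRESS:9443/cxs/cluster"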
Cluster troubleshooting
In jCustomer, you might want to look at the cluster command lines available in Apache Karaf. Depending on how you launched Karaf, you may either use them directly on the console or through an SSH connection to the server. To connect through SSH, simply do:
ssh -p 8102 karaf@unomi
The default password is "karaf". You should really change the default password upon installation by modifying the <jcustomer-install-dir>/etc/unomi.custom.system.properties file and changing the value of the org.apache.unomi.security.root.password property. To get a list of available commands, simply type the following on the command line:
help
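For example, assuming the Karaf Cellar cluster commands are available (which should be the case since jCustomer relies on Cellar), you can list the nodes that Cellar sees with:
cluster:node-list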
In order to monitor the state of your jCustomer Elasticsearch cluster, two URLs are quite useful:
http://localhost:9200/_cluster/health?pretty : This URL retrieves the health of your Elasticsearch cluster (green is good). If you face communication issues between your cluster nodes, it will be orange or red. Make sure that all the nodes of your cluster are started in order to get a better idea of your cluster health.
https://UNOMI_NODE_IP_ADDRESS:9443/cxs/cluster : This URL gives you a detailed state of your jCustomer node in the cluster.
The following are examples of error and warning messages you might encounter in the Elasticsearch logs:
Caused by: org.elasticsearch.action.UnavailableShardsException: [context][1] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: index
[2017-02-08T10:15:23,571][WARN ][o.e.d.z.ElectMasterService] [RlyE1OZ] value for setting "discovery.zen.minimum_master_nodes" is too low. This can result in data loss! Please set it to at least a quorum of master-eligible nodes (current value: [-1], total number of master-eligible nodes used for publishing in this round: [2])
These messages appear when the cluster could not find enough master-eligible nodes (or enough active shard copies). Please note that while this warning appears, queries to Elasticsearch will not work properly, so this problem should be resolved before setting up jExperience on top of jCustomer. Please refer directly to the Elasticsearch documentation for more information: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
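For instance, in a three-node cluster where all nodes are master-eligible, a quorum of two would be configured as follows in elasticsearch.yml (this fragment only addresses the warning above; adapt the value to your own topology):
# quorum = (number of master-eligible nodes / 2) + 1, i.e. 2 for a 3-node cluster
discovery.zen.minimum_master_nodes: 2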
Apache HTTP server setup
Here is an example of how to set up an Apache HTTP server as a load balancer in front of three jCustomer nodes:
<IfModule mod_ssl.c>
<VirtualHost *:443>
ServerName unomi.acme.com
ServerAdmin monitor@acme.com
DocumentRoot /var/www/html
CustomLog /var/log/apache2/access-unomi.acme.com.log combined
ErrorLog /var/log/apache2/error-unomi.acme.com.log
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
<Directory /var/www/html>
Options FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
</Directory>
<Location /cxs>
#localhost subnet, add your own IPs for your Jahia servers here or jExperience might not work properly.
Require ip 127.0.0.1 10.100
</Location>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
RewriteRule .* - [F]
ProxyPreserveHost On
ProxyPass /server-status !
ProxyPass /robots.txt !
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp
RewriteRule ^.* - [F,L]
ProxyPass / balancer://unomi_cluster/
ProxyPassReverse / balancer://unomi_cluster/
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
<Proxy balancer://unomi_cluster>
BalancerMember http://unomi-node01.int.acme.com:8181 route=1 connectiontimeout=20 timeout=300 ttl=120
BalancerMember http://unomi-node02.int.acme.com:8181 route=2 connectiontimeout=20 timeout=300 ttl=120
BalancerMember http://unomi-node03.int.acme.com:8181 route=3 connectiontimeout=20 timeout=300 ttl=120
ProxySet lbmethod=bytraffic stickysession=ROUTEID
</Proxy>
RemoteIPHeader X-Forwarded-For
Include ssl-common.conf
BrowserMatch "MSIE [2-6]" \
nokeepalive ssl-unclean-shutdown \
downgrade-1.0 force-response-1.0
BrowserMatch "MSIE [17-9]" ssl-unclean-shutdown
# HSTS (mod_headers is required) (15768000 seconds = 6 months)
Header always set Strict-Transport-Security "max-age=15768000"
</VirtualHost>
</IfModule>
As you can see, the mod_proxy and mod_proxy_balancer modules are used to perform load balancing, using a ROUTEID cookie for sticky sessions.
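On Debian/Ubuntu-style installations, the required modules can be enabled as follows (a sketch; the module list is inferred from the directives used above and may vary with your Apache version):
a2enmod ssl rewrite headers remoteip proxy proxy_http proxy_balancer lbmethod_bytraffic
systemctl restart apache2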