Setting up a cluster
6.1 Discovery
6.1.1 Overview
Discovery must be configured on every layer of the cluster architecture. In this document we will only detail the discovery configuration that is possible for the Marketing Factory elements, that is to say the Apache Unomi and ElasticSearch components. For DX cluster discovery setup, please refer to the DX "Configuration and Fine Tuning Guide" documentation.
Apache Unomi relies on Apache Karaf Cellar, which in turn uses Hazelcast to discover and configure its cluster. You just need to install multiple Apache Unomi instances on the same network, and then (optionally) change the Hazelcast configuration in the following file: <cxs-install-dir>/etc/hazelcast.xml
All nodes on the same network that share the same cluster name will be part of the same cluster. You can find more information about how to configure Hazelcast here: http://docs.hazelcast.org/docs/3.4/manual/html/networkconfiguration.html
The ElasticSearch discovery configuration, however, must be done in the following file:
<elasticsearch-install-dir>/config/elasticsearch.yml
The documentation for the various discovery options and how they may be configured is available here: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html
Depending on the cluster size, you will want to adjust the following parameters to make sure your setup is optimal in terms of performance and safety. Here is an example of a typical ElasticSearch cluster configuration:
script.engine.groovy.inline.update: on
# Protect against accidental close/delete operations
# on all indices. You can still close/delete individual
# indices
#action.disable_close_all_indices: true
#action.disable_delete_all_indices: true
#action.disable_shutdown: true
discovery.zen.ping.unicast.hosts: ["es1-unomi.apache.org","es2-unomi.apache.org","es3-unomi.apache.org", "es4-unomi.apache.org"]
network.host: es1-unomi.apache.org
transport.tcp.compress: true
cluster.name: contextElasticSearchExample
http.cors.enabled: true
http.cors.allow-origin: "*"
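Once the nodes are started with a configuration like the one above, a quick way to confirm that they all joined the same cluster is to query the _cat/nodes endpoint on each of them; every node should report the same member list. A minimal sketch, using the example host names from the configuration above (adjust to your own):

```shell
#!/bin/sh
# Query each node's view of the cluster; every node should report the
# same member list. Host names are the examples from the configuration above.
for host in es1-unomi.apache.org es2-unomi.apache.org es3-unomi.apache.org es4-unomi.apache.org; do
  url="http://$host:9200/_cat/nodes?v"
  echo "querying: $url"
  # curl -s "$url"   # uncomment to run against the live cluster
done
```

The curl call is commented out so the loop can be dry-run; _cat/nodes is a standard ElasticSearch API.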
6.1.2 Multicast
Multicast makes it easier to set up cluster nodes with "automatic" discovery, which can be very useful for quickly setting up environments such as test environments. However, it is not recommended for production, or for any environment where multiple installs might co-exist on the same network, unless the installers have solid experience with multicast network setups. It is also only available for Apache Unomi, as ElasticSearch no longer supports multicast discovery.
6.1.2.1 Apache Unomi
For multicast configuration in Apache Unomi, you will need to modify the <cxs-install-dir>/etc/hazelcast.xml
file, more specifically the following section:
<join>
<multicast enabled="true">
<multicast-group>224.2.2.3</multicast-group>
<multicast-port>54327</multicast-port>
</multicast>
</join>
6.1.2.2 ElasticSearch
Multicast is no longer supported by ElasticSearch; you must use unicast or one of the cloud discovery plugins (Amazon EC2, Microsoft Azure, Google Compute Engine).
6.1.3 Unicast
6.1.3.1 Apache Unomi
The configuration for Unicast must be done in the <cxs-install-dir>/etc/hazelcast.xml
file on all nodes:
<join>
<multicast enabled="false">
<multicast-group>224.2.2.3</multicast-group>
<multicast-port>54327</multicast-port>
</multicast>
<tcp-ip enabled="true">
<interface>127.0.0.1</interface>
</tcp-ip>
</join>
6.1.3.2 ElasticSearch
In production, you must use unicast zen discovery. This works by providing ElasticSearch with a list of nodes that it should try to contact. Once the node contacts a member of the unicast list, it will receive a full cluster state that lists all nodes in the cluster. It will then proceed to contact the master and join. The unicast protocol is activated by providing a list of hosts (IP + port number) to contact. The following changes have to be applied to the <elasticsearch-install-dir>/config/elasticsearch.yml
file:
discovery.zen.ping.unicast.hosts: ["es1-unomi.apache.org:9300", "es2-unomi.apache.org:9300"]
6.2 Unomi key and event security
As documented in section 3.2.5 of the Installation guide, in a cluster the event security IP address list must be updated to include the IPs of all the DX servers that will be talking to it. For example, you must modify the org.apache.unomi.thirdparty.cfg file to look something like this:
thirdparty.provider1.key=<generated-unomi-key>
thirdparty.provider1.ipAddresses=127.0.0.1,::1,127.0.0.2,127.0.0.3
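When adding a new DX node to an existing cluster, the IP list can be updated with a one-line sed edit. A sketch, assuming a hypothetical install path and example IPs:

```shell
#!/bin/sh
# Hypothetical install path and IP addresses, for illustration only.
CFG="/opt/unomi/etc/org.apache.unomi.thirdparty.cfg"
NEW_IPS="127.0.0.1,::1,10.0.1.11,10.0.1.12"
echo "thirdparty.provider1.ipAddresses=$NEW_IPS"
# sed -i "s|^thirdparty.provider1.ipAddresses=.*|thirdparty.provider1.ipAddresses=$NEW_IPS|" "$CFG"
# Restart the Unomi node afterwards so the change is picked up.
```

The sed line is commented out so the snippet can be dry-run; it rewrites the ipAddresses property in place.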
6.3 Recommended configurations
In this section we provide examples of Apache Unomi's ElasticSearch persistence settings for different cluster sizes. All of these must be configured in the following file:
org.apache.unomi.persistence.elasticsearch.cfg
This concerns mostly replication, which makes the setup more resistant to failures. For more information about ElasticSearch replication, please consult the following resource: https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html
The recommended size for an ElasticSearch cluster is 3 nodes, which strikes a good balance between reliability, scalability and performance. It is possible to run smaller clusters (even with a single node), but those will require downtime should anything happen to a node. Also, since replicas are only recommended above 2 nodes, no redundancy will exist in the system, and only backups (using ElasticSearch snapshots) will protect against failures.
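For such small clusters, snapshots are the only safety net. The commands below sketch how a filesystem snapshot repository could be registered and used; the repository name "unomi_backup" and location "/mnt/es-backups" are assumptions, and the location must also be whitelisted via path.repo in elasticsearch.yml:

```shell
#!/bin/sh
# Print the snapshot commands; run them against a live node.
ES="http://localhost:9200"
cat <<EOF
curl -XPUT "$ES/_snapshot/unomi_backup" -d '{"type":"fs","settings":{"location":"/mnt/es-backups"}}'
curl -XPUT "$ES/_snapshot/unomi_backup/snapshot_1?wait_for_completion=true"
EOF
```

The first command registers the repository, the second takes a snapshot of all indices; both use the standard ElasticSearch snapshot API.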
6.3.1 3-node cluster
This configuration uses one replica, since three nodes are enough to use replicas without affecting performance too much. On all three nodes the configuration should look like this:
numberOfReplicas=1
numberOfShards=5
monthlyIndex.numberOfReplicas=1
monthlyIndex.numberOfShards=5
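To see what these numbers mean for capacity: each index has numberOfShards primaries, and each primary gets numberOfReplicas additional copies, so the cluster stores shards * (replicas + 1) shard copies spread across the nodes. A quick sanity check for the 3-node configuration above:

```shell
#!/bin/sh
SHARDS=5
REPLICAS=1
NODES=3
TOTAL=$(( SHARDS * (REPLICAS + 1) ))
echo "total shard copies per index: $TOTAL"        # 10 copies
echo "average copies per node: $(( TOTAL / NODES ))"
```

With one replica, losing any single node still leaves at least one copy of every shard, which is why the 3-node setup can survive a node failure without downtime.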
6.4 Clustering on Amazon Web Service
One critical thing when setting up ElasticSearch clusters on EC2 is the value of network.host in the <elasticsearch-install-dir>/config/elasticsearch.yml
file. The trick is to use the internal (VPC) IP as the network host (for instance, network.host: _eth0:ipv4_ - see https://www.elastic.co/guide/en/elasticsearch/reference/1.6/modules-network.html) to be sure to get the right IP if it is dynamic, and to use the public IP in the unicast discovery list. The default value "192.168.0.1" does not work on AWS.
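Putting this together, an EC2 node's elasticsearch.yml could look like the following sketch (host names are hypothetical; the instance's internal IPv4 address can also be looked up via the EC2 metadata service at http://169.254.169.254/latest/meta-data/local-ipv4):

```yaml
# Bind to the instance's internal IPv4 address
network.host: _eth0:ipv4_
# Hypothetical addresses for the other cluster members
discovery.zen.ping.unicast.hosts: ["es1.example.com:9300", "es2.example.com:9300"]
```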
6.5 Validating the cluster installation
6.5.1 ElasticSearch cluster validation
First, check the logs of all the ElasticSearch cluster nodes and make sure there are no errors, especially errors or warnings about node-to-node communication. If you see any messages about trouble finding a master node (see the troubleshooting section below), there is probably a problem with the installation.
You can then check the status of the ElasticSearch cluster by accessing the following URL:
http://ES_NODE_IP_ADDRESS:9200/_cat/health?v
You should see a green status. If not, check your installation because something is not set up correctly.
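The status can also be checked from a script. A minimal sketch of the interpretation logic; the curl call is commented out so the snippet can be dry-run, and the placeholder should be replaced with a real query against your node:

```shell
#!/bin/sh
# status=$(curl -s "http://ES_NODE_IP_ADDRESS:9200/_cat/health?h=status")
status="green"   # placeholder value for the dry run
case "$status" in
  green)  echo "cluster healthy" ;;
  yellow) echo "all primaries allocated, some replicas are not" ;;
  *)      echo "cluster problem: $status"; exit 1 ;;
esac
```

A yellow status usually means replicas could not be allocated (for example, fewer nodes than replicas require); red means some primary shards are missing.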
6.5.2 Apache Unomi cluster validation
First, check the logs in the data/logs directory and make sure there are no errors. If you find any errors, you should check your setup and fix any problems before going any further.
You can then perform the following request to test the status of the Apache Unomi cluster:
https://UNOMI_NODE_IP_ADDRESS:9443/cxs/cluster
If you get a warning about the site certificate that is normal because by default Apache Unomi ships with a self-signed certificate (which you should replace with your own once going in production). You will be prompted for a user name and password (by default karaf/karaf which you should also change for any permanent installation). If everything went well, you should get back a JSON structure indicating the active nodes in the cluster. Make sure that everything looks right before continuing the cluster install.
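The same check can be scripted with curl. A sketch, assuming the default self-signed certificate (hence -k) and the default credentials, both of which should be changed in production:

```shell
#!/bin/sh
UNOMI_NODE="UNOMI_NODE_IP_ADDRESS"   # replace with a real node address
cmd="curl -k -u karaf:karaf https://$UNOMI_NODE:9443/cxs/cluster"
echo "run: $cmd"
# $cmd   # uncomment to run against a live node; the JSON response lists the active nodes
```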
6.6 Cluster troubleshooting
In Apache Unomi, you might want to look at the cluster command lines available in Apache Karaf. Depending on how you launched Karaf, you may either use them directly on the console, or through an SSH connection to the server. To connect through SSH simply do:
ssh -p 8102 karaf@unomi
The default password is "karaf". You should really change this upon installation by modifying the <cxs-install-dir>/etc/users.properties
file. To get a list of available commands, simply type on the command line:
help
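The Cellar commands are the most useful ones for cluster troubleshooting, and they can also be invoked non-interactively over the same SSH connection. A sketch (the host name "localhost" is an assumption; use your node's address):

```shell
#!/bin/sh
# cluster:node-list shows the Cellar nodes, cluster:group-list their groups.
# The commands are printed here; run them against your Karaf SSH console.
for cmd in "cluster:node-list" "cluster:group-list"; do
  echo "ssh -p 8102 karaf@localhost $cmd"
done
```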
In order to monitor the state of your Unomi ElasticSearch cluster, two URLs are quite useful:
- http://ES_NODE_IP_ADDRESS:9200/_cat/health?v : retrieves the health of your ElasticSearch cluster (green is good). If you face communication issues between your cluster nodes, it will be yellow or red. Make sure that all the nodes of your cluster are started in order to get a better idea of your cluster health.
- https://UNOMI_NODE_IP_ADDRESS:9443/cxs/cluster : gives you a detailed state of your Unomi node in the cluster.
Two log messages frequently seen in misconfigured clusters are worth knowing. The first appears when not enough shard copies are available to satisfy write consistency:
Caused by: org.elasticsearch.action.UnavailableShardsException: [context][1] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: index
The second appears when the cluster could not find enough master-eligible nodes:
[2017-02-08T10:15:23,571][WARN ][o.e.d.z.ElectMasterService] [RlyE1OZ] value for setting "discovery.zen.minimum_master_nodes" is too low. This can result in data loss! Please set it to at least a quorum of master-eligible nodes (current value: [-1], total number of master-eligible nodes used for publishing in this round: [2])
Please note that while this warning appears, queries to ElasticSearch will not work properly, so the problem should be resolved before setting up Marketing Factory on top of Apache Unomi. Please refer directly to the ElasticSearch documentation for more information: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
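The usual fix is to set discovery.zen.minimum_master_nodes on every node to a quorum of master-eligible nodes, i.e. (number of master-eligible nodes / 2) + 1. For the 3-node cluster recommended above, this gives:

```yaml
# quorum for 3 master-eligible nodes: (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2
```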
6.7 Apache HTTP server setup
Here is an example of how to setup an Apache HTTP server to add as a load-balancer in front of 3 Apache Unomi nodes:
<IfModule mod_ssl.c>
<VirtualHost *:443>
ServerName unomi.acme.com
ServerAdmin monitor@acme.com
DocumentRoot /var/www/html
CustomLog /var/log/apache2/access-unomi.acme.com.log combined
ErrorLog /var/log/apache2/error-unomi.acme.com.log
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
<Directory /var/www/html>
Options FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
</Directory>
<Location /cxs>
#localhost subnet, add your own IPs for your DX servers here or MF might not work properly.
Require ip 127.0.0.1 10.100
</Location>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^(TRACE|TRACK)
RewriteRule .* - [F]
ProxyPreserveHost On
ProxyPass /server-status !
ProxyPass /robots.txt !
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp
RewriteRule ^.* - [F,L]
ProxyPass / balancer://unomi_cluster/
ProxyPassReverse / balancer://unomi_cluster/
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
<Proxy balancer://unomi_cluster>
BalancerMember http://unomi-node01.int.acme.com:8181 route=1 connectiontimeout=20 timeout=300 ttl=120
BalancerMember http://unomi-node02.int.acme.com:8181 route=2 connectiontimeout=20 timeout=300 ttl=120
BalancerMember http://unomi-node03.int.acme.com:8181 route=3 connectiontimeout=20 timeout=300 ttl=120
ProxySet lbmethod=bytraffic stickysession=ROUTEID
</Proxy>
RemoteIPHeader X-Forwarded-For
Include ssl-common.conf
BrowserMatch "MSIE [2-6]" \
nokeepalive ssl-unclean-shutdown \
downgrade-1.0 force-response-1.0
BrowserMatch "MSIE [17-9]" ssl-unclean-shutdown
# HSTS (mod_headers is required) (15768000 seconds = 6 months)
Header always set Strict-Transport-Security "max-age=15768000"
</VirtualHost>
</IfModule>
As you can see, the mod_proxy module is used to perform load balancing, with cookies providing sticky sessions.
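This virtual host relies on several Apache modules (SSL, rewriting, header manipulation, the proxy family and the balancer's bytraffic scheduler). On a Debian/Ubuntu layout they could be enabled as sketched below; the commands print what needs to be run as root, and paths and tooling will differ on other distributions:

```shell
#!/bin/sh
# Print the module-enabling commands for a Debian/Ubuntu system.
for mod in ssl rewrite headers proxy proxy_http proxy_balancer lbmethod_bytraffic remoteip; do
  echo "a2enmod $mod"
done
echo "systemctl reload apache2"
```

After enabling the modules, `apachectl configtest` is a good way to verify the virtual host syntax before reloading.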