Distributed Caching

As of version 1.2, Ehcache can be used as a distributed cache.

The distribution feature is built using plugins. Ehcache comes with some default distribution plugins which should be suitable for most applications. Other plugins can be developed. Developers should see the source code in the distribution package for the fullly documented API to see how to do that.

Though not necessary to use distributed caching an insight into the design decisions used in ehcache may be helpful. See the Design of distributed ehcache page.

The rest of this section documents the distribution plugins which are bundled with ehcache.

The following concepts are central to cache distribution:

  • How do you know about the other caches that are in your cluster?
  • What form of communication will be used to distribute messages?
  • What is replicated? Puts, Updates, Expiries?
  • When is it replicated? Synchronous or asynchronous?

To set up distributed caching you need to configure a PeerProvider, a CacheManagerPeerListener, which is done globally for a CacheManager. For each cache that will operate distributed, you then need to add a cacheEventListener to propagate messages.

Suitable Element Types

Only Serializable Elements are suitable for replication.

Some operations, such as remove, work off Element keys rather than the full Element itself. In this case the operation will be replicated provided the key is Serializable, even if the Element is not.

Peer Discovery

Ehcache has the notion of a group of caches acting as a distributed cache. Each of the caches is a peer to the others. There is no master cache. How do you know about the other caches that are in your cluster? This problem can be given the name Peer Discovery.

Ehcache provides two mechanisms for peer discovery, just like a car: manual and automatic.

To use one of the built-in peer discovery mechanisms specify the class attribute of cacheManagerPeerProviderFactory as net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory in the ehcache.xml configuration file.

Automatic Discovery

Automatic discovery uses TCP multicast to establish and maintain a multicast group. It features minimal configuration and automatic addition to and deletion of members from the group. No a priori knowledge of the servers in the cluster is required. This is recommended as the default option.

Peers send heartbeats to the group once per second. If a peer has not been heard of for 5 seconds it is dropped from the group. If a new peer starts sending heartbeats it is admitted to the group.

Any cache within the configuration set up as replicated will be made available for discovery by other peers.

To set automatic peer discovery, specify the properties attribute of cacheManagerPeerProviderFactory as follows:

peerDiscovery=automatic multicastGroupAddress=multicast address | multicast host name multicastGroupPort=port

Example

Suppose you have two servers in a cluster. You wish to distribute sampleCache11 and sampleCache12. The configuration required for each server is identical:

Configuration for server1 and server2

 <cacheManagerPeerProviderFactory
 class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"

 properties="peerDiscovery=automatic, multicastGroupAddress=230.0.0.1,
 multicastGroupPort=4446"/>

Manual Peer Discovery

Manual peer configuration requires the IP address and port of each listener to be known. Peers cannot be added or removed at runtime. Manual peer discovery is recommended where there are technical difficulties using multicast, such as a router between servers in a cluster that does not propagate multicast datagrams. You can also use it to set up one way replications of data, by having server2 know about server1 but not vice versa.

To set manual peer discovery, specify the properties attribute of cacheManagerPeerProviderFactory as follows: peerDiscovery=manual rmiUrls=//server:port/cacheName, ...

The rmiUrls is a list of the cache peers of the server being configured. Do not include the server being configured in the list.

Example

Suppose you have two servers in a cluster. You wish to distribute sampleCache11 and sampleCache12. Following is the configuration required for each server:

Configuration for server1

 <cacheManagerPeerProviderFactory
 class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"

 properties="peerDiscovery=manual,
 rmiUrls=//server2:40001/sampleCache11|//server2:40001/sampleCache12"/>

Configuration for server2

 <cacheManagerPeerProviderFactory
 class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"

 properties="peerDiscovery=manual,
 rmiUrls=//server1:40001/sampleCache11|//server1:40001/sampleCache12"/>

Configuring a CacheManagerPeerListener

A CacheManagerPeerListener listens for messages from peers to the current CacheManager.

You configure the CacheManagerPeerListener by specifiying a CacheManagerPeerListenerFactory which is used to create the CacheManagerPeerListener using the plugin mechanism.

The attributes of cacheManagerPeerListenerFactory are:

  • class - a fully qualified factory class name * properties - comma separated properties having meaning only to the factory.

    Ehcache comes with a built-in RMI-based distribution system. The listener component is RMICacheManagerPeerListener which is configured using RMICacheManagerPeerListenerFactory. It is configured as per the following example:

    <cacheManagerPeerListenerFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
    
    properties="hostName=localhost, port=40001,
    socketTimeoutMillis=2000"/>

    Valid properties are:

  • hostName (optional) - the hostName of the host the listener is running on. Specify where the host is multihomed and you want to control the interface over which cluster messages are received.

    The hostname is checked for reachability during CacheManager initialisation.

    If the hostName is unreachable, the CacheManager will refuse to start and an CacheException will be thrown indicating connection was refused.

    If unspecified, the hostname will use InetAddress.getLocalHost().getHostAddress(), which corresponds to the default host network interface.

    Warning: Explicitly setting this to localhost refers to the local loopback of 127.0.0.1, which is not network visible and will cause no replications to be received from remote hosts. You should only use this setting when multiple CacheManagers are on the same machine.

  • port (mandatory) - the port the listener listens on.
  • socketTimeoutMillis (optional) - the number of seconds client sockets will wait when sending messages to this listener until they give up. By default this is 2000ms.

Configuring CacheReplicators

Each cache that will be distributed needs to set a cache event listener which then replicates messages to the other CacheManager peers. This is done by adding a cacheEventListenerFactory element to each cache's configuration.

    <!-- Sample cache named sampleCache2. -->
    <cache name="sampleCache2"
           maxElementsInMemory="10"
           eternal="false"
           timeToIdleSeconds="100"
           timeToLiveSeconds="100"
           overflowToDisk="false">
        <cacheEventListenerFactory class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
                                   properties="replicateAsynchronously=true, replicatePuts=true, replicateUpdates=true, replicateUpdatesViaCopy=false, replicateRemovals=true "/>
    </cache>

class - use net.sf.ehcache.distribution.RMICacheReplicatorFactory

The factory recognises the following properties:

  • replicatePuts=true | false - whether new elements placed in a cache are replicated to others. Defaults to true.
  • replicateUpdates=true | false - whether new elements which override an element already existing with the same key are replicated. Defaults to true.
  • replicateRemovals=true - whether element removals are replicated. Defaults to true.
  • replicateAsynchronously=true | false - whether replications are asyncrhonous (true) or synchronous (false). Defaults to true.
  • replicateUpdatesViaCopy=true | false - whether the new elements are copied to other caches (true), or whether a remove message is sent. Defaults to true.

To reduce typing if you want default behaviour, which is replicate everything in asynchronous mode, you can leave off the RMICacheReplicatorFactory properties as per the following example:

    <!-- Sample cache named sampleCache4. All missing RMICacheReplicatorFactory properties default to true -->
    <cache name="sampleCache4"
           maxElementsInMemory="10"
           eternal="true"
           overflowToDisk="false"
           memoryStoreEvictionPolicy="LFU">
        <cacheEventListenerFactory class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
    </cache>