HA Redis clustering from scratch

Alexey Nizhegolenko
17 min read · Mar 9, 2021

Part One

Hello! In this series of articles we’ll focus on the most popular ways of creating highly available Redis setups and take a detailed look at the existing approaches to Redis clustering for production needs.

Redis is one of the most popular key-value databases today, and you can find it in a lot of projects, from small websites up to complex distributed applications. In this article, we’ll use the community Redis version, which, as you know, has some differences and restrictions compared to the enterprise version.

The main selling point of Redis-like databases is speed: they keep the data in RAM, so you get very fast read-write requests, but you also risk losing your data whenever the Redis host has issues or reboots. To prevent this, Redis allows replicating the database between a few hosts in a master-slave fashion and saving database snapshots to disk. Unfortunately, this simple replication still leaves you with manual recovery: loading data from the disk snapshots or switching your application to a slave node after the master goes down. That’s why the Redis community created more sophisticated ways of building HA Redis setups, such as Redis Sentinel and Redis Cluster. They can recover automatically after failures and provide a way to create really solid HA Redis clusters.

Of course, you’ll need some additional software alongside Redis to provide this highly available clustering, such as HAProxy and Keepalived, a cloud provider’s load-balancer service, or even a SaaS-level solution.

In the first part of this article, we’ll create a simple but genuinely highly available Redis cluster consisting of one master node and two slaves, managed by Redis Sentinel.

Redis Sentinel provides high availability for Redis. In practical terms this means that using Sentinel you can create a Redis deployment that resists without human intervention certain kinds of failures.

For this solution, we’ll need a minimum of three nodes in our cluster; you can use on-premise servers or virtual machines at any cloud provider, it’s up to you. In this setup, only one node will handle the read-write requests and take all the load; the other two nodes will sit in hot reserve and come into play if the master node experiences any serious problems.

Also, we’ll use Docker and Docker Compose to quickly set up all the needed software packages. Please note that we don’t use any container orchestration tools like Docker Swarm or Kubernetes in this first part of the article, and our Docker containers will use the host networking of the servers.

The one component that will run outside of Docker is the Keepalived package, which implements the first line of load balancing and gives us a floating IP address that can migrate between the cluster nodes, providing the fault tolerance of this solution.

The logical schema of this example is described below.

There will be three nodes with the same software packages but slightly different configurations. First of all, we need to place our nodes in the same L2 network segment so the floating IP can move between them using the Keepalived VRRP setup. Keepalived will also monitor the HAProxy service running on the same node and move the floating IP to another node if this service fails.

At any given time only one HAProxy service serves all requests and sends them to the master Redis node. HAProxy checks which Redis instance currently runs as master and routes all requests to this node only. If another Redis node becomes the master, HAProxy will quickly reroute all requests to it. The last part of this schema is Redis Sentinel: this service forms a quorum across the three nodes, monitors the Redis master state, and reconfigures a slave node into a master if the previous Redis master fails.

In the end, this setup provides an HA Redis implementation that can save you from losing important data and from dropping requests. This schema can survive the total loss of one of the cluster nodes; it will automatically promote a slave node to master and return the lost master node to the cluster as a slave. All of this happens fully automatically, so no manual attention is needed.

Now let’s start by creating the Redis and Redis Sentinel services and the HAProxy load balancer setup. These steps are the same for all three nodes, with only small changes in the configuration files of the Redis service. As I said, we’ll use Docker to run these services, as the easiest modern way of running and updating them, and we’ll use Docker Compose to simplify running all these parts together.

Needed steps:

  • Choose the subnet for your cluster; for this setup you’ll need to put all nodes in the same L2 network to be able to use a floating IP with Keepalived.
  • 192.168.0.0/24 will be the subnet for this example.
  • 192.168.0.1 will be the virtual VRRP floating IP, assigned to one node at a time; Keepalived will take care of that. This IP will also be the main entry point for all requests to this Redis cluster.
  • 192.168.0.1 was chosen just as an example; if you use the 192.168.0.0/24 subnet you can pick any other free IP instead (mind that .1 is often used as the gateway IP).
  • You’ll need to install Docker and Docker Compose on all three nodes (a minimal install sketch follows this list).
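If Docker and Docker Compose are not installed yet, one quick way to get them (a minimal sketch, assuming Debian/Ubuntu-based nodes; adjust the packages for your distribution) is:

node1# curl -fsSL https://get.docker.com | sh
node1# apt-get install -y docker-compose
node1# docker --version && docker-compose --version

Repeat the same commands on node2 and node3.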

If you use a cloud provider to create the nodes (instances) for the Redis services, it’s quite possible that you won’t need, or even won’t be able to use, Keepalived in your setup; instead, you’ll need to use the cloud provider’s existing load-balancer service to route requests to the HAProxy services in this cluster.

First, let’s create all directories on these three nodes:

node1# mkdir -p /opt/redis_ha/{redis,haproxy} && mkdir /opt/redis_ha/redis/data && cd /opt/redis_ha/
node2# mkdir -p /opt/redis_ha/{redis,haproxy} && mkdir /opt/redis_ha/redis/data && cd /opt/redis_ha/
node3# mkdir -p /opt/redis_ha/{redis,haproxy} && mkdir /opt/redis_ha/redis/data && cd /opt/redis_ha/

Now we need to create a docker-compose.yml file that describes the configuration of the HAProxy, Redis, and Sentinel Docker containers. This is the easiest way to configure Docker containers when you need to start more than one container that logically belongs to the same application.

Create this config on all three nodes; the docker-compose file will be the same on all of them:

node1:/opt/redis_ha# vi docker-compose.yml

version: '2'

services:

  redis:
    image: redis:latest
    container_name: redis
    restart: always
    volumes:
      - /opt/redis_ha/redis:/usr/local/etc/redis
      - /opt/redis_ha/redis/data:/data
    command: redis-server /usr/local/etc/redis/redis.conf
    network_mode: "host"

  sentinel:
    image: redis:latest
    container_name: sentinel
    restart: always
    volumes:
      - /opt/redis_ha/redis:/usr/local/etc/redis
    command: redis-server /usr/local/etc/redis/sentinel.conf --sentinel
    network_mode: "host"
    depends_on:
      - redis

  haproxy:
    image: haproxy:2.3
    container_name: haproxy
    restart: always
    volumes:
      - /opt/redis_ha/haproxy:/usr/local/etc/haproxy
    network_mode: "host"
    depends_on:
      - redis

This is a very simple configuration that defines three containers: the Redis server, the Redis Sentinel service, and the HAProxy load balancer. We’re using the host network mode, which exposes the container ports directly on the host IPs, and we also define a few volumes that will hold the containers’ configuration files and data.

In the /opt/redis_ha/redis and /opt/redis_ha/haproxy directories we’ll put the Redis, Sentinel, and HAProxy configuration files. In /opt/redis_ha/redis/data Redis will store the RDB database file when needed.
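Later, once the stack is started, you can verify that the host networking mode works as expected and the services listen directly on the host (a quick check, assuming the iproute2 ss utility is available on the nodes):

node1# ss -lntp | grep -E '6379|26379|6380'

You should see redis-server listening on ports 6379 and 26379, and haproxy bound to the floating IP on port 6380.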

Now let’s create the Redis configuration file; the first node will be the master from the beginning:

node1# cd /opt/redis_ha/redis
node1:/opt/redis_ha/redis# vi redis.conf
# Main part
bind 192.168.0.2
requirepass somepassword
masterauth somepassword
protected-mode yes
port 6379
# Small tuning
tcp-keepalive 0
maxmemory 5gb
maxmemory-policy volatile-lru

This is a very basic but fully sufficient configuration; you’ll need to set your own password and adjust the maxmemory parameter.
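If you want to double-check these values once the Redis container is running (a few steps below), you can query them through redis-cli; this is just a sketch using the same password as in the config above:

node1# docker run -it --rm redis redis-cli -h 192.168.0.2 -a somepassword config get 'maxmemory*'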

The Redis configuration file for the second node will be:

node2# cd /opt/redis_ha/redis
node2:/opt/redis_ha/redis# vi redis.conf
bind 192.168.0.3
requirepass somepassword
masterauth somepassword
protected-mode yes
port 6379
slaveof 192.168.0.2 6379
# Small tuning
tcp-keepalive 0
maxmemory 5gb
maxmemory-policy volatile-lru

Please note that the passwords and ports must be the same in all three Redis configurations; in fact, the other two nodes differ from the master node only in two parameters: the slaveof directive and the bind IP address.
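A small side note: since Redis 5 the slaveof directive has a preferred, equivalent alias called replicaof, and the 6.x image used here accepts both spellings, so the same line could also be written as:

replicaof 192.168.0.2 6379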

OK, and now the third Redis node config:

node3# cd /opt/redis_ha/redis
node3:/opt/redis_ha/redis# vi redis.conf
bind 192.168.0.4
requirepass somepassword
masterauth somepassword
protected-mode yes
port 6379
slaveof 192.168.0.2 6379
# Small tuning
tcp-keepalive 0
maxmemory 5gb
maxmemory-policy volatile-lru

We’re done with the Redis configuration, and now we need to create three Sentinel configs, one per node. For proper operation Sentinel needs at least three nodes, so it can still form a quorum when one node goes down.

In the same /opt/redis_ha/redis directory on each node please create a sentinel.conf file with these parameters:

node1# cd /opt/redis_ha/redis
node1:/opt/redis_ha/redis# vi sentinel.conf
port 26379
sentinel monitor master1 192.168.0.2 6379 2
sentinel auth-pass master1 somepassword
sentinel down-after-milliseconds master1 3000
sentinel failover-timeout master1 6000
sentinel parallel-syncs master1 1
protected-mode no

Repeat this on the other two nodes; the sentinel.conf must be exactly the same on all of them.

You can find more detailed information about these parameters in the Redis Sentinel documentation.

You can choose any name for the monitored master (master1 here), but keep in mind that this name must be the same in all three Sentinel configs. In our configuration we also set a quorum of two nodes for declaring the master down and electing a new one, and we set the timeouts used to decide that the master node has failed.

We also need to make the directory with the Redis configs writable for the Sentinel service. Sentinel needs to rewrite the Redis configuration files and its own config when it reconfigures the cluster after issues with the master node.

On all of the three nodes run:

node1# chmod -R 777 /opt/redis_ha/redis
node2# chmod -R 777 /opt/redis_ha/redis
node3# chmod -R 777 /opt/redis_ha/redis

Now that we’re done with the Redis and Sentinel configs, we need to set a few sysctl parameters on all three nodes; these parameters allow the HAProxy service to bind to a floating IP that is not currently assigned to the host.

Add the following lines to the end of the /etc/sysctl.conf file on your servers:

node1# vi /etc/sysctl.conf
net.ipv4.ip_forward=1
net.ipv4.ip_nonlocal_bind=1
node2# vi /etc/sysctl.conf
net.ipv4.ip_forward=1
net.ipv4.ip_nonlocal_bind=1
node3# vi /etc/sysctl.conf
net.ipv4.ip_forward=1
net.ipv4.ip_nonlocal_bind=1

And apply them by running:

node1# sysctl -p
node2# sysctl -p
node3# sysctl -p

OK, let’s create the configuration for the HAProxy services; all three configs will be exactly the same on all nodes:

node1:/opt/redis_ha/haproxy# vi haproxy.cfg

defaults REDIS
  mode tcp
  timeout connect 3s
  timeout server 3s
  timeout client 3s

frontend ft_redis
  mode tcp
  bind 192.168.0.1:6380
  default_backend bk_redis

backend bk_redis
  mode tcp
  option tcp-check
  tcp-check send AUTH\ somepassword\r\n
  tcp-check expect string +OK
  tcp-check send PING\r\n
  tcp-check expect string +PONG
  tcp-check send info\ replication\r\n
  tcp-check expect string role:master
  tcp-check send QUIT\r\n
  tcp-check expect string +OK
  server redis_node1 192.168.0.2:6379 maxconn 4096 check inter 3s
  server redis_node2 192.168.0.3:6379 maxconn 4096 check inter 3s
  server redis_node3 192.168.0.4:6379 maxconn 4096 check inter 3s

As I mentioned previously, the HAProxy service runs on all nodes, but only one of them serves requests at a time: the one whose node currently holds the floating IP assigned by Keepalived.

As you know, only the Redis master node can accept write requests, so our HAProxy checks which node currently has the master role and redirects all requests to that node only. Also, note that we expose port 6380 as the main port of our Redis cluster, so all application requests must be sent to that port and not to 6379.

If the master node goes offline and Sentinel elects and reconfigures one of the slave nodes as the new master, HAProxy will detect this and start sending requests to the new master.
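If you want to see by hand what the HAProxy health check is looking for, you can ask each Redis node for its replication role directly once the containers are up (next step); only the current master reports role:master, while the replicas report role:slave:

node1# docker run -it --rm redis redis-cli -h 192.168.0.2 -p 6379 -a somepassword info replication | grep role
node1# docker run -it --rm redis redis-cli -h 192.168.0.3 -p 6379 -a somepassword info replication | grep role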

Now we need to create the same HAProxy configuration file on the other two nodes.

Once we’ve finished creating the configuration files on all three nodes, we’re ready to start the main part of the cluster.

To do this, cd into the /opt/redis_ha directory and start the services with the docker-compose up -d command, first on node1 and then on node2 and node3:

node1:/opt/redis_ha/# docker-compose up -d
Creating redis
Creating haproxy
Creating sentinel
node2:/opt/redis_ha/# docker-compose up -d
Creating redis
Creating haproxy
Creating sentinel
node3:/opt/redis_ha/# docker-compose up -d
Creating redis
Creating haproxy
Creating sentinel

Now check that all three containers show the Up state on all cluster nodes:

node1:/opt/redis_ha/# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
85116fdabe24 haproxy:2.3 "docker-entrypoint.s…" About a minute ago Up About a minute haproxy
c52354fab87e redis:latest "docker-entrypoint.s…" About a minute ago Up About a minute sentinel
da32da2e0566 redis:latest "docker-entrypoint.s…" About a minute ago Up About a minute redis
node2:/opt/redis_ha/# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d6c415df3b48 haproxy:2.3 "docker-entrypoint.s…" About a minute ago Up 58 seconds haproxy
a55fe4770a33 redis:latest "docker-entrypoint.s…" About a minute ago Up 58 seconds sentinel
cc8bf8c2e836 redis:latest "docker-entrypoint.s…" About a minute ago Up About a minute redis
node3:/opt/redis_ha/# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4dded73ef806 redis:latest "docker-entrypoint.s…" 58 seconds ago Up 56 seconds sentinel
35b827360329 haproxy:2.3 "docker-entrypoint.s…" 58 seconds ago Up 56 seconds haproxy
e46cf2ef4053 redis:latest "docker-entrypoint.s…" 59 seconds ago Up 58 seconds redis

Once we’re sure that all containers started and are working fine, we can check the logs of the Redis and Sentinel containers to verify that our master-slave configuration works as expected.

Check the Redis container logs on the first node, which we configured as the master from the start:

node1# docker logs redis  
1:C 06 Mar 2021 09:55:32.638 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 06 Mar 2021 09:55:32.638 # Redis version=6.2.0, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 06 Mar 2021 09:55:32.638 # Configuration loaded
1:M 06 Mar 2021 09:55:32.639 * monotonic clock: POSIX clock_gettime
1:M 06 Mar 2021 09:55:32.640 * Running mode=standalone, port=6379.
1:M 06 Mar 2021 09:55:32.640 # Server initialized
1:M 06 Mar 2021 09:55:32.641 * Loading RDB produced by version 6.2.0
1:M 06 Mar 2021 09:55:32.642 * RDB age 234746 seconds
1:M 06 Mar 2021 09:55:32.642 * RDB memory usage when created 1.96 Mb
1:M 06 Mar 2021 09:55:32.642 * DB loaded from disk: 0.001 seconds
1:M 06 Mar 2021 09:55:32.642 * Ready to accept connections
1:M 06 Mar 2021 09:55:57.701 * Replica 192.168.0.3:6379 asks for synchronization
1:M 06 Mar 2021 09:55:57.701 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'f0542da5093ec0603f4896e0a84b06d1bd2642fc', my replication IDs are '7eb1071
312fc229c2b49be83d09c47aaff7f08ec' and '0000000000000000000000000000000000000000')
1:M 06 Mar 2021 09:55:57.701 * Replication backlog created, my new replication IDs are 'ba07b3ecd299c54b459d17343cdc67b0d9d3cc8a' and '0000000000000000000000000000000000000000'
1:M 06 Mar 2021 09:55:57.701 * Starting BGSAVE for SYNC with target: disk
1:M 06 Mar 2021 09:55:57.702 * Background saving started by pid 18
18:C 06 Mar 2021 09:55:57.788 * DB saved on disk
18:C 06 Mar 2021 09:55:57.789 * RDB: 0 MB of memory used by copy-on-write
1:M 06 Mar 2021 09:55:57.792 * Background saving terminated with success
1:M 06 Mar 2021 09:55:57.792 * Synchronization with replica 192.168.0.3:6379 succeeded
1:M 06 Mar 2021 09:56:04.156 * Replica 192.168.0.4:6379 asks for synchronization
1:M 06 Mar 2021 09:56:04.156 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'f0542da5093ec0603f4896e0a84b06d1bd2642fc', my replication IDs are 'ba07b3e
cd299c54b459d17343cdc67b0d9d3cc8a' and '0000000000000000000000000000000000000000')
1:M 06 Mar 2021 09:56:04.156 * Starting BGSAVE for SYNC with target: disk
1:M 06 Mar 2021 09:56:04.156 * Background saving started by pid 19
19:C 06 Mar 2021 09:56:04.225 * DB saved on disk
19:C 06 Mar 2021 09:56:04.226 * RDB: 0 MB of memory used by copy-on-write
1:M 06 Mar 2021 09:56:04.304 * Background saving terminated with success
1:M 06 Mar 2021 09:56:04.304 * Synchronization with replica 192.168.0.4:6379 succeeded

As we can see, the master Redis service started successfully, and the other two Redis slave replicas connected and synced with the master node.

We can also check the Sentinel logs on the first node:

node1# docker logs sentinel
1:X 06 Mar 2021 09:55:33.907 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:X 06 Mar 2021 09:55:33.907 # Redis version=6.2.0, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 06 Mar 2021 09:55:33.907 # Configuration loaded
1:X 06 Mar 2021 09:55:33.909 * monotonic clock: POSIX clock_gettime
1:X 06 Mar 2021 09:55:33.910 * Running mode=sentinel, port=26379.
1:X 06 Mar 2021 09:55:34.208 # Sentinel ID is 4b409d62a030bdab6f78c039a5d54be49f02981c
1:X 06 Mar 2021 09:55:34.208 # +monitor master master1 192.168.0.2 6379 quorum 2
1:X 06 Mar 2021 09:56:01.529 * +sentinel sentinel 77afe8d103d5218f37f791fa2b2176c36e1c077c 192.168.0.3 26379 @ master1 192.168.0.2 6379
1:X 06 Mar 2021 09:56:04.373 * +slave slave 192.168.0.3:6379 192.168.0.3 6379 @ master1 192.168.0.2 6379
1:X 06 Mar 2021 09:56:04.426 * +slave slave 192.168.0.4:6379 192.168.0.4 6379 @ master1 192.168.0.2 6379
1:X 06 Mar 2021 09:56:07.785 * +sentinel sentinel e707cf1b630d018e7fb57aa1b2adc9fd67d64859 192.168.0.4 26379 @ master1 192.168.0.2 6379

There we can see which master node is currently active, the other Sentinels joining to form the quorum that confirms the health and status of the 192.168.0.2 master node, and the information about the two slave nodes.
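Besides reading the logs, you can also query Sentinel directly on port 26379 (it runs with protected-mode no and no requirepass, so no password is needed); master1 is the name we used in sentinel.conf. The first command should return something like:

node1# docker run -it --rm redis redis-cli -h 192.168.0.2 -p 26379 sentinel get-master-addr-by-name master1
1) "192.168.0.2"
2) "6379"
node1# docker run -it --rm redis redis-cli -h 192.168.0.2 -p 26379 sentinel master master1

The first command prints the address of the current master, the second one prints the full state of the monitored master, including the quorum and the number of known slaves and sentinels.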

At this stage, we have a working master-slave Redis cluster fronted by the HAProxy service acting as a load balancer and health checker. As mentioned previously, HAProxy takes care of checking which Redis instance is currently the master and sends requests to that master only. All three HAProxy services have the same configuration and any of them can act as the load balancer; if one of them goes down, we can start using the next one. For this purpose we’ll add the Keepalived service, which creates a virtual floating IP address that can migrate between the cluster nodes if the HAProxy service, or even a whole node, goes offline. If that happens, the virtual IP 192.168.0.1 will be taken over by another Keepalived instance with a lower priority and placed on that node’s network interface, so all requests will reach the HAProxy on that node and the cluster will keep working.

We’ll install Keepalived as a system package on all three nodes.

You can also use a Docker image for it, but then you’ll need to configure the host systems in a specific way. Also, if you use a cloud provider for this setup, you’ll need to use the provider’s internal load-balancer solution instead of Keepalived.

node1# apt-get update && apt install -y keepalived
node2# apt-get update && apt install -y keepalived
node3# apt-get update && apt install -y keepalived

Then create the Keepalived configuration on the first node. We’ll assign the floating virtual IP to that node by default; as you remember, this node is also the Redis master.

node1# vi /etc/keepalived/keepalived.conf

vrrp_script chk_haproxy {
    script "/bin/pidof haproxy"
    interval 1
}

vrrp_instance VIP_1 {
    state MASTER
    interface eno1
    virtual_router_id 101
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass some-password
    }
    virtual_ipaddress {
        192.168.0.1
    }
    track_script {
        chk_haproxy
    }
}

The interesting part of this config is the vrrp_script block: this is the check that Keepalived uses to monitor the HAProxy service placed on the same node. If HAProxy goes down for any reason, Keepalived will migrate the virtual IP to the next node, and our cluster will continue handling requests without interruption. Combined with the Redis health checks configured in HAProxy, we get a very stable, fault-tolerant configuration.

OK, the last thing we need to do is create the remaining two Keepalived configs.

node2# vi /etc/keepalived/keepalived.conf

vrrp_script chk_haproxy {
    script "/bin/pidof haproxy"
    interval 1
}

vrrp_instance VIP_1 {
    state MASTER
    interface eno1
    virtual_router_id 101
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass some-password
    }
    virtual_ipaddress {
        192.168.0.1
    }
    track_script {
        chk_haproxy
    }
}

And on the third node.

node3# vi /etc/keepalived/keepalived.conf

vrrp_script chk_haproxy {
    script "/bin/pidof haproxy"
    interval 1
}

vrrp_instance VIP_1 {
    state MASTER
    interface eno1
    virtual_router_id 101
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass some-password
    }
    virtual_ipaddress {
        192.168.0.1
    }
    track_script {
        chk_haproxy
    }
}

Now start the Keepalived service:

node1# systemctl enable keepalived
systemctl start keepalived
node2# systemctl enable keepalived
systemctl start keepalived
node3# systemctl enable keepalived
systemctl start keepalived

Make sure the Keepalived services started OK; after that you should see the virtual IP assigned to the main network interface of node1.

node1# ip a
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 0b:n4:6s:7d:1b:0e brd ff:ff:ff:ff:ff:ff
inet 192.168.0.2/24 brd 192.168.0.255 scope global eno1
valid_lft forever preferred_lft forever
inet 192.168.0.1/32 scope global eno1
valid_lft forever preferred_lft forever
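Before moving on to the Redis-level failover tests, you can rehearse a pure load-balancer failure to see the vrrp_script check from the Keepalived config in action (a rough sketch):

node1# pidof haproxy                       # should print the PID of the containerized haproxy
node1# docker stop haproxy                 # simulate a load-balancer failure on node1
node2# ip a show eno1 | grep 192.168.0.1   # within a few seconds the VIP should show up here
node1# docker start haproxy                # bring node1 back, it will reclaim the VIP (higher priority)

Because the containers use host networking, the haproxy process is visible to pidof on the host, which is what makes this check script work.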

Finally, we’ve just created a fault-tolerant, highly available Redis installation with one master and two slave nodes. This setup will keep working even if you completely lose the master node: Sentinel will reconfigure one of the Redis slaves to become the new master, and Keepalived+HAProxy will keep sending requests to this new master with no data loss or other issues.

It’s time for a few tests of the new Redis setup. For this you can run a Redis CLI console with Docker on any machine and put a few test key:value pairs into the database:

laptop# docker run -it --rm redis redis-cli -a somepassword -h 192.168.0.1 -p 6380
192.168.0.1:6380> set key1 value1
OK

Now you can go to any of the Redis slaves and check that this key was also replicated to it.

laptop# docker run -it --rm redis redis-cli -a somepassword -h 192.168.0.3 -p 6379
192.168.0.3:6379> get key1
"value1"

Great, this key exists on the third node too. Now let’s check the stability of our cluster by stopping all Redis and Sentinel services on the master node.

Switch to node1 and stop the Docker containers, but first open a Redis CLI console to the cluster’s virtual IP from the laptop:

laptop# docker run -it --rm redis redis-cli -a somepassword -h 192.168.0.1 -p 6380
192.168.0.1:6380> info
.. .. some redis info bla bla bla ..
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.0.3,port=6379,state=online,offset=44055169,lag=1
slave1:ip=192.168.0.4,port=6379,state=online,offset=44055029,lag=1
.. .. ...
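Alternatively, instead of a single info call, you can keep a small loop running from the laptop that prints the replication role every second; this makes the exact moment of the failover easy to spot (a sketch, assuming the watch utility is installed):

laptop# watch -n1 "docker run --rm redis redis-cli -a somepassword -h 192.168.0.1 -p 6380 info replication | grep -E 'role|^slave'"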

And while the Redis CLI console is open, go and stop all the Docker containers on node1:

node1:/opt/redis_ha/# docker-compose down
Stopping haproxy ... done
Stopping sentinel ... done
Stopping redis ... done
Removing haproxy ... done
Removing sentinel ... done
Removing redis ... done

This will stop and remove all Docker containers on that node, which means we’ve just lost the Redis master node together with its Sentinel and HAProxy services.

If you check the laptop, you’ll see that the Redis CLI console is still open and working; this means that Keepalived and HAProxy on another surviving node started serving the cluster requests. The switch to another node was so fast that our CLI was not disconnected at all.

Now we can check whether our test key1 is still available:

192.168.0.1:6380> get key1 
"value1"

Great, we didn’t lose any data, but can we still write to the cluster?

192.168.0.1:6380> set key2 value2
OK

Yep, we still have a fully working cluster, but now with only two nodes.

You can check the logs of the Redis and Sentinel containers on node2 and node3, and you’ll find that Sentinel reconfigured one of the slave nodes into the master role. You’ll also see that the remaining slave node is now replicating from this new master. The virtual IP has migrated to node2, as the HAProxy on node1 was stopped.
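If you prefer not to scroll through the logs, two quick checks from the surviving nodes confirm the same picture (a short sketch):

node2# ip a show eno1 | grep 192.168.0.1      # the floating IP is now on node2
node2# docker run --rm redis redis-cli -h 192.168.0.3 -p 26379 sentinel get-master-addr-by-name master1

Sentinel should now report 192.168.0.3 or 192.168.0.4 as the current master address instead of 192.168.0.2.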

To keep this article from getting too long, I’ll skip that log output here.

But what happens when our old Redis master comes back? Nothing terrible at all: Sentinel will bring the old master back into the cluster as a new slave, so the cluster will continue working with three nodes again.

Now let’s start the Docker containers on node1 again:

node1:/opt/redis_ha/# docker-compose up -d
Creating redis
Creating sentinel
Creating haproxy

And then check the replication status in the previously opened Redis CLI console on the laptop:

192.168.0.1:6380> info
.. .. some redis info bla bla bla ..
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.0.4,port=6379,state=online,offset=44263458,lag=0
slave1:ip=192.168.0.2,port=6379,state=online,offset=44263458,lag=0
.....

As you can see, the Redis on node1 that was previously the master has now been added as a slave. Reconnect to the Redis on node1 from the laptop and try to get key2, the key that was added while this node was down.

laptop# docker run -it --rm redis redis-cli -a somepassword -h 192.168.0.2 -p 6379
192.168.0.2:6379> get key2
"value2"

This means the old master has fully resynced from the new master and now runs as a slave.

Finally, we’ve got a stable, fault-tolerant Redis implementation with a master-slave setup. This cluster will keep working even when the Redis master node, or any other node in the cluster, goes offline. Keepalived+HAProxy take care of the network routing and keep the connections to the current master node stable in any situation.

You can run further stress tests with any tools you like to verify this.
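A quick built-in option is redis-benchmark from the same Redis image, pointed at the floating IP and the HAProxy port (the request count and concurrency below are just example values):

laptop# docker run -it --rm redis redis-benchmark -h 192.168.0.1 -p 6380 -a somepassword -t set,get -n 100000 -c 50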

For example, you can use the Locust load-testing tool.

Please keep in mind that this configuration provides stability and prevents data loss, but it doesn’t provide any real load balancing, as you still have only one Redis master node at a time. That topic, along with multi-master Redis Cluster setups using container orchestration tools, will be covered in the next part of this article.

To be continued…
