Tag Archives: keepalived

keepalived v1.2.11 & glusterfs v3.4.2

Quick post for two quick bumps related to clustering.

glusterfs-3.4.2

  • quite a lot of bug fixes and improvements
  • contains a backport for libgfapi support for integrating with NFS Ganesha
  • nfs/mount3: fix crash in subdir resolution

keepalived-1.2.11

  • autoconf: better libnl3 detection
  • Fix memory allocation for MD5 digest
  • Quite a few nice memory leak fixes in different components
  • vrrp: don't try to load ip_vs module when not needed
  • Pim van den Berg's work on libipvs-2.6 to sync with libipvs from ipvsadm 1.27
  • vrrp: extend ip parser to support default and default6
  • vrrp: fix/extend gratuitous ARP handling (multiple people reported issues where MASTER didn't recover properly after an outage because no gratuitous ARP was sent)
  • Multiple fixes to genhash
  • vrrp: fix vrrp socket sync while leaving FAULT state (old old bug here)
  • Full changelog here

keepalived v1.2.9

Another release, 3 months after the mighty 1.2.8. It seems like upstream has awakened !

highlights

  • Jonas Johansson fixed VRRP sync group by sending prio 0 when entering FAULT state. This fix sends prio 0 (VRRP_PRIO_STOP) when the VRRP router transitions from MASTER to FAULT state. This makes a sync group leave the MASTER state more quickly by notifying the backup router(s) instead of having them wait for a timeout.
  • Jonas Johansson fixed VRRP to honor preempt_delay setting on
    startup.
  • Jonas Johansson extended VRRP code for faster sync group
    transition.
  • Some nice bug fixes to unicast mode.

Full changelog here !

Latest cluster releases

Now that I’m back I’ve bumped some of the sys-cluster packages. Users of keepalived will be interested in this, since it had been more than a year since upstream last released a version.

keepalived-1.2.8

This is a big and long-awaited one. It features major enhancements, new features and bug fixes. The changelog is pretty huge but here are some quick points which I particularly liked (biased view warning) :

  • Revisited the whole code to use posix declaration style
  • Boon Ang fixed the comparison of primary IP addresses. If a router in the master state receives an advertisement with priority equal to the local priority, it must also compare the primary IP addresses (RFC 3768, section 6.4.3). The code handling this was comparing two IP addresses with different byte ordering, resulting in multiple routers in the master state. This patch resolves the problem by converting the local primary IP address to network byte order for the comparison.
  • Henrique Mecking fixed a memory leak in libipvs
  • Willy Tarreau and Ryan O’Hara added the ability to use VRRP over unicast. Unicast IP addresses may be specified for each VRRP instance with the ‘unicast_peer’ configuration keyword. When a VRRP instance has one or more unicast IP addresses defined, VRRP advertisements will be sent to each of those addresses. Unicast IP addresses may be either IPv4 or IPv6. If you are planning to use this option, make sure that no two IP addresses in the unicast_peer configuration block belong to the same router/box; otherwise duplicate packets will be generated at the receiving end.
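To illustrate the unicast_peer keyword described above, here is a minimal sketch of a VRRP instance using it. Only unicast_peer comes from the changelog; the instance name, interface, router id, priority and every address are made up for the example.

```
! Sketch only: names and addresses below are assumptions, not from the changelog
vrrp_instance VI_EXAMPLE {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    ! advertisements are sent to each of these addresses instead of multicast;
    ! each address must belong to a different box
    unicast_peer {
        10.0.0.2
        10.0.0.3
    }
    virtual_ipaddress {
        10.0.0.100
    }
}
```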

crmsh-1.2.6

Many bug fixes and better performance in this release. This is quite impressive, good work upstream !

corosync-2.3.2

This one is about supporting live config reloading and fixing high CPU usage when idle. See the release notes.

Soon to come

The resource-agents v3.9.6 and cluster-glue v1.0.12 should be released by their upstream pretty soon, stay tuned.

Using keepalived for a self-balancing cluster

Load balancing traffic between servers can sometimes lead to headaches depending on your topology and budget. Here I’ll discuss how to create a self-load-balanced cluster of web servers that distribute HTTP requests between themselves and serve them at the same time. Yes, this means that you don’t need dedicated load balancers !

I will not go into the details of how to configure your kernel for ipvsadm etc., since that is already well covered on the web; instead I’ll focus on the challenges and subtleties of achieving load balancing based only on the realservers themselves. I expect you, reader, to have a minimal knowledge of the terms and usage of ipvsadm and keepalived.

The setup

Let’s start with a scheme and some principles explaining our topology.

  • 3 web servers / realservers (you can do the same using 2)
  • Local subnet : 192.168.0.0/24
  • LVS forwarding method : DR (direct routing)
  • LVS scheduler : WRR (you can choose your own)
  • VIP : 192.168.0.254
  • Main interface for VIP : bond0
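For reference, the LVS side of this topology could be declared in keepalived.conf roughly as follows. This is a sketch under assumptions: the realserver addresses (192.168.0.1 to 192.168.0.3), the weights and the health check timings are invented for the example, while the VIP, the DR forwarding method and the WRR scheduler come from the list above.

```
! Sketch only: realserver addresses, weights and check timings are assumptions
virtual_server 192.168.0.254 80 {
    delay_loop 10
    lb_algo wrr
    lb_kind DR
    protocol TCP

    real_server 192.168.0.1 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
    real_server 192.168.0.2 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
    real_server 192.168.0.3 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
```

Since every node is both balancer and realserver, the same block would be present on all three servers.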

keepalived_dr

Let’s take a look at what happens, as this explains why the servers need to be configured in a rather special way.

black arrow / serving

  1. the master server (the one holding the VIP) receives an HTTP connection request
  2. the load balancing scheduler decides the master itself will serve this request
  3. the local web server handles the request and replies to the client

blue arrow / direct routing / serving

  1. the master server receives an HTTP connection request
  2. the load balancing scheduler decides the blue server should handle this request
  3. the HTTP packet is handed to the blue server as is (no modification is made to the packet)
  4. the blue server receives a packet whose destination IP is the VIP, but it doesn’t hold the VIP (tricky part)
  5. the blue server’s web server handles the request and replies directly to the client

IP configuration

Almost all the tricky part lies in what needs to be done to solve point #4 of the blue server example. Since we’re using direct routing, we need to configure all our servers so they accept packets directed to the VIP even if they don’t have it configured on their receiving interface.

The solution is to have the VIP configured on the loopback interface (lo) with a host scope on the keepalived BACKUP servers, while it is configured on the main interface (bond0) on the keepalived MASTER server. This is what is usually done when you use pacemaker and ldirectord with IPaddr2, but keepalived does not handle this kind of configuration natively.

We’ll use the notify_master and notify_backup directives of keepalived.conf to handle this :

notify_master /etc/keepalived/to_master.sh
notify_backup /etc/keepalived/to_backup.sh
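To show where these two directives live, here is a sketch of a matching VRRP instance. The instance name, virtual_router_id, priority and advert_int are assumptions chosen for the example; the VIP, the interface and the script paths are the ones used in this post.

```
! Sketch only: instance name, router id, priority and timing are assumptions
vrrp_instance VI_WEB {
    state BACKUP
    interface bond0
    virtual_router_id 51
    priority 100
    advert_int 1
    ! on the MASTER, keepalived itself puts the VIP on bond0
    virtual_ipaddress {
        192.168.0.254 dev bond0
    }
    ! called on each state transition to move the VIP on or off lo
    notify_master /etc/keepalived/to_master.sh
    notify_backup /etc/keepalived/to_backup.sh
}
```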

We’ll discuss a few problems to fix before detailing those scripts.

The ARP problem

Now some of you wise readers will wonder about the ARP cache corruption that happens when multiple hosts claim to own the same IP address on the same subnet. Let’s fix this problem now, as the kernel has a way of handling it properly. Basically we’ll ask the kernel not to advertise the server’s MAC address for the VIP under certain conditions, using the arp_ignore and arp_announce sysctls.

Add these lines to the sysctl.conf of your servers :

net.ipv4.conf.all.arp_ignore = 3
net.ipv4.conf.all.arp_announce = 2

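In short, arp_ignore = 3 makes the kernel ignore ARP requests for local addresses configured with scope host (our VIP on lo), and arp_announce = 2 makes it always pick the best local source address for its own ARP requests instead of the VIP. Read more about those parameters for the detailed explanation of those values.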
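To double check that a server actually runs with these values, a small helper along these lines may be handy. It is only a sketch: the function name check_arp_sysctl and the SYSCTL_ROOT override are my own inventions (the override just makes the function testable outside /proc/sys).

```shell
#!/bin/bash
# Sketch: report whether an ARP sysctl under conf/all has the expected value.
# SYSCTL_ROOT is a hypothetical override; it defaults to the real /proc/sys.
SYSCTL_ROOT="${SYSCTL_ROOT:-/proc/sys}"

check_arp_sysctl() {
    local name="$1" expected="$2"
    local file="${SYSCTL_ROOT}/net/ipv4/conf/all/${name}"
    local actual

    # Fail early if the value cannot be read at all
    if ! actual="$(cat "$file" 2>/dev/null)"; then
        echo "${name}: unreadable"
        return 1
    fi

    if [ "$actual" = "$expected" ]; then
        echo "${name}: ok (${actual})"
    else
        echo "${name}: expected ${expected}, got ${actual}"
        return 1
    fi
}

# Typical use on a node, with the values from this post:
#   check_arp_sysctl arp_ignore 3
#   check_arp_sysctl arp_announce 2
```

The function returns non-zero when a value is missing or different, so it can also be used in a provisioning or monitoring script.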

The IPVS synchronization problem

This is another problem arising from the fact that the load balancers are also acting as realservers. When keepalived starts, it spawns a synchronization process on the master and backup nodes so your load balancers’ IPVS tables stay in sync. This is needed for a fully transparent failover, as it keeps track of session persistence so that clients don’t get rebalanced when the master goes down. Well, this is the limitation of our setup : clients’ HTTP sessions served by the master node will fail if it goes down. But note that the same will happen to sessions on the other nodes, because we have to get rid of this synchronization to get our setup working. The reason is simple : IPVS table sync conflicts with the actual acceptance of the packet by our VIP configured on the loopback. The two mechanisms can’t coexist, so you’d better use this setup for stateless (API?) HTTP servers, or if you’re okay with this eventuality.

Final configuration

to_master.sh

#!/bin/bash

ip addr del 192.168.0.254/32 dev lo
ipvsadm --restore < /tmp/keepalived.ipvs

  1. drop the VIP from the loopback interface (keepalived will set it up on the main interface, bond0)
  2. restore the IPVS configuration

to_backup.sh

#!/bin/bash

ip addr add 192.168.0.254/32 scope host dev lo
ipvsadm --save > /tmp/keepalived.ipvs
ipvsadm --clear

  1. add the VIP to the loopback interface, scope host
  2. keep a copy of the IPVS configuration; if we become master again, we’ll need it back
  3. drop the local IPVS config so it doesn’t conflict with our own web serving
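To verify that the two scripts did their job after a transition, one can check which interface currently carries the VIP. The helper below is a sketch; vip_location is a hypothetical name, and it simply filters the one-line-per-address output of `ip -o -4 addr show`.

```shell
#!/bin/bash
# Sketch: print the interface(s) on which a given IPv4 address is configured.
# Expects `ip -o -4 addr show` style lines on stdin:
#   "<index>: <ifname>    inet <address>/<prefix> ..."
vip_location() {
    local vip="$1"
    # Field 2 is the interface name, field 4 is address/prefix
    awk -v vip="$vip" '{ split($4, a, "/"); if (a[1] == vip) print $2 }'
}

# Typical use on a live node (VIP taken from this post):
#   ip -o -4 addr show | vip_location 192.168.0.254
```

On a BACKUP node this should print lo, and on the MASTER it should print bond0. If it prints both on the same node, the loopback address was not removed when the node became master.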

Conclusion

Even though it offers some serious benefits, remember the main limitation of this setup : if the master fails, all sessions of your web servers will be lost. So use it mostly for stateless stuff, or if you’re okay with this. My setup and explanations may have some glitches; feel free to correct me if I’m wrong somewhere.