Tag Archives: pacemaker

Tuning pacemaker for large clusters

We’ve been running quite a lot of production clusters using pacemaker/corosync for a while. Some of them are large, handling more than 200 resources across multiple nodes, and we’ve exceeded some limits on pacemaker’s CIB size.

I thought I’d share how to tune your cluster to handle that many resources, since there are default limits on the IPC buffer size which can lead to problems when your resources (and thus your CIB) grow too much.

Hitting the IPC limit

When running a large cluster you may hit the following problem:

error: crm_ipc_prepare: Could not compress the message into less than the configured ipc limit (51200 bytes).Set PCMK_ipc_buffer to a higher value (2071644 bytes suggested)
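If you’d rather pull the suggested value out of a log line than read it by eye, a small helper along these lines should do. It is only a sketch: it assumes the message keeps the "(N bytes suggested)" wording shown above, and the function name suggested_ipc_buffer is mine, not something pacemaker ships.

```shell
# Print the "bytes suggested" figure found in a pacemaker IPC error line.
# Returns non-zero (and prints nothing) when the line carries no suggestion.
# NOTE: suggested_ipc_buffer is a hypothetical helper name.
suggested_ipc_buffer() {
    printf '%s\n' "$1" \
        | sed -n 's/.*(\([0-9][0-9]*\) bytes suggested).*/\1/p' \
        | grep .
}

line='error: crm_ipc_prepare: Could not compress the message into less than the configured ipc limit (51200 bytes).Set PCMK_ipc_buffer to a higher value (2071644 bytes suggested)'
suggested_ipc_buffer "$line"
```

You can feed it lines grepped from wherever your cluster logs end up; that location depends on your setup, so I left it out.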

Evaluating the buffer size

Have a look at the size of your current CIB:

# cibadmin -Ql > cib.xml
# ls -l cib.xml
# bzip2 cib.xml
# ls -l cib.xml.bz2

The CIB is compressed on the wire using bzip2, so compare the size of the compressed cib.xml.bz2 with the default IPC buffer size of 51200 bytes. That gives you the minimum PCMK_ipc_buffer value you need (take more just to be safe).
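The comparison can be scripted roughly as follows. This is a sketch under my own assumptions: the helper names and the safety factor of 2 are made up for illustration and are not upstream recommendations; only the 51200 default comes from the error message above.

```shell
# Default IPC buffer size quoted in the error message, in bytes.
DEFAULT_IPC_BUFFER=51200

# Print the bzip2-compressed size in bytes of a file, leaving the file untouched.
compressed_size() {
    bzip2 -c "$1" | wc -c | tr -d ' '
}

# Succeed when a size (bytes) exceeds the limit (defaults to DEFAULT_IPC_BUFFER).
needs_bigger_buffer() {
    [ "$1" -gt "${2:-$DEFAULT_IPC_BUFFER}" ]
}

# Print the size multiplied by a safety factor (default 2, an arbitrary choice).
suggest_buffer() {
    echo $(( $1 * ${2:-2} ))
}

# Usage on a cluster node (left commented out because it needs cibadmin):
#   cibadmin -Ql > cib.xml
#   size=$(compressed_size cib.xml)
#   needs_bigger_buffer "$size" && echo "PCMK_ipc_buffer=$(suggest_buffer "$size")"
```

Unlike the plain bzip2 call above, bzip2 -c writes to stdout, so cib.xml stays around if you want to inspect it afterwards.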

Setting the environment variables

On Gentoo Linux, you’ll have to create the /etc/env.d/90pacemaker file containing:

PCMK_ipc_type=shared-mem
PCMK_ipc_buffer=2071644
  • PCMK_ipc_buffer : you may need to increase this depending on your cluster size and needs
  • PCMK_ipc_type : the shared-mem one is the default now, other values are socket|posix|sysv

You will also need to set these environment variables in your .bashrc so that the crm CLI doesn’t break:

export PCMK_ipc_type=shared-mem
export PCMK_ipc_buffer=2071644
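To double-check that a given shell really carries the setting before you fire up crm, a check like this one can help. It is a minimal sketch and ipc_buffer_ok is a hypothetical helper name of mine. Keep in mind that on Gentoo, files under /etc/env.d only take effect after env-update and a fresh login (or re-sourcing /etc/profile).

```shell
# Succeed only if PCMK_ipc_buffer is set, purely numeric, and at least
# as large as the required size given as first argument (bytes).
# NOTE: ipc_buffer_ok is a hypothetical helper name.
ipc_buffer_ok() {
    case "${PCMK_ipc_buffer:-}" in
        ''|*[!0-9]*) return 1 ;;   # unset, empty, or not a plain number
    esac
    [ "$PCMK_ipc_buffer" -ge "$1" ]
}

if ipc_buffer_ok 2071644; then
    echo "PCMK_ipc_buffer looks fine: $PCMK_ipc_buffer"
else
    echo "PCMK_ipc_buffer is missing or too small in this shell"
fi
```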

Future

Finally, I wanted to let you know that the upcoming Pacemaker v1.1.11 should come with a feature which will allow the IPC layer to adjust the PCMK_ipc_buffer automagically!

Hopefully you won’t need this blog post for much longer 🙂

EDIT, Jan 16 2014

Following this blog post, I had a very interesting comment from @beekhof (lead dev of pacemaker):

beekhof> Ultrabug: regarding large clusters, the cib in 1.1.12 will be O(2) faster than 1.1.11.
Ultrabug> beekhof: that's great news mate ! when is it scheduled to be released ?
beekhof> 30th of Feb

pacemaker v1.1.10 & corosync v2.3.1

It has been more than 5 months since the last bump of pacemaker. I’m glad that @beekhof released the final pacemaker-1.1.10 and that the officially stable corosync got bumped to 2.3.1.

The changelogs are quite heavy so I won’t go into details about them, but they both have quite a nice bunch of bugfixes and compatibility features. That’s why I’m hoping we will soon be able to fix bug #429416 and drop the corosync hard mask. Hopefully some users such as @pvsa will give us some valuable feedback which will allow us to do it smoothly.

changelog

mongoDB and Pacemaker recent bumps

mongoDB 2.4.3

Yet another bugfix release: this new stable branch is surely one of the most quickly iterated I’ve ever seen. I guess we’ll wait a bit longer at work before migrating to 2.4.x.

pacemaker 1.1.10_rc1

This is the release of pacemaker we’ve been waiting for, fixing, among other things, the ACL problem which was introduced in 1.1.9. Andrew and others are working hard to get a proper 1.1.10 out soon, thanks guys.

Meanwhile, we (gentoo cluster herd) have been contacted by @Psi-Jack, who has offered his help to follow and keep some of our precious clustering packages up to date. I hope our work together will benefit everyone!

All of this is live on portage, enjoy.

Pacemaker vulnerability and v1.1.9 release

A security vulnerability (CVE-2013-0281) was found in pacemaker which allowed attackers to prevent your cluster from serving further CIB requests. Although this issue was quickly fixed by upstream, they didn’t add a new tag to pacemaker, so I asked Andrew Beekhof for one so I could take care of bug #457572. Gentoo users, here comes pacemaker-1.1.9!

important

While packaging and testing pacemaker-1.1.9, I ran into some weird permission issues which I debugged with @beekhof and @asalkeld (thx again guys). It turns out that when enabling ACL support on pacemaker, you now need to add root to the haclient group! The reason is that pacemaker now uses shared memory IPC sockets from libqb to communicate with corosync (on /dev/shm/).
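To see whether root is already in the group, something like the following should work on most Linux boxes. It is a sketch: user_in_group is a helper name I made up, and the gpasswd hint is just one common way of adding a user to a group, not necessarily what your distribution prefers.

```shell
# Succeed if the given user is a member of the given group.
# NOTE: user_in_group is a hypothetical helper; assumes group names without spaces.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -x "$2" >/dev/null
}

if user_in_group root haclient; then
    echo "root is already in haclient"
else
    echo "root is not in haclient; as root, try: gpasswd -a root haclient"
fi
```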

v1.1.9 changelog

  • corosync: Allow cman and corosync 2.0 nodes to use a name other than uname()
  • corosync: Use queues to avoid blocking when sending CPG messages
  • Drop per-user core directories
  • ipc: Compress messages that exceed the configured IPC message limit
  • ipc: Use queues to prevent slow clients from blocking the server
  • ipc: Use shared memory by default
  • lrmd: Support nagios remote monitoring
  • lrmd: Pacemaker Remote Daemon for extending pacemaker functionality outside corosync cluster.
  • pengine: Check for master/slave resources that are not OCF agents
  • pengine: Support a ‘requires’ resource meta-attribute for controlling whether it needs quorum, fencing or nothing
  • pengine: Support for resource container
  • pengine: Support resources that require unfencing before start

Since the main focus of the bump was to fix a security issue, I didn’t add the new nagios feature to the ebuild. If you’re interested in it, just say so and I’ll do my best to add it asap.

Clustering : corosync v1.4.3 & pacemaker v1.1.7 released

I’ve finally taken the time to take care of the corosync and pacemaker ebuilds. The new versions are now available in portage.

Corosync 1.4.3 (10/04/2012)

This is one of the last supported old stable releases of the Corosync Cluster Engine. FYI, I’ve also bumped the new corosync-2.0.0 version but it needs more testing before I hard-unmask it.

Pacemaker 1.1.7 (28/03/12)

This is a bug fix release of Pacemaker. See the changelog for details.

Special thanks to my fellow Gentoo Linux developer Kacper Kowalik (xarthisius) for his help on these bumps.