We’ve been running quite a lot of production clusters using pacemaker/corosync for a while. Some of them are large, handling more than 200 resources across multiple nodes and we’ve exceeded some limits on pacemaker’s CIB size.
I thought I’d share how to tune your cluster to handle such a bunch of resources since there are some default limits on the IPC buffer size which can lead to problems when your resources (and thus CIB) grows too much.
Hitting the IPC limit
When running a large cluster you may hit the following problem :
error: crm_ipc_prepare: Could not compress the message into less than the configured ipc limit (51200 bytes).Set PCMK_ipc_buffer to a higher value (2071644 bytes suggested)
Evaluating the buffer size
Have a look at the size of your current CIB :
# cibadmin -Ql > cib.xml
# ls -l cib.xml
# bzip2 cib.xml
# ls -l cib.xml.bz2
The CIB is compressed on the wire using bzip2 so you have to compare the compressed cib.xml.bz2 with the IPC default buffer size of 51200 and you’ll find the sufficient PCMK_ipc_buffer value for you (take more just to be safe).
Setting the environment variables
On Gentoo Linux, you’ll have to create the /etc/env.d/90pacemaker file containing :
- PCMK_ipc_buffer : you may need to increase this depending on your cluster size and needs
- PCMK_ipc_type : the shared-mem one is the default now, other values are socket|posix|sysv
You will also need to set these env. vars in your .bashrc so that the crm CLI doesn’t break :
Finally, I wanted to let you know that the upcoming Pacemaker v1.1.11 should come with a feature which will allow the IPC layer to adjust the PCMK_ipc_buffer automagically !
Hopefully you shouldn’t need this blog post anymore pretty soon 🙂
EDIT, Jan 16 2014
Following this blog post, I had a very interesting comment from @beekhof (lead dev of pacemaker)
beekhof> Ultrabug: regarding large clusters, the cib in 1.1.12 will be O(2) faster than 1.1.11.
Ultrabug> beekhof: that's great news mate ! when is it scheduled to be released ?
beekhof> 30th of Feb