MongoDB 3.0 upgrade in production : step 3 hope

In our previous attempt to upgrade our production cluster to 3.0, we had to roll back from the WiredTiger engine on primary servers.

Since then, we switched back our whole cluster to 3.0 MMAPv1 which has brought us some better performances than 2.6 with no instability.

Production checklist

We decided to use this increase in performance to allow us some time to fulfil the entire production checklist from MongoDB, especially the migration to XFS. We’re slowly upgrading our servers kernels and resynchronising our data set after migrating from ext4 to XFS.

Ironically, the strong recommendation of XFS in the production checklist appeared 3 days after our failed attempt at WiredTiger… This is frustrating but gives some kind of hope.

I’ll keep on posting on our next steps and results.

Our hero WiredTiger Replica Set

While we were battling with our production cluster, we got a spontaneous major increase in the daily volumes from another platform which was running on a single Replica Set. This application is write intensive and very disk I/O bound. We were killing the disk I/O with almost a continuous 100% usage on the disk write queue.

Despite our frustration with WiredTiger so far, we decided to give it a chance considering that this time we were talking about a single Replica Set. We were very happy to see WiredTiger keep up to its promises with an almost shocking serenity.

Disk I/O went down dramatically, almost as if nothing was happening any more. Compression did magic on our disk usage and our application went Roarrr !

MongoDB 3.0 upgrade in production : step 2 failed

In my previous post regarding the migration of our production cluster to mongoDB 3.0 WiredTiger, we successfully upgraded all the secondaries of our replica-sets with decent performances and (almost, read on) no breakage.

Step 2 plan

The next step of our migration was to test our work load on WiredTiger primaries. After all, this is where the new engine would finally demonstrate all its capabilities.

  • We thus scheduled a step down from our 3.0 MMAPv1 primary servers so that our WiredTiger secondaries would take over.
  • Not migrating the primaries was a safety net in case something went wrong… And boy it went so wrong we’re glad we played it safe that way !
  • We rolled back after 10 minutes of utter bitterness.

The failure

After all the wait and expectation, I can’t express our level of disappointment at work when we saw that the WiredTiger engine could not handle our work load. Our application started immediately to throw 50 to 250 WriteConflict errors per minute !

Turns out that we are affected by this bug and that, of course, we’re not the only ones. So far it seems that it affects collections with :

  • heavy insert / update work loads
  • an unique index (or compound index)

The breakages

We also discovered that we’re affected by a weird mongodump new behaviour where the dumped BSON file does not contain the number of documents that mongodump said it was exporting. This is clearly a new problem because it happened right after all our secondaries switched to WiredTiger.

Since we have to ensure a strong consistency of our exports and that the mongoDB guys don’t seem so keen on moving on the bug (which I surely can understand) there is a large possibility that we’ll have to roll back even the WiredTiger secondaries altogether.

Not to mention that since the 3.0 version, we experience some CPU overloads crashing the entire server on our MMAPv1 primaries that we’re still trying to tackle before opening another JIRA bug…

Sad panda

Of course, any new major release such as 3.0 causes its headaches and brings its lot of bugs. We were ready for this hence the safety steps we took to ensure that we could roll back on any problem.

But as a long time advocate of mongoDB I must admit my frustration, even more after the time it took to get this 3.0 out and all the expectations that came with it.

I hope I can share some better news on the next blog post.

Gevent : SSL support fixed for python 2.7.9

Good news for gevent users blocked on python < 2.7.9 due to broken SSL support since python upstream dropped the private API _ssl.sslwrap that eventlet was using.

This issue was starting to get old and problematic since GLSA 2015-0310 but I’m happy to say that almost 6 hours after the gevent-1.0.2 release, it is already available on portage !

We were also affected by this issue at work so I’m glad that the tension between ops and devs this issue was causing will finally be over ;)

uWSGI, gevent and pymongo 3 threads mayhem

This is a quick heads-up post about a behaviour change when running a gevent based application using the new pymongo 3 driver under uWSGI and its gevent loop.

I was naturally curious about testing this brand new and major update of the python driver for mongoDB so I just played it dumb : update and give a try on our existing code base.

The first thing I noticed instantly is that a vast majority of our applications were suddenly unable to reload gracefully and were force killed by uWSGI after some time !

worker 1 (pid: 9839) is taking too much time to die...NO MERCY !!!

uWSGI’s gevent-wait-for-hub

All our applications must be able to be gracefully reloaded at any time. Some of them are spawning quite a few greenlets on their own so as an added measure of making sure we never loose any running greenlet we use the gevent-wait-for-hub option, which is described as follow :

wait for gevent hub's death instead of the control greenlet

… which does not mean a lot but is explained in a previous uWSGI changelog :

During shutdown only the greenlets spawned by uWSGI are taken in account,
and after all of them are destroyed the process will exit.

This is different from the old approach where the process wait for
ALL the currently available greenlets (and monkeypatched threads).

If you prefer the old behaviour just specify the option gevent-wait-for-hub

pymongo 3

Compared to its previous 2.x versions, one of the overall key aspect of the new pymongo 3 driver is its intensive usage of threads to handle server discovery and connection pools.

Now we can relate this very fact to the gevent-wait-for-hub behaviour explained above :

the process wait for ALL the currently available greenlets
(and monkeypatched threads)

This explained why our applications were hanging until the reload-mercy (force kill) timeout option of uWSGI hit the fan !

conclusion

When using pymongo 3 with the gevent-wait-for-hub option, you have to keep in mind that all of pymongo’s threads (so monkey patched threads) are considered as active greenlets and will thus be waited for termination before uWSGI recycles the worker !

Two options come in mind to handle this properly :

  1. stop using the gevent-wait-for-hub option and change your code to use a gevent pool group to make sure that all of your important greenlets are taken care of when a graceful reload happens (this is how we do it today, the gevent-wait-for-hub option usage was just over protective for us).
  2. modify your code to properly close all your pymongo connections on graceful reloads.

Hope this will save some people the trouble of debugging this ;)

MongoDB 3.0 upgrade in production : first steps

We’ve been running a nice mongoDB cluster in production for several years now in my company.

This cluster suits quite a wide range of use cases from very simple configuration collections to complex queried ones and real time analytics. This versatility has been the strong point of mongoDB for us since the start as it allows different teams to address their different problems using the same technology. We also run some dedicated replica sets for other purposes and network segmentation reasons.

We’ve waited a long time to see the latest 3.0 release features happening. The new WiredTiger storage engine hit the fan at the right time for us since we had reached the limits of our main production cluster and were considering alternatives.

So as surprising it may seem, it’s the first of our mongoDB architecture we’re upgrading to v3.0 as it has become a real necessity.

This post is about sharing our first experience about an ongoing and carefully planned major upgrade of a production cluster and does not claim to be a definitive migration guide.

Upgrade plan and hardware

The upgrade process is well covered in the mongoDB documentation already but I will list the pre-migration base specs of every node of our cluster.

  • mongodb v2.6.8
  • RAID1 spinning HDD 15k rpm for the OS (Gentoo Linux)
  • RAID10 4x SSD for mongoDB files under LVM
  • 64 GB RAM

Our overall philosophy is to keep most of the configuration parameters to their default values to start with. We will start experimenting with them when we have sufficient metrics to compare with later.

Disk (re)partitioning considerations

The master-get-all-the-writes architecture is still one of the main limitation of mongoDB and this does not change with v3.0 so you obviously need to challenge your current disk layout to take advantage of the new WiredTiger engine.

mongoDB 2.6 MMAPv1

Considering our cluster data size, we were forced to use our 4 SSD in a RAID10 as it was the best compromise to preserve performance while providing sufficient data storage capacity.

We often reached the limits of our I/O and moved the journal out of the RAID10 to the mostly idle OS RAID1 with no significant improvements.

mongoDB 3.0 WiredTiger

The main consideration point for us is the new feature allowing to store the indexes in a separate directory. So we anticipated the data storage consumption reduction thanks to the snappy compression and decided to split our RAID10 in two dedicated RAID1.

Our test layout so far is :

  • RAID1 SSD for the data
  • RAID1 SSD for the indexes and journal

Our first node migration

After migrating our mongos and config servers to 3.0, we picked our worst performing secondary node to test the actual migration to WiredTiger. After all, we couldn’t do worse right ?

We are aware that the strong suit of WiredTiger is actually about having the writes directed to it and will surely share our experience of this aspect later.

compression is bliss

To make this comparison accurate, we resynchronized this node totally before migrating to WiredTiger so we could compare a non fragmented MMAPv1 disk usage with the WiredTiger compressed one.

While I can’t disclose the actual values, compression worked like a charm for us with a gain ratio of 3,2 on disk usage (data + indexes) which is way beyond our expectations !

This is the DB Storage graph from MMS, showing a gain ratio of 4 surely due to indexes being in a separate disk now.

2015-05-07-115324_461x184_scrot

 

 

 

 

memory usage

As with the disk usage, the node had been running hot on MMAPv1 before the actual migration so we can compare memory allocation/consumption of both engines.

There again the memory management of WiredTiger and its cache shows great improvement. For now, we left the default setting which has WiredTiger limit its cache to half the available memory of the system. We’ll experiment with this setting later on.

2015-05-07-115347_459x177_scrot

 

 

 

 

connections

This I’m still not sure of the actual cause yet but the connections count is higher and more steady than before on this node.

2015-05-07-123449_454x183_scrot

First impressions

The node is running smooth for several hours now. We are getting acquainted to the new metrics and statistics from WiredTiger. The overall node and I/O load is better than before !

While all the above graphs show huge improvements there is no major change from our applications point of view. We didn’t expect any since this is only one node in a whole cluster and that the main benefits will also come from master node migrations.

I’ll continue to share our experience and progress about our mongoDB 3.0 upgrade.

py3status v2.4

I’m very pleased to announce this new release of py3status because it is by far the most contributed one with a total of 33 files changed, 1625 insertions and 509 deletions !

I’ll start by thanking this release’s contributors with a special mention for Federico Ceratto for his precious insights, his CLI idea and implementation and other modules contributions.

Thank you

  • Federico Ceratto
  • @rixx (and her amazing reactivity)
  • J.M. Dana
  • @Gamonics
  • @guilbep
  • @lujeni
  • @obb
  • @shankargopal
  • @thomas-

IMPORTANT

In order to keep a clean and efficient code base, this is the last version of py3status supporting the legacy modules loading and ordering, this behavior will be dropped on the next 2.5 version !

CLI commands

py3status now supports some CLI commands which allows you to get information about all the available modules and their documentation.

  • list all available modules

if you specify your own inclusion folder(s) with the -i parameter, your modules will be listed too !

$ py3status modules list
Available modules:
  battery_level          Display the battery level.
  bitcoin_price          Display bitcoin prices using bitcoincharts.com.
  bluetooth              Display bluetooth status.
  clementine             Display the current "artist - title" playing in Clementine.
  dpms                   Activate or deactivate DPMS and screen blanking.
  glpi                   Display the total number of open tickets from GLPI.
  imap                   Display the unread messages count from your IMAP account.
  keyboard_layout        Display the current keyboard layout.
  mpd_status             Display information from mpd.
  net_rate               Display the current network transfer rate.
  netdata                Display network speed and bandwidth usage.
  ns_checker             Display DNS resolution success on a configured domain.
  online_status          Display if a connection to the internet is established.
  pingdom                Display the latest response time of the configured Pingdom checks.
  player_control         Control music/video players.
  pomodoro               Display and control a Pomodoro countdown.
  scratchpad_counter     Display the amount of windows in your i3 scratchpad.
  spaceapi               Display if your favorite hackerspace is open or not.
  spotify                Display information about the current song playing on Spotify.
  sysdata                Display system RAM and CPU utilization.
  vnstat                 Display vnstat statistics.
  weather_yahoo          Display Yahoo! Weather forecast as icons.
  whoami                 Display the currently logged in user.
  window_title           Display the current window title.
  xrandr                 Control your screen(s) layout easily.
  • get available modules details and configuration
$ py3status modules details
Available modules:
  battery_level          Display the battery level.
                         
                         Configuration parameters:
                             - color_* : None means - get it from i3status config
                             - format : text with "text" mode. percentage with % replaces {}
                             - hide_when_full : hide any information when battery is fully charged
                             - mode : for primitive-one-char bar, or "text" for text percentage ouput
                         
                         Requires:
                             - the 'acpi' command line
                         
                         @author shadowprince, AdamBSteele
                         @license Eclipse Public License
                         ---
[...]

Modules changelog

  • new bluetooth module by J.M. Dana
  • new online_status module by @obb
  • new player_control module, by Federico Ceratto
  • new spotify module, by Pierre Guilbert
  • new xrandr module to handle your screens layout from your bar
  • dpms module activate/deactivate the screensaver as well
  • imap module various configuration and optimizations
  • pomodoro module can use DBUS notify, play sounds and be paused
  • spaceapi module bugfix for space APIs without ‘lastchange’ field
  • keyboard_layout module incorrect parsing of “setxkbmap -query”
  • battery_level module better python3 compatibility

Other highlights

Full changelog here.

  • catch daylight savings time change
  • ensure modules methods are always iterated alphabetically
  • refactor default config file detection
  • rename and move the empty_class example module to the doc/ folder
  • remove obsolete i3bar_click_events module
  • py3status will soon be available on debian thx to Federico Ceratto !

mongoDB 3.0.1

This is a quite awaited version bump coming to portage and I’m glad to announce it’s made its way to the tree today !

I’ll right away thank a lot Tomas Mozes and Darko Luketic for their amazing help, feedback and patience !

mongodb-3.0.1

I introduced quite some changes in this ebuild which I wanted to share with you and warn you about. MongoDB upstream have stripped quite a bunch of things out of the main mongo core repository which I have in turn split into ebuilds.

Major changes :

  • respect upstream’s optimization flags : unless in debug build, user’s optimization flags will be ignored to prevent crashes and weird behaviour.
  • shared libraries for C/C++ are not built by the core mongo respository anymore, so I removed the static-libs USE flag.
  • various dependencies optimization to trigger a rebuild of mongoDB when one of its linked dependency changes.

app-admin/mongo-tools

The new tools USE flag allows you to pull a new ebuild named app-admin/mongo-tools which installs the commands listed below. Obviously, you can now just install this package if you only need those tools on your machine.

  • mongodump / mongorestore
  • mongoexport / mongoimport
  • mongotop
  • mongofiles
  • mongooplog
  • mongostat
  • bsondump

app-admin/mms-agent

The MMS agent has now some real version numbers and I don’t have to host their source on Gentoo’s infra woodpecker. At the moment there is only the monitoring agent available, shall anyone request the backup one, I’ll be glad to add its support too.

dev-libs/mongo-c(xx)-driver

I took this opportunity to add the dev-libs/mongo-cxx-driver to the tree and bump the mongo-c-driver one. Thank you to Balint SZENTE for his insight on this.

mongoDB 2.6.8, 2.4.13 & the upcoming 3.0.0

I’m a bit slacking on those follow-up posts but with the upcoming mongoDB 3.x series and the recent new releases I guess it was about time I talked a bit about what was going on.

 mongodb-3.0.0_rcX

Thanks to the help of Tomas Mozes, we might get a release candidate version of the 3.0.0 version of mongoDB pretty soon in tree shall you want to test it on Gentoo. Feel free to contribute or give feedback in the bug, I’ll do my best to keep up.

What Tomas proposes matches what I had in mind so for now the plan is to :

  • split the mongo tools (mongodump/export etc) to a new package : dev-db/mongo-tools or app-admin/mongo-tools ?
  • split the MMS monitoring agent to its own package : app-admin/mms-monitoring-agent
  • have a look at the MMS backup agent and maybe propose its own package if someone is interested in this ?
  • after the first release, have a look at the MMS deployment automation to see how it could integrate with Gentoo

mongodb-2.6.8 & 2.4.13

Released 2 days ago, they are already on portage ! The 2.4.13 is mostly a security (SSL v3) and tiny backport release whereas the 2.6.8 fixes quite a bunch of bugs.

Please note that I will drop the 2.4.x releases when 3.0.0 hits the tree ! I will keep the latest 2.4.13 in my overlay if someone asks for it.

py3status v2.0

I’m very pleased to announce the release of py3status v2.0 which I’d like to dedicate to the person who’s behind all the nice improvements this release features : @tablet-mode !

His idea on issue #44 was to make py3status modules configurable. After some thoughts and merges of my own plans of development, we ended up with what I believe are the most ambitious features py3status provides so far.

Features

The logic behind this release is that py3status now wraps and extends your i3status.conf which allows all the following crazy features :

For all your i3bar modules i3status and py3status alike thanks to the new on_click parameter which you can use like any other i3status.conf parameter on all modules. It has never been so easy to handle click events !

This is a quick and small example of what it looks like :

# run thunar when I left click on the / disk info module
disk / {
    format = "/ %free"
    on_click 1 = "exec thunar /"
}
  • All py3status contributed modules are now shipped and usable directly without the need to copy them to your local folder. They also get to be configurable directly from your i3status config (see below)

No need to copy and edit the contributed py3status modules you like and wish to use, you can now load and configure them directly from your i3status.conf.

All py3status modules (contributed ones and user loaded ones) are now loaded and ordered using the usual syntax order += in your i3status.conf !

  • All modules have been improved, cleaned up and some of them got some love from contributors.
  • Every click event now triggers a refresh of the clicked module, even for i3status modules. This makes your i3bar more responsive than ever !

Contributors

  • @AdamBSteele
  • @obb
  • @scotte
  • @tablet-mode

Thank you

  • Jakub Jedelsky : py3status is now packaged on Fedora Linux.
  • All of you users : py3status has broken the 100 stars on github, I’m still amazed by this. @Lujeni’s prophecy has come true :)
  • I still have some nice ideas in stock for even more functionalities, stay tuned !