uWSGI, gevent and pymongo 3 threads mayhem

This is a quick heads-up post about a behaviour change when running a gevent-based application using the new pymongo 3 driver under uWSGI and its gevent loop.

I was naturally curious to test this brand new major update of the Python driver for MongoDB, so I played it dumb : upgrade and give it a try on our existing code base.

The first thing I noticed instantly is that a vast majority of our applications were suddenly unable to reload gracefully and were force killed by uWSGI after some time !

worker 1 (pid: 9839) is taking too much time to die...NO MERCY !!!

uWSGI’s gevent-wait-for-hub

All our applications must be able to be gracefully reloaded at any time. Some of them spawn quite a few greenlets on their own, so as an added measure to make sure we never lose any running greenlet, we use the gevent-wait-for-hub option, which is described as follows :

wait for gevent hub's death instead of the control greenlet

… which does not say much on its own, but is explained in a previous uWSGI changelog :

During shutdown only the greenlets spawned by uWSGI are taken in account,
and after all of them are destroyed the process will exit.

This is different from the old approach where the process wait for
ALL the currently available greenlets (and monkeypatched threads).

If you prefer the old behaviour just specify the option gevent-wait-for-hub

pymongo 3

Compared to the previous 2.x versions, one of the key aspects of the new pymongo 3 driver is its intensive use of threads to handle server discovery and connection pools.

Now we can relate this very fact to the gevent-wait-for-hub behaviour explained above :

the process wait for ALL the currently available greenlets
(and monkeypatched threads)

This explains why our applications were hanging until uWSGI's reload-mercy (force kill) timeout kicked in !
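
To make the mechanism concrete, here is a minimal sketch using only the standard library. It is an analogy, not uWSGI's or pymongo's actual code: monitor, wait_for_threads and the mercy parameter are hypothetical names, a plain thread stands in for a driver background thread, and a timed join stands in for the reload-mercy timeout.

```python
import threading
import time


def monitor(stop_event, interval=0.05):
    """Stand-in for a driver background thread: loops until told to stop."""
    while not stop_event.is_set():
        stop_event.wait(interval)


def wait_for_threads(threads, mercy):
    """Join every thread, giving up once `mercy` seconds have elapsed.

    Returns True if all threads ended in time, False if the caller
    would have to fall back to a force kill.
    """
    deadline = time.monotonic() + mercy
    for thread in threads:
        remaining = max(0.0, deadline - time.monotonic())
        thread.join(remaining)
    return not any(thread.is_alive() for thread in threads)


def demo():
    stop = threading.Event()
    background = threading.Thread(target=monitor, args=(stop,), daemon=True)
    background.start()

    # Nobody asked the background thread to stop, so the wait times out.
    hung = not wait_for_threads([background], mercy=0.2)

    # Once the thread is told to stop (the analogue of closing the client),
    # the same wait returns promptly.
    stop.set()
    clean = wait_for_threads([background], mercy=1.0)
    return hung, clean
```

Calling demo() shows both situations: a wait that runs out of mercy because the background thread never ends by itself, then a clean wait once the thread has been asked to stop.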

conclusion

When using pymongo 3 with the gevent-wait-for-hub option, keep in mind that all of pymongo’s threads (which are monkey patched threads) are considered active greenlets, so uWSGI will wait for them to terminate before it recycles the worker !

Two options come to mind to handle this properly :

  1. stop using the gevent-wait-for-hub option and change your code to use a gevent pool Group to make sure that all of your important greenlets are taken care of when a graceful reload happens (this is how we do it today, using the gevent-wait-for-hub option was just overprotective for us).
  2. modify your code to properly close all your pymongo connections on graceful reloads.
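
Both options can be sketched together as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: important, spawn_important and graceful_shutdown are hypothetical names, and wiring graceful_shutdown into whatever hook your application runs on a graceful reload is left out because it depends on your setup. The only library calls used are gevent.pool.Group (spawn, join) and a close() method on each client, which pymongo's MongoClient provides.

```python
from gevent.pool import Group

# Hypothetical module-level group: every greenlet that must not be lost
# across a graceful reload is spawned through it.
important = Group()


def spawn_important(func, *args, **kwargs):
    """Spawn a greenlet that the shutdown step below will wait for."""
    return important.spawn(func, *args, **kwargs)


def graceful_shutdown(clients=(), timeout=10):
    """Wait for tracked greenlets, then close the database clients.

    `clients` is any iterable of objects exposing close(), for instance
    pymongo MongoClient instances. Returns True if every tracked
    greenlet finished within `timeout` seconds.
    """
    # Option 1: wait only for the greenlets we explicitly track.
    important.join(timeout=timeout)
    drained = len(important) == 0

    # Option 2: close the clients so their background threads stop.
    for client in clients:
        client.close()

    return drained
```

With something like this in place, application code would call spawn_important(...) instead of gevent.spawn(...) for work that has to complete, and the reload path would call graceful_shutdown with the clients it created. Greenlets still running after the timeout are only reported here, not killed; whether to kill them is a design choice left to the application.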

Hope this will save some people the trouble of debugging this 😉

5 thoughts on “uWSGI, gevent and pymongo 3 threads mayhem”

  1. A. Jesse Jiryu Davis

    Hi Ultrabug, in PyMongo 3 I use two strategies to try to kill *threads* promptly. First, the threads only weak-reference the MongoClient, and if the MongoClient is garbage-collected the threads die soon after that:

    https://github.com/mongodb/mongo-python-driver/blob/3.0.2/pymongo/monitor.py#L67

    Since there’s no reference cycle, in CPython, deleting or dereferencing a MongoClient should cause threads to die within a few seconds. If they’re in the middle of a slow network operation they might not notice that they’re supposed to die until the network operation completes; if they’re sleeping they’ll awake and die immediately.

    Second, if you want to kill the threads even sooner than that, call MongoClient.close().

    Let me know if you have any questions, and feel free to open a bug at jira.mongodb.org in the “PYTHON” project if you see anything unexpected.

    1. ultrabug Post author

      Hi Jesse,

      Well, as a matter of fact, in the case explained in this article there is no garbage collection happening, since we’re waiting for the active greenlets / threads to end by themselves (it’s a join() really).

      So yes that indeed confirms the behaviour described here and your two strategies validate the second point of the conclusion / workaround proposed !

      Cheers mate

  2. vineet

    Hello Ultrabug,

    We are also facing the same issue. Could you give us more code or an example overview of how you fixed it ?

    1. ultrabug Post author

      Hi vineet,

      You are facing the same issue using uWSGI and the gevent-wait-for-hub option ? If so, I just removed it.

      Or did you mean how I used the gevent pool Group() to handle my running greenlets and exit properly ?

