RethinkDB on Gentoo Linux

It was about time I added new packages to portage and I’m very glad they turned out to be RethinkDB and its python driver !

  • dev-db/rethinkdb
  • dev-python/python-rethinkdb

For those of you who have never heard about this database, I urge you to head over to their excellent website and have a good read.

Packaging RethinkDB

RethinkDB had been on my radar for quite a long time and when they finally got serious enough about high availability I also got serious about using it at work… and obviously “getting serious” + “work” means packaging it for Gentoo Linux 🙂

Quick notes on packaging for Gentoo Linux:

  • This is a C++ project so it feels natural and easy to grasp
  • The configure script already offers a way of using system libraries instead of the bundled ones which is in line with Gentoo’s QA policy
  • The only grey zone in the above statement is the web UI, which is shipped precompiled

RethinkDB has a few QA violations which the ebuild is addressing by modifying the sources:

  • There is a configure.default which tries to force some configure options
  • The configure script is missing options to avoid auto-installing some docs and init scripts
  • The build system does its best to guess the CXX compiler but it should offer an option to set it directly
  • The build system does not respect users’ CXXFLAGS and tries to force the usage of -O3

Getting our hands into RethinkDB

At work, we finally found the excuse to get our hands into RethinkDB when we challenged ourselves to develop a quiz game for our booth as a sponsor of EuroPython 2016.

It was a simple game where you were presented a question and four possible answers and you had 60 seconds to answer as many of them as you could. The trick is that we wanted to create an interactive game where the participant had to play on a tablet while the rest of the audience got to see who was currently playing and follow their score progression + their ranking for the day and the week in real time on another screen !

Another challenge in the creation of this game was that we only used technologies that were new to us and we even switched roles, so the backend python guys would be doing the frontend javascript and vice versa. The stack finally went like this :

  • Game quiz frontend : Angular2 (TypeScript)
  • Game questions API : Go
  • Real time scores frontend : Angular2 + autobahn
  • Real time scores API : python 3.5 asyncio + autobahn
  • Database : RethinkDB

As you can see from the stack, we chose RethinkDB for its main strength : real time updates pushed to the connected clients. The real time scores frontend and API were tied together using autobahn while the API consumed the changefeeds (real time updates coming from the database) and broadcast them to the frontend.
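
To give a concrete idea, here is a minimal sketch of what consuming a changefeed looks like with the python driver; the table and field names are made up for the example :

import rethinkdb as r

conn = r.connect(host='localhost', port=28015, db='quiz')

# changes() returns a changefeed cursor which blocks until the
# database pushes a new change matching our query
feed = r.table('scores').changes().run(conn)
for change in feed:
    # every change carries the old and new version of the document;
    # this is where we would broadcast it to the connected clients
    print(change['old_val'], change['new_val'])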

What we learnt about RethinkDB

  • We’re sure that we want to use it in production !
  • The ReQL query language is a pipeline so its syntax takes some getting used to (even more so when coming from mongoDB like us); it is as powerful as it can be disconcerting
  • Realtime changefeeds have limitations which are sometimes not so easy to understand or discover (especially the order_by / secondary index part)
  • Changefeed limitations are a constraint you have to take into account in your data modeling !
  • Changefeeds + order_by can do the ordering for you when using the include_offsets option, this is amazing (see the sketch after this list)
  • The administration web UI is awesome
  • Proper python 3.5 asyncio support is still not merged, this is a pain !
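
To illustrate that ordering point, here is a hedged sketch of an ordered changefeed; the table, secondary index and limit are made up for the example :

import rethinkdb as r

conn = r.connect(host='localhost', port=28015, db='quiz')

# follow the top 10 scores in real time; with include_offsets=True
# every change tells us at which position of the ordered result set
# it happened, so the ordering is done for us by the database
feed = r.table('scores') \
        .order_by(index=r.desc('score')) \
        .limit(10) \
        .changes(include_offsets=True) \
        .run(conn)

for change in feed:
    print(change['old_offset'], change['new_offset'], change['new_val'])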

Try it out

Now that you can emerge rethinkdb I encourage you to try this awesome database.

Be advised that the ebuild also provides a way of configuring your rethinkdb instance by running emerge --config dev-db/rethinkdb !

I’ll now try to get in touch with upstream to get Gentoo listed on their website.

py3status v3.0

Oh boy, this new version is so amazing in terms of improvements and contributions that it’s hard to sum it up !

Before going into more explanations I want to dedicate this release to tobes, whose contributions, hard work and patience have made this ambitious 3.0 possible : THANK YOU !

This is the graph of contributed commits since 2.9 just so you realise how much this version is thanks to him:
[graph of contributed commits since 2.9]

I can’t continue without also thanking Horgix, who started this madness by splitting the code base into modular files, and pydsigner for his everlasting contributions and code reviews !

The git stat since 2.9 also speaks for itself:

 73 files changed, 7600 insertions(+), 3406 deletions(-)

So what’s new ?

  • the monolithic code base has been split into modules, each responsible for one of the tasks py3status performs
  • major improvements in module output orchestration and execution, resulting in considerably reduced CPU consumption and improved i3bar responsiveness
  • refactoring of user notifications with added dbus support and rate limiting
  • improved modules error reporting
  • py3status can now survive an i3status crash and will try to respawn it
  • a new ‘container’ module output type gives the ability to group modules together
  • refactoring of the time and tztime modules brings support for all the time macros (%d, %Z etc)
  • support for stopping py3status and its modules when i3bar hide mode is used
  • refactoring of general, contribution and most noticeably modules documentation
  • more details in the rest of the changelog

Modules

Along with a cool list of improvements on the existing modules, these are the new modules:

  • new group module to cycle display of several modules (check it out, it’s insanely handy !)
  • new fedora_updates module to check for your Fedora packages updates
  • new github module to check a github repository and notifications
  • new graphite module to check metrics from graphite
  • new insync module to check your current insync status
  • new timer module to have a simple countdown displayed
  • new twitch_streaming module to check if a Twitch streamer is online
  • new vpn_status module to check your VPN status
  • new xrandr_rotate module to rotate your screens
  • new yandexdisk_status module to display Yandex.Disk status

Contributors

And of course thank you to all the others who made this version possible !

  • @egeskow
  • Alex Caswell
  • Johannes Karoff
  • Joshua Pratt
  • Maxim Baz
  • Nathan Smith
  • Themistokle Benetatos
  • Vladimir Potapev
  • Yongming Lai

py3status v2.9

py3status v2.9 is out with a good bunch of new modules, exciting improvements and fixes !

Thanks

This release is made of their work, thank you contributors !

  • @4iar
  • @AnwariasEu
  • @cornerman
  • Alexandre Bonnetain
  • Alexis ‘Horgix’ Chotard
  • Andrwe Lord Weber
  • Ben Oswald
  • Daniel Foerster
  • Iain Tatch
  • Johannes Karoff
  • Markus Weimar
  • Rail Aliiev
  • Themistokle Benetatos

New modules

  • arch_updates module, by Iain Tatch
  • deadbeef module to show current track playing, by Themistokle Benetatos
  • icinga2 module, by Ben Oswald
  • scratchpad_async module, by Johannes Karoff
  • wifi module, by Markus Weimar

Fixes and enhancements

  • Rail Aliiev implemented a flake8 check via travis-ci, we now have a new build-passing badge
  • fix: handle format_time tztime parameter thx to @cornerman, fix issue #177
  • fix: respect ordering of the ipv6 i3status module even on empty configuration, fix #158 as reported by @nazco
  • battery_level module: add multiple battery support, by 4iar
  • battery_level module: added formatting options, by Alexandre Bonnetain
  • battery_level module: added option hide_seconds, by Andrwe Lord Weber
  • dpms module: added color support, by Andrwe Lord Weber
  • spotify module: added format_down option, by Andrwe Lord Weber
  • spotify module: fixed color & playbackstatus check, by Andrwe Lord Weber
  • spotify module: workaround broken dbus, removed PlaybackStatus query, by christian
  • weather_yahoo module: support woeid, add more configuration parameters, by Rail Aliiev

What’s next ?

Some major core enhancements and code clean up are coming up thanks to @cornerman, @Horgix and @pydsigner. The next release will be faster than ever and even less CPU consuming !

Meanwhile, this 2.9 release is available on pypi and Gentoo portage, have fun !

Gentoo Linux on DELL XPS 13 9350

As I found little help about this online I figured I’d write a summary piece about my recent experience in installing Gentoo Linux on a DELL XPS 13 9350.

UEFI or MBR ?

This machine ships with an NVMe SSD so don’t think twice : UEFI is the only sane way to go.

BIOS configuration

I advise using the pre-installed Windows 10 to update the XPS to the latest BIOS (1.1.7 at the time of writing). Then you need to change some settings to boot and get the NVMe SSD discovered by the live CD.

  • Turn off Secure Boot
  • Set SATA Operation to AHCI (will break your Windows boot but who cares)

Live CD

Go for the latest SystemRescueCD (it’s Gentoo based, you won’t be lost) as it’s more up to date and supports booting on UEFI. Make it a live USB, for example using unetbootin with the ISO on a vfat formatted USB stick.

NVME SSD disk partitioning

We’ll be using GPT with UEFI. I found that using gdisk was the easiest. The disk itself is found at /dev/nvme0n1. Here is the partition table I used :

  • 500MB UEFI boot partition (type EF00)
  • 16GB swap partition
  • 60GB Linux root partition
  • 400GB home partition

The corresponding gdisk commands :

# gdisk /dev/nvme0n1

Command: o ↵
This option deletes all partitions and creates a new protective MBR.
Proceed? (Y/N): y ↵

Command: n ↵
Partition Number: 1 ↵
First sector: ↵
Last sector: +500M ↵
Hex Code: EF00 ↵

Command: n ↵
Partition Number: 2 ↵
First sector: ↵
Last sector: +16G ↵
Hex Code: 8200 ↵

Command: n ↵
Partition Number: 3 ↵
First sector: ↵
Last sector: +60G ↵
Hex Code: ↵

Command: n ↵
Partition Number: 4 ↵
First sector: ↵
Last sector: ↵ (for rest of disk)
Hex Code: ↵

Command: w ↵
Do you want to proceed? (Y/N): Y ↵

No WiFi on the live CD ? No panic

If your live CD is old (pre 4.4 kernel), the integrated Broadcom 4350 WiFi card won’t be available !

My trick was to use my Android phone connected to my local WiFi as a USB modem which was detected directly by the live CD.

  • get your Android phone connected on your local WiFi (unless you want to use your cellular data)
  • plug in your phone using USB to your XPS
  • on your phone, go to Settings / More / Tethering & portable hotspot
  • enable USB tethering

Running ip addr will show the network card enp0s20f0u2 (for me at least), then if no IP address is set on the card, just ask for one :

# dhcpcd enp0s20f0u2

Et voilà, you now have access to the internet.

Proceed with installation

The only thing to worry about is to format the UEFI boot partition as FAT32.

# mkfs.vfat -F 32 /dev/nvme0n1p1

Then follow the Gentoo handbook as usual for the next steps of the installation process until you arrive at the kernel and bootloader / GRUB part.

From this moment I can already say that NO, we won’t be using GRUB at all so don’t bother installing it. Why ? Because at the time of writing, the efi-64 support of GRUB was not working at all as it failed to discover the NVMe SSD on boot.

Kernel sources and consideration

The trick here is that we’ll set up the boot ourselves directly from the BIOS later, so we only need to build a standalone kernel (meaning one able to boot without an initramfs).

EDIT: as of Jan. 10 2016, kernel 4.4 is available on portage so you don’t need the patching below any more !

Make sure you install and use at least a 4.3.x kernel (4.3.3 at the time of writing). Add sys-kernel/gentoo-sources to your /etc/portage/package.keywords file if needed. If you have a 4.4 kernel available, you can skip patching it below.

Patching 4.3.x kernels for Broadcom 4350 WiFi support

To get the broadcom 4350 WiFi card working on 4.3.x, we need to patch the kernel sources. This is very easy to do thanks to Gentoo’s user patches support. Do this before installing gentoo-sources (or reinstall it afterwards).

This example is for gentoo-sources-4.3.3, adjust your version accordingly :

(chroot) # mkdir -p /etc/portage/patches/sys-kernel/gentoo-sources-4.3.3
(chroot) # cd /etc/portage/patches/sys-kernel/gentoo-sources-4.3.3
(chroot) # wget http://ultrabug.fr/gentoo/xps9350/0001-bcm4350.patch

When emerging the gentoo-sources package, you should see the patch being applied. Check that it worked by issuing :

(chroot) # grep BRCM_CC_4350 /usr/src/linux/drivers/net/wireless/brcm80211/brcmfmac/chip.c
case BRCM_CC_4350_CHIP_ID:

The resulting kernel module will be called brcmfmac, make sure to load it on boot by adding it in your /etc/conf.d/modules :

modules="brcmfmac"

EDIT: as of Jan. 7 2016, version 20151207 of linux-firmware ships with the needed files so you don’t need to download those any more !

Then we need to download the WiFi card’s firmware files which are not part of the linux-firmware package at the time of writing (20150012).

(chroot) # emerge '>=sys-kernel/linux-firmware-20151207'

# DO THIS ONLY IF YOU DON'T HAVE >=sys-kernel/linux-firmware-20151207 available !
(chroot) # cd /lib/firmware/brcm/
(chroot) # wget http://ultrabug.fr/gentoo/xps9350/BCM-0a5c-6412.hcd
(chroot) # wget http://ultrabug.fr/gentoo/xps9350/brcmfmac4350-pcie.bin

Kernel config & build

I used genkernel to build my kernel. I’ve made only a few adjustments, but these are the things to mind in this pre-built kernel :

  • support for the NVMe SSD added as builtin
  • it is builtin for ext4 only (other filesystems are not compiled in)
  • support for DM_CRYPT and LUKS ciphers for encrypted /home
  • the root partition is hardcoded in the kernel as /dev/nvme0n1p3 so if yours is different, you’ll need to change CONFIG_CMDLINE and compile it yourself (see the snippet after this list)
  • the CONFIG_CMDLINE above is needed because you can’t pass kernel parameters using UEFI so you have to hardcode them in the kernel itself
  • support for the Intel graphics DRM and framebuffer (there’s a kernel bug with Skylake CPUs which will spam the logs but it still works well)
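
For reference, this is the kind of kernel config fragment involved; a sketch assuming a root partition on /dev/nvme0n1p3 formatted as ext4, as in this setup :

CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="root=/dev/nvme0n1p3 rootfstype=ext4"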

Get the kernel config and compile it :

EDIT: updated kernel config to 4.4.4 with SD Card support.

(chroot) # mkdir -p /etc/kernels
(chroot) # cd /etc/kernels
(chroot) # wget http://ultrabug.fr/gentoo/xps9350/kernel-config-x86_64-4.4.4-gentoo
(chroot) # genkernel kernel

The proposed kernel config here is for gentoo-sources-4.4.4 so make sure to rename the file for your current version.

EDIT: if you need a newer kernel version you can also get my 4.5.1 kernel config here.

This kernel is far from perfect but it works very well so far : sound, webcam and suspend work smoothly !

make.conf settings for intel graphics

I can recommend using the following in your /etc/portage/make.conf :

INPUT_DEVICES="evdev synaptics"
VIDEO_CARDS="intel i965"

fstab for SSD

Don’t forget to make sure the noatime option is used in your fstab for / and /home !

/dev/nvme0n1p1    /boot    vfat    noauto,noatime    1 2
/dev/nvme0n1p2    none     swap    sw                0 0
/dev/nvme0n1p3    /        ext4    noatime           0 1
/dev/nvme0n1p4    /home    ext4    noatime           0 1

As pointed out by stefantalpalaru in the comments, it is recommended to schedule an SSD TRIM in your crontab once in a while, see the Gentoo Wiki on SSD for more details.

encrypted /home auto-mounted at login

I advise adding cryptsetup to your USE variable in /etc/portage/make.conf and then updating your @world with emerge -NDuq @world.

I assume you haven’t created your user yet, so your unmounted /home is empty. Make sure that :

  • your /dev/nvme0n1p4 home partition is not mounted
  • you removed the corresponding /home line from your /etc/fstab (we’ll configure pam_mount to get it auto-mounted on login)

AFAIK, the LUKS password you’ll set on the first slot when issuing luksFormat below should be the same as your user’s password !

(chroot) # cryptsetup luksFormat -s 512 /dev/nvme0n1p4
(chroot) # cryptsetup luksOpen /dev/nvme0n1p4 crypt_home
(chroot) # mkfs.ext4 /dev/mapper/crypt_home
(chroot) # mount /dev/mapper/crypt_home /home
(chroot) # useradd -m -G wheel,audio,video,plugdev,portage,users USERNAME
(chroot) # passwd USERNAME
(chroot) # umount /home
(chroot) # cryptsetup luksClose crypt_home

We’ll use sys-auth/pam_mount to manage the mounting of our /home partition when a user logs in successfully, so make sure you emerge pam_mount first, then configure the following files :

  • /etc/security/pam_mount.conf.xml (only line added is the volume one)
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE pam_mount SYSTEM "pam_mount.conf.xml.dtd">
<!--
	See pam_mount.conf(5) for a description.
-->

<pam_mount>

		<!-- debug should come before everything else,
		since this file is still processed in a single pass
		from top-to-bottom -->

<debug enable="0" />

		<!-- Volume definitions -->

<volume user="USERNAME" fstype="auto" path="/dev/nvme0n1p4" mountpoint="/home" options="fsck,noatime" />

		<!-- pam_mount parameters: General tunables -->

<!--
<luserconf name=".pam_mount.conf.xml" />
-->

<!-- Note that commenting out mntoptions will give you the defaults.
     You will need to explicitly initialize it with the empty string
     to reset the defaults to nothing. -->
<mntoptions allow="nosuid,nodev,loop,encryption,fsck,nonempty,allow_root,allow_other" />
<!--
<mntoptions deny="suid,dev" />
<mntoptions allow="*" />
<mntoptions deny="*" />
-->
<mntoptions require="nosuid,nodev" />

<!-- requires ofl from hxtools to be present -->
<logout wait="0" hup="no" term="no" kill="no" />


		<!-- pam_mount parameters: Volume-related -->

<mkmountpoint enable="1" remove="true" />


</pam_mount>
  • /etc/pam.d/system-auth (only lines added are the ones with pam_mount.so)
auth		required	pam_env.so 
auth		required	pam_unix.so try_first_pass likeauth nullok 
auth		optional	pam_mount.so
auth		optional	pam_permit.so

account		required	pam_unix.so 
account		optional	pam_permit.so

password	optional	pam_mount.so
password	required	pam_cracklib.so difok=2 minlen=8 dcredit=2 ocredit=2 retry=3 
password	required	pam_unix.so try_first_pass use_authtok nullok sha512 shadow 
password	optional	pam_permit.so

session		optional	pam_mount.so
session		required	pam_limits.so 
session		required	pam_env.so 
session		required	pam_unix.so 
session		optional	pam_permit.so

That’s it, easy heh ?! When you log in as your user, pam_mount will decrypt your home partition using your user’s password and mount it on /home !

UEFI booting your Gentoo Linux

The best (and weirdest ?) way I found to boot the installed Gentoo Linux and its kernel is to configure the UEFI boot directly from the XPS BIOS.

The idea is that the BIOS can read the files from the EFI boot partition since it is formatted as FAT32. All we have to do is create a new boot option from the BIOS and configure it to use the kernel file stored in the EFI boot partition.

  • reboot your machine
  • get on the BIOS (hit F2)
  • get on the General / Boot Sequence menu
  • click Add
  • set a name (like Gentoo 4.3.3) and find + select the kernel file (use the integrated file finder)

  • remove all unwanted boot options

  • save it and reboot

Your Gentoo kernel and OpenRC will be booting now !

Suggestions, corrections, enhancements ?

As I said, I wrote all this quickly to save some time for whoever it could help. I’m sure there are still a lot of improvements to be made so I’ll surely update this article later on.

uWSGI v2.0.12

It’s been a long time since I made a blog post about a uWSGI release, but this one is special to me because it contains some features I had asked a colleague of mine for.

For his first contributions to a big Open Source project, our fellow @shir0kamii added two features (spooler_get_task and -if-hostname-match) which were backported in this release and which we had needed at work for quite a long time : congratulations again 🙂

Highlights

  • official PHP 7 support
  • uwsgi.spooler_get_task API to easily read and inspect a spooler file from your code (see the sketch after this list)
  • -if-hostname-match regexp in uWSGI configuration files to allow a more flexible configuration based on the hostname of the machine
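
As a quick hedged illustration of the spooler feature, here is a sketch assuming spooler_get_task takes the path of a spool file and returns its content as a dict; the path below is made up :

import uwsgi  # only available inside a uWSGI process

# read and inspect a spooled task file from your code
task = uwsgi.spooler_get_task('/var/spool/uwsgi/uwsgi_spoolfile_on_host_1234')
if task:
    print(task)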

Of course, all of this is already available on Gentoo Linux !

Full changelog here as usual.

py3status v2.7

I’m more than two weeks late but I’m very glad to announce the release of py3status v2.7 which features a lot of interesting stuff !

For this release I want to salute the significant work and help of Daniel Foerster (@pydsigner), who discovered and fixed a bug in the event detection loop.

The result is greatly improved click event detection and bar update speed, with largely reduced CPU consumption and less code !

Highlights

  • major performance and click event detection improvements by Daniel Foerster
  • support of %z on time and tztime modules fixes #110 and #123 thx to @derekdreery and @olhotak
  • the %Z directive and any other failure in parsing the time and tztime modules’ format will result in using the i3status date output
  • add ethernet, wireless and battery _first_ instance detection and support. thx to @rekoil for reporting on IRC
  • i3status.conf parser handles configuration values with the = char

New modules

  • new rt module: display ongoing tickets from RT queues
  • new xsel module: display xsel buffers, by umbsublime
  • new window_title_async module, by Anon1234

Modules enhancements

  • battery_level module: major improvements, documentation, add format option, by Maxim Baz
  • keyboard_layout module: color customisation, add format option, by Ali Mousavi
  • mpd_status module: fix connection leak, by Thomas Sanchez
  • pomodoro module: implement format option and add additional display features, by Christoph Schober
  • spotify module: fix support for new versions, by Jimmy Garpehäll
  • spotify module: add support for colors output based on the playback status, by Sondre Lefsaker
  • sysdata module: trim spaces in `cpu_temp`, by beetleman
  • whatismyip module: change default check URL and make it configurable

Thanks !

Once again, thanks to all contributors listed above !

uhashring : consistent hashing in python

It’s been quite some time since I first wanted to use a consistent hashing based distribution of my workload in a rather big cluster at work. So when we finally reached the point where this became critical for some job processing I rushed to find out what python library I could use to implement this easily and efficiently.

I was surprised not to find a clear “winner” for such a library. The most “popular” one, named hash_ring, has rather unoptimized source code and is dead (old issues, no answers). Some others have stepped up since, but with no clear interest in contributions and almost no added features for real world applications.

So I packaged the ketama C library and its python binding on my overlay to get intimate with its algorithm. Then I started working on my own pure python library and released uhashring on PyPI and on Gentoo portage !

Features

  • instance-oriented usage so you can use your consistent hash ring object directly in your code (see advanced usage).
  • a lot of convenient methods to use your consistent hash ring in real world applications.
  • simple integration with other libs such as memcache through monkey patching.
  • all the missing functions in the libketama C python binding (which is not even available on PyPI).
  • another and more performant consistent hash algorithm if you don’t care about the ketama compatibility (see benchmark).
  • native pypy support.
  • tests of implementation, key distribution and ketama compatibility.
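
Here is a quick sketch of the basic usage; the node names and key are made up for the example :

from uhashring import HashRing

# create a consistent hash ring from our nodes
hr = HashRing(nodes=['factory 1', 'factory 2', 'factory 3', 'factory 4'])

# a given key will always map to the same node
# as long as the ring is not resized
print(hr.get_node('ah-993-xx'))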

WTF ?

Not so sure about hash tables and consistent hashing ? Read my consistent hashing 101 explanation attempt !

Consistent Hashing 101

When working on distributed systems, we often have to distribute some kind of workload on different machines (nodes) of a cluster so we have to rely on a predictable and reliable key/value mapping algorithm.

If you’re not sure about what that means, just consider the following questions when working on a cluster with a lot of machines (nodes):

  • how could I make sure that all the data for a given user always gets delivered and processed on the same machine ?
  • how could I make sure that I store and query the same cache server for a given key ?
  • how do I split and distribute chunks of a large file across multiple storage machines and then make sure I can still access it through all those machines at once ?

A lot of technologies answer those kinds of mapping questions by implementing a hashing based distribution of their workload (be it distributed processing, file storage or caching).

Consistent hashing is one of those implementations and I felt like taking some time to explain it further.

Example use case

Here is my attempt to explain what consistent hashing is and why it is needed on distributed systems. To make things fun, we’ll take this simple use case:

  • I have 5 broken cars
  • There are 4 repair factories nearby
  • I have to implement a way to figure out where to send each car to get it fixed
  • I want to ensure that an even share of the cars gets fixed at each factory

This comes down to two major questions to solve:

  • what is my selection criteria ? this will be my key.
  • what is the expected answer ? this is my value.

Static mapping

The first approach we could implement is to manually distribute the cars based on their colour.

  • key = car’s colour
  • value = factory number

To implement this, we use what we usually call dictionaries in various languages : those are static data structures where you assign a value to a key.

We would then write a mapping of “car color” -> “factory n” and apply this simple rule to decide where to ship a broken car.

{
  "yellow": "factory 1",
  "orange": "factory 2",
  "red": "factory 3",
  "green": "factory 4",
  "black": "factory 1"
}

This way we could indeed distribute the car repairs, but we can already see that an uneven number of colours ends up over provisioning factory 1. But there’s worse:

What if I start getting only yellow broken cars ?

  • I would end up sending all of them to factory 1 while the other factories would remain almost empty !

This is a serious limitation. We need a dynamic way to calculate the car distribution between the factories; for this we will use a hash algorithm !

Hash tables

A hash table is a data structure where we apply a hash function (algorithm) on the key to compute an index (pointer) into an array of buckets (values) from which we get the value.

MD5 gives very good hashing distribution and is widely available so this makes it a very good candidate for a hashing algorithm.

We can relate it to our example like this :

  • key = car’s plate number
  • hash function = md5
  • array of values = [factory 1, factory 2, factory 3, factory 4]

To find out where we send a car we could just do:

hash = md5(car plate number)
index = int(hash) % size_of(array)  # the modulo keeps the index inside the array
factory = array[index]

In python ? okay !

import hashlib

factories = [1, 2, 3, 4]

def get_factory(plate):
    # the md5 hexdigest converted to an integer gives us
    # a well distributed hash of the car's plate number
    hash = int(hashlib.md5(plate.encode()).hexdigest(), 16)
    # the modulo guarantees that the index falls inside the array
    return factories[hash % len(factories)]

get_factory('ah-993-xx')
>> 3

get_factory('zz-6793-kh')
>> 3

Wow it’s amazingly simple right ? 🙂

Now we have a way better car repair distribution !… until something bad happens:

What if a factory burns ?

Our algorithm is based on the number of available factories so removing a factory from our array means that we will redistribute a vast majority of the key mappings from our hash table !

Keep in mind that the more values (factories) you have in your array, the worse this problem gets. In our case, given a car’s plate number, we would no longer be able to figure out where a vast majority of the cars were sent.

factories = [1, 2, 4]

get_factory('ah-993-xx')
>> 2 (was 3 before)

get_factory('zz-6793-kh')
>> 1 (was 3 before)

Even worse is that when factory 3 gets repaired and comes back into my hash table, I will once again lose track of all my dispatched cars… What we need is a more consistent way of sorting this out.

Consistent hashing

The response to this kind of problem is to implement a consistent hashing algorithm. The goal of this technique is to limit the number of remapped keys when the hash table is resized.

This is possible by imagining our factories as a circle (ring) and the hash of our keys as points on the same circle. We would then select the next factory (value) by going around the circle, always in the same direction, until we find a factory.

[diagram: hash ring, hashed plate numbers map to the next factory on the circle]

  • Red hashed plate number would go to factory 1
  • Blue hashed plate number would go to factory 3

This way, when a factory gets added or removed from the ring, we lose only a limited portion of the key/value mappings !

Of course, in a real world example we would implement a ring with a lot more slots by adding the same factories on the ring multiple times. This way the affected range of mappings would be smaller and the impact even more balanced !
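
To make the ring idea more concrete, here is a minimal and naive sketch of such a ring in pure python; it only illustrates the technique, with an arbitrary 100 points per factory :

import bisect
import hashlib

def h(key):
    # position of a key on the ring (the ring is the md5 key space)
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, factories, points=100):
        # place each factory on the ring multiple times (virtual nodes)
        self._ring = sorted(
            (h('%s-%d' % (factory, i)), factory)
            for factory in factories
            for i in range(points)
        )
        self._positions = [position for position, _ in self._ring]

    def get(self, key):
        # walk the circle clockwise to the next factory point
        index = bisect.bisect(self._positions, h(key)) % len(self._ring)
        return self._ring[index][1]

ring = Ring(['factory 1', 'factory 2', 'factory 3', 'factory 4'])
print(ring.get('ah-993-xx'))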

For instance, uhashring being fully compatible and defaulting to ketama’s ring distribution, you get 160 points per node on the ring !

I hope I got this little example right and gave you some insight on this very interesting topic !

MongoDB 3.0 upgrade in production : step 4 victory !

In my last post, I explained the new hope we had in following some newly added recommended steps before trying to migrate our production cluster to mongoDB 3.0 WiredTiger.

The most demanding step was migrating all our production servers’ data storage filesystems to XFS, which obviously required a resync of each node… But we got there pretty fast and were about to try again as 3.0.5 was getting ready, until we saw this bug coming !

I guess you can understand why we decided to wait for 3.0.6… which eventually got released with a more peaceful changelog this time.

The 3.0.6 crash test

We decided as usual to test the migration to WiredTiger in two phases.

  1. Migrate all our secondaries to the WiredTiger engine (full resync). Wait a week to see if this has any effect on our cluster.
  2. Switch all the MMapv1 primary nodes to secondary and let our WiredTiger secondary nodes become the primary nodes of our cluster. Pray hard that this time it will not break under our workload (see the sketch after this list for how stepping down a primary can be done).
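
For reference, here is a hedged sketch of stepping down a primary from python; the hostname is made up, and in practice you would also adjust member priorities through a replica set reconfiguration :

from pymongo import MongoClient
from pymongo.errors import AutoReconnect

# connect directly to the current MMapv1 primary
client = MongoClient('mongodb://node1.example.com:27017')

try:
    # ask the primary to step down and trigger an election; it will
    # not seek re-election for the given number of seconds
    client.admin.command('replSetStepDown', 120)
except AutoReconnect:
    # the primary closes all connections when stepping down,
    # so this exception is expected
    pass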

Step 1 results were good, nothing major changed and even our mongo dumps were still working this time (yay!). One week later, everything was still working smoothly.

Step 2 was the big challenge which had failed horribly last time. Needless to say that we were quite stressed when doing the switch. But it worked smoothly and nothing broke + performance gains were huge !

The results

Nothing speaks better than metrics, so I’ll just comment on them quickly as they mostly speak for themselves. I obviously can’t disclose the scales, sorry.

Insert-only operations gained 25x performance

Upsert-heavy operations gained 5x performance

Disk I/O also dropped markedly, easing the overall disk usage. This is due to WiredTiger’s superior caching and disk flushing mechanisms.

Disk usage decreased dramatically thanks to WiredTiger compression

The last and next step

As of today, we still run our secondaries with the MMapv1 engine and are waiting a few weeks to see if anything goes wrong in the long run. Should we need to roll back, we’d be able to do so very easily.

Then, when we get enough uptime using WiredTiger, we will make the final switch to a full roaring production cluster !

py3status v2.6

OK, I was a bit too hasty in my legacy module support code clean up and broke quite a few things in the latest 2.5 release, sorry ! 🙁

Highlights

  • make use of pkgutil to detect properly installed modules even when they are zipped in egg files (manual install)
  • add back legacy module output support (tuple of position / response, see the sketch after this list)
  • new uname module inspired by issue 117 thanks to @ndalliard
  • remove dead code
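
For the curious, here is a hedged sketch of both output styles in a module; the method names are made up and the signature follows the py3status convention of that era :

from time import time

class Py3status:
    def modern(self, i3s_output_list, i3s_config):
        # current style : return the response dict directly
        return {
            'full_text': 'hello',
            'cached_until': time() + 10,
        }

    def legacy(self, i3s_output_list, i3s_config):
        # legacy style (supported again) : a (position, response) tuple
        response = {'full_text': 'hello', 'cached_until': time() + 10}
        return (0, response)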

Thanks !

  • @coelebs on IRC for reporting, testing and the good spirit 🙂
  • @ndalliard on github for the issue, debug and for inspiring the uname module
  • @Horgix for responding to issues faster than me !