Tag Archives: gentoo

Evaluating ScyllaDB for production 2/2

In my previous blog post, I shared 7 lessons on our experience in evaluating Scylla for production.

Those lessons were focused on the setup and execution of the POC and I promised a more technical blog post with technical details and lessons learned from the POC, here it is!

Before you read on, be mindful that our POC was set up to test workloads and workflows, not to benchmark technologies. So even if the Scylla figures are great, they have not been the main drivers of the actual conclusion of the POC.

Business context

As a data driven company working in the Marketing and Advertising industry, we help our clients make sense of multiple sources of data to build and improve their relationship with their customers and prospects.

Dealing with multiple sources of data is nothing new but their volume has dramatically changed during the past decade. I will spare you with the Big-Data-means-nothing term and the technical challenges that comes with it as you already heard enough of it.

Still, it is clear that our line of business is tied to our capacity at mixing and correlating a massive amount of different types of events (data sources/types) coming from various sources which all have their own identifiers (think primary keys):

  • Web navigation tracking: identifier is a cookie that’s tied to the tracking domain (we have our own)
  • CRM databases: usually the email address or an internal account ID serve as an identifier
  • Partners’ digital platform: identifier is also a cookie tied to their tracking domain

To try to make things simple, let’s take a concrete example:

You work for UNICEF and want to optimize their banner ads budget by targeting the donors of their last fundraising campaign.

  • Your reference user database is composed of the donors who registered with their email address on the last campaign: main identifier is the email address.
  • To buy web display ads, you use an Ad Exchange partner such as AppNexus or DoubleClick (Google). From their point of view, users are seen as cookie IDs which are tied to their own domain.

So you basically need to be able to translate an email address to a cookie ID for every partner you work with.

Use case: ID matching tables

We operate and maintain huge ID matching tables for every partner and a great deal of our time is spent translating those IDs from one to another. In SQL terms, we are basically doing JOINs between a dataset and those ID matching tables.

  • You select your reference population
  • You JOIN it with the corresponding ID matching table
  • You get a matched population that your partner can recognize and interact with

Those ID matching tables have a pretty high read AND write throughput because they’re updated and queried all the time.

Usual figures are JOINs between a 10+ Million dataset and 1.5+ Billion ID matching tables.

The reference query basically looks like this:

SELECT count(m.partnerid)
FROM population_10M_rows AS p JOIN partner_id_match_400M_rows AS m
ON p.id = m.id

 Current implementations

We operate a lambda architecture where we handle real time ID matching using MongoDB and batch ones using Hive (Apache Hadoop).

The first downside to note is that it requires us to maintain two copies of every ID matching table. We also couldn’t choose one over the other because neither MongoDB nor Hive can sustain both the read/write lookup/update ratio while performing within the low latencies that we need.

This is an operational burden and requires quite a bunch of engineering to ensure data consistency between different data stores.

Production hardware overview:

  • MongoDB is running on a 15 nodes (5 shards) cluster
    • 64GB RAM, 2 sockets, RAID10 SAS spinning disks, 10Gbps dual NIC
  • Hive is running on 50+ YARN NodeManager instances
    • 128GB RAM, 2 sockets, JBOD SAS spinning disks, 10Gbps dual NIC

Target implementation

The key question is simple: is there a technology out there that can sustain our ID matching tables workloads while maintaining consistently low upsert/write and lookup/read latencies?

Having one technology to handle both use cases would allow:

  • Simpler data consistency
  • Operational simplicity and efficiency
  • Reduced costs

POC hardware overview:

So we decided to find out if Scylla could be that technology. For this, we used three decommissioned machines that we had in the basement of our Paris office.

  • 2 DELL R510
    • 19GB RAM, 2 socket 8 cores, RAID0 SAS spinning disks, 1Gbps NIC
  • 1 DELL R710
    • 19GB RAM, 2 socket 4 cores, RAID0 SAS spinning disks, 1Gbps NIC

I know, these are not glamorous machines and they are even inconsistent in specs, but we still set up a 3 node Scylla cluster running Gentoo Linux with them.

Our take? If those three lousy machines can challenge or beat the production machines on our current workloads, then Scylla can seriously be considered for production.

Step 1: Validate a schema model

Once the POC document was complete and the ScyllaDB team understood what we were trying to do, we started iterating on the schema model using a query based modeling strategy.

So we wrote down and rated the questions that our model(s) should answer to, they included stuff like:

  • What are all our cookie IDs associated to the given partner ID ?
  • What are all the cookie IDs associated to the given partner ID over the last N months ?
  • What is the last cookie ID/date for the given partner ID ?
  • What is the last date we have seen the given cookie ID / partner ID couple ?

As you can imagine, the reverse questions are also to be answered so ID translations can be done both ways (ouch!).


This is no news that I’m a Python addict so I did all my prototyping using Python and the cassandra-driver.

I ended up using a test-driven data modelling strategy using pytest. I wrote tests on my dataset so I could concentrate on the model while making sure that all my questions were being answered correctly and consistently.


In our case, we ended up with three denormalized tables to answer all the questions we had. To answer the first three questions above, you could use the schema below:

CREATE TABLE IF NOT EXISTS ids_by_partnerid(
 partnerid text,
 id text,
 date timestamp,
 PRIMARY KEY ((partnerid), date, id)

Note on clustering key ordering

One important learning I got in the process of validating the model is about the internals of Cassandra’s file format that resulted in the choice of using a descending order DESC on the date clustering key as you can see above.

If your main use case of querying is to look for the latest value of an history-like table design like ours, then make sure to change the default ASC order of your clustering key to DESC. This will ensure that the latest values (rows) are stored at the beginning of the sstable file effectively reducing the read latency when the row is not in cache!

Let me quote Glauber Costa’s detailed explanation on this:

Basically in Cassandra’s file format, the index points to an entire partition (for very large partitions there is a hack to avoid that, but the logic is mostly the same). So if you want to read the first row, that’s easy you get the index to the partition and read the first row. If you want to read the last row, then you get the index to the partition and do a linear scan to the next.

This is the kind of learning you can only get from experts like Glauber and that can justify the whole POC on its own!

Step 2: Set up scylla-grafana-monitoring

As I said before, make sure to set up and run the scylla-grafana-monitoring project before running your test workloads. This easy to run solution will be of great help to understand the performance of your cluster and to tune your workload for optimal performances.

If you can, also discuss with the ScyllaDB team to allow them to access the Grafana dashboard. This will be very valuable since they know where to look better than we usually do… I gained a lot of understandings thanks to this!

Note on scrape interval

I advise you to lower the Prometheus scrape interval to have a shorter and finer sampling of your metrics. This will allow your dashboard to be more reactive when you start your test workloads.

For this, change the prometheus/prometheus.yml file like this:

scrape_interval: 2s # Scrape targets every 2 seconds (5s default)
scrape_timeout: 1s # Timeout before trying to scrape a target again (4s default)

Test your monitoring

Before going any further, I strongly advise you to run a stress test on your POC cluster using the cassandra-stress tool and share the results and their monitoring graphs with the ScyllaDB team.

This will give you a common understanding of the general performances of your cluster as well as help in outlining any obvious misconfiguration or hardware problem.

Key graphs to look at

There are a lot of interesting graphs so I’d like to share the ones that I have been mainly looking at. Remember that depending on your test workloads, some other graphs may be more relevant for you.

  • number of open connections

You’ll want to see a steady and high enough number of open connections which will prove that your clients are pushed at their maximum (at the time of testing this graph was not on Grafana and you had to add it yourself)

  • cache hits / misses

Depending on your reference dataset, you’ll obviously see that cache hits and misses will have a direct correlation with disk I/O and overall performances. Running your test workloads multiple times should trigger higher cache hits if your RAM is big enough.

  • per shard/node distribution

The Requests Served per shard graph should display a nicely distributed load between your shards and nodes so that you’re sure that you’re getting the best out of your cluster.

The same is true for almost every other “per shard/node” graph: you’re looking for evenly distributed load.

  • sstable reads

Directly linked with your disk performances, you’ll be trying to make sure that you have almost no queued sstable reads.

Step 3: Get your reference data and metrics

We obviously need to have some reference metrics on our current production stack so we can compare them with the results on our POC Scylla cluster.

Whether you choose to use your current production machines or set up a similar stack on the side to run your test workloads is up to you. We chose to run the vast majority of our tests on our current production machines to be as close to our real workloads as possible.

Prepare a reference dataset

During your work on the POC document, you should have detailed the usual data cardinality and volume you work with. Use this information to set up a reference dataset that you can use on all of the platforms that you plan to compare.

In our case, we chose a 10 Million reference dataset that we JOINed with a 400+ Million extract of an ID matching table. Those volumes seemed easy enough to work with and allowed some nice ratio for memory bound workloads.

Measure on your current stack

Then it’s time to load this reference datasets on your current platforms.

  • If you run a MongoDB cluster like we do, make sure to shard and index the dataset just like you do on the production collections.
  • On Hive, make sure to respect the storage file format of your current implementations as well as their partitioning.

If you chose to run your test workloads on your production machines, make sure to run them multiple times and at different hours of the day and night so you can correlate the measures with the load on the cluster at the time of the tests.

Reference metrics

For the sake of simplicity I’ll focus on the Hive-only batch workloads. I performed a count on the JOIN of the dataset and the ID matching table using Spark 2 and then I also ran the JOIN using a simple Hive query through Beeline.

I gave the following definitions on the reference load:

  • IDLE: YARN available containers and free resources are optimal, parallelism is very limited
  • NORMAL: YARN sustains some casual load, parallelism exists but we are not bound by anything still
  • HIGH: YARN has pending containers, parallelism is high and applications have to wait for containers before executing

There’s always an error margin on the results you get and I found that there was not significant enough differences between the results using Spark 2 and Beeline so I stuck with a simple set of results:

  • IDLE: 2 minutes, 15 seconds
  • NORMAL: 4 minutes
  • HIGH: 15 minutes

Step 4: Get Scylla in the mix

It’s finally time to do your best to break Scylla or at least to push it to its limits on your hardware… But most importantly, you’ll be looking to understand what those limits are depending on your test workloads as well as outlining out all the required tuning that you will be required to do on the client side to reach those limits.

Speaking about the results, we will have to differentiate two cases:

  1. The Scylla cluster is fresh and its cache is empty (cold start): performance is mostly Disk I/O bound
  2. The Scylla cluster has been running some test workload already and its cache is hot: performance is mostly Memory bound with some Disk I/O depending on the size of your RAM

Spark 2 / Scala test workload

Here I used Scala (yes, I did) and DataStax’s spark-cassandra-connector so I could use the magic joinWithCassandraTable function.

  • spark-cassandra-connector-2.0.1-s_2.11.jar
  • Java 7

I had to stick with the 2.0.1 version of the spark-cassandra-connector because newer version (2.0.5 at the time of testing) were performing bad with no apparent reason. The ScyllaDB team couldn’t help on this.

You can interact with your test workload using the spark2-shell:

spark2-shell --jars jars/commons-beanutils_commons-beanutils-1.9.3.jar,jars/com.twitter_jsr166e-1.1.0.jar,jars/io.netty_netty-all-4.0.33.Final.jar,jars/org.joda_joda-convert-1.2.jar,jars/commons-collections_commons-collections-3.2.2.jar,jars/joda-time_joda-time-2.3.jar,jars/org.scala-lang_scala-reflect-2.11.8.jar,jars/spark-cassandra-connector-2.0.1-s_2.11.jar

Then use the following Scala imports:

// main connector import
import com.datastax.spark.connector._

// the joinWithCassandraTable failed without this (dunno why, I'm no Scala guy)
import com.datastax.spark.connector.writer._
implicit val rowWriter = SqlRowWriter.Factory

Finally I could run my test workload to select the data from Hive and JOIN it with Scylla easily:

val df_population = spark.sql("SELECT id FROM population_10M_rows")
val join_rdd = df_population.rdd.repartitionByCassandraReplica("test_keyspace", "partner_id_match_400M_rows").joinWithCassandraTable("test_keyspace", "partner_id_match_400M_rows")
val joined_count = join_rdd.count()

Notes on tuning spark-cassandra-connector

I experienced pretty crappy performances at first. Thanks to the easy Grafana monitoring, I could see that Scylla was not being the bottleneck at all and that I instead had trouble getting some real load on it.

So I engaged in a thorough tuning of the spark-cassandra-connector with the help of Glauber… and it was pretty painful but we finally made it and got the best parameters to get the load on the Scylla cluster close to 100% when running the test workloads.

This tuning was done in the spark-defaults.conf file:

  • have a fixed set of executors and boost their overhead memory

This will increase test results reliability by making sure you always have a reserved number of available workers at your disposal.

  • set the split size to 1MB

Default is 8MB but Scylla uses a split size of 1MB so you’ll see a great boost of performance and stability by setting this setting to the right number.

  • align driver timeouts with server timeouts

It is advised to make sure that your read request timeouts are the same on the driver and the server so you do not get stalled states waiting for a timeout to happen on one hand. You can do the same with write timeouts if your test workloads are write intensive.


read_request_timeout_in_ms: 150000



// optional if you want to fail / retry faster for HA scenarios
  • adjust your reads per second rate

Last but surely not least, this setting you will need to try and find out the best value for yourself since it has a direct impact on the load on your Scylla cluster. You will be looking at pushing your POC cluster to almost 100% load.


As I said before, I could only get this to work perfectly using the 2.0.1 version of the spark-cassandra-connector driver. But then it worked very well and with great speed.

Spark 2 results

Once tuned, the best results I was able to reach on this hardware are listed below. It’s interesting to see that with spinning disks, the cold start result can compete with the results of a heavily loaded Hadoop cluster where pending containers and parallelism are knocking down its performances.

  • hot cache: 2min
  • cold cache: 12min

Wow! Those three refurbished machines can compete with our current production machines and implementations, they can even match an idle Hive cluster of a medium size!

Python test workload

I couldn’t conclude on a Scala/Spark 2 only test workload. So I obviously went back to my language of choice Python only to discover at my disappointment that there is no joinWithCassandraTable equivalent available on pyspark

I tried with some projects claiming otherwise with no success until I changed my mind and decided that I probably didn’t need Spark 2 at all. So I went into the crazy quest of beating Spark 2 performances using a pure Python implementation.

This basically means that instead of having a JOIN like helper, I had to do a massive amount of single “id -> partnerid” lookups. Simple but greatly inefficient you say? Really?

When I broke down the pieces, I was left with the following steps to implement and optimize:

  • Load the 10M rows worth of population data from Hive
  • For every row, lookup the corresponding partnerid in the ID matching table from Scylla
  • Count the resulting number of matches

The main problem to compete with Spark 2 is that it is a distributed framework and Python by itself is not. So you can’t possibly imagine outperforming Spark 2 with your single machine.

However, let’s remember that Spark 2 is shipped and ran on executors using YARN so we are firing up JVMs and dispatching containers all the time. This is a quite expensive process that we have a chance to avoid using Python!

So what I needed was a distributed computation framework that would allow to load data in a partitioned way and run the lookups on all the partitions in parallel before merging the results. In Python, this framework exists and is named Dask!

You will obviously need to have to deploy a dask topology (that’s easy and well documented) to have a comparable number of dask workers than of Spark 2 executors (30 in my case) .

The corresponding Python code samples are here.

Hive + Scylla results

Reading the population id’s from Hive, the workload can be split and executed concurrently on multiple dask workers.

  • read the 10M population rows from Hive in a partitioned manner
  • for each partition (slice of 10M), query Scylla to lookup the possibly matching partnerid
  • create a dataframe from the resulting matches
  • gather back all the dataframes and merge them
  • count the number of matches

The results showed that it is possible to compete with Spark 2 with Dask:

  • hot cache: 2min (rounded up)
  • cold cache: 6min

Interestingly, those almost two minutes can be broken down like this:

  • distributed read data from Hive: 50s
  • distributed lookup from Scylla: 60s
  • merge + count: 10s

This meant that if I could cut down the reading of data from Hive I could go even faster!

Parquet + Scylla results

Going further on my previous remark I decided to get rid of Hive and put the 10M rows population data in a parquet file instead. I ended up trying to find out the most efficient way to read and load a parquet file from HDFS.

My conclusion so far is that you can’t be the amazing libhdfs3 + pyarrow combo. It is faster to load everything on a single machine than loading from Hive on multiple ones!

The results showed that I could almost get rid of a whole minute in the total process, effectively and easily beating Spark 2!

  • hot cache: 1min 5s
  • cold cache: 5min

Notes on the Python cassandra-driver

Tests using Python showed robust queries experiencing far less failures than the spark-cassandra-connector, even more during the cold start scenario.

  • The usage of execute_concurrent() provides a clean and linear interface to submit a large number of queries while providing some level of concurrency control
  • Increasing the concurrency parameter from 100 to 512 provided additional throughput, but increasing it more looked useless
  • Protocol version 4 forbids the tuning of connection requests / number to some sort of auto configuration. All tentative to hand tune it (by lowering protocol version to 2) failed to achieve higher throughput
  • Installation of libev on the system allows the cassandra-driver to use it to handle concurrency instead of asyncore with a somewhat lower load footprint on the worker node but no noticeable change on the throughput
  • When reading a parquet file stored on HDFS, the hdfs3 + pyarrow combo provides an insane speed (less than 10s to fully load 10M rows of a single column)

Step 5: Play with High Availability

I was quite disappointed and surprised by the lack of maturity of the Cassandra community on this critical topic. Maybe the main reason is that the cassandra-driver allows for too many levels of configuration and strategies.

I wrote this simple bash script to allow me to simulate node failures. Then I could play with handling those failures and retries on the Python client code.


iptables -t filter -X
iptables -t filter -F

for port in 9042 9160 9180 10000 7000; do
	iptables -t filter -A INPUT -p tcp --dport ${port} -s ${ip} -j DROP
	iptables -t filter -A OUTPUT -p tcp --sport ${port} -d ${ip} -j DROP

while true; do
	trap break INT
	iptables -t filter -vnL
	sleep 1

iptables -t filter -X
iptables -t filter -F
iptables -t filter -vnL

This topic is worth going in more details on a dedicated blog post that I shall write later on while providing code samples.

Concluding the evaluation

I’m happy to say that Scylla passed our production evaluation and will soon go live on our infrastructure!

As I said at the beginning of this post, the conclusion of the evaluation has not been driven by the good figures we got out of our test workloads. Those are no benchmarks and never pretended to be but we could still prove that performances were solid enough to not be a blocker in the adoption of Scylla.

Instead we decided on the following points of interest (in no particular order):

  • data consistency
  • production reliability
  • datacenter awareness
  • ease of operation
  • infrastructure rationalisation
  • developer friendliness
  • costs

On the side, I tried Scylla on two other different use cases which proved interesting to follow later on to displace MongoDB again…

Moving to production

Since our relationship was great we also decided to partner with ScyllaDB and support them by subscribing to their enterprise offerings. They also accepted to support us using Gentoo Linux!

We are starting with a three nodes heavy duty cluster:

  • DELL R640
    • dual socket 2,6GHz 14C, 512GB RAM, Samsung 17xxx NVMe 3,2 TB

I’m eager to see ScyllaDB building up and will continue to help with my modest contributions. Thanks again to the ScyllaDB team for their patience and support during the POC!

Evaluating ScyllaDB for production 1/2

I have recently been conducting a quite deep evaluation of ScyllaDB to find out if we could benefit from this database in some of our intensive and latency critical data streams and jobs.

I’ll try to share this great experience within two posts:

  1. The first one (you’re reading) will walk through how to prepare yourself for a successful Proof Of Concept based evaluation with the help of the ScyllaDB team.
  2. The second post will cover the technical aspects and details of the POC I’ve conducted with the various approaches I’ve followed to find the most optimal solution.

But let’s start with how I got into this in the first place…

Selecting ScyllaDB

I got interested in ScyllaDB because of its philosophy and engagement and I quickly got into it by being a modest contributor and its Gentoo Linux packager (not in portage yet).

Of course, I didn’t pick an interest in that technology by chance:

We’ve been using MongoDB in (mass) production at work for a very very long time now. I can easily say we were early MongoDB adopters. But there’s no wisdom in saying that MongoDB is not suited for every use case and the Hadoop stack has come very strong in our data centers since then, with a predominance of Hive for the heavy duty and data hungry workflows.

One thing I was never satisfied with MongoDB was its primary/secondary architecture which makes you lose write throughput and is even more horrible when you want to set up what they call a “cluster” which is in fact some mediocre abstraction they add on top of replica-sets. To say the least, it is inefficient and cumbersome to operate and maintain.

So I obviously had Cassandra on my radar for a long time, but I was pushed back by its Java stack, heap size and silly tuning… Also, coming from the versatile MongoDB world, Cassandra’s CQL limitations looked dreadful at that time…

The day I found myself on ScyllaDB’s webpage and read their promises, I was sure to be challenging our current use cases with this interesting sea monster.

Setting up a POC with the people at ScyllaDB

Through my contributions around my packaging of ScyllaDB for Gentoo Linux, I got to know a bit about the people behind the technology. They got interested in why I was packaging this in the first place and when I explained my not-so-secret goal of challenging our production data workflows using Scylla, they told me that they would love to help!

I was a bit surprised at first because this was the first time I ever saw a real engagement of the people behind a technology into someone else’s POC.

Their pitch is simple, they will help (for free) anyone conducting a serious POC to make sure that the outcome and the comprehension behind it is the best possible. It is a very mature reasoning to me because it is easy to make false assumptions and conclude badly when testing a technology you don’t know, even more when your use cases are complex and your expectations are very high like us.

Still, to my current knowledge, they’re the only ones in the data industry to have this kind of logic in place since the start. So I wanted to take this chance to thank them again for this!

The POC includes:

  • no bullshit, simple tech-to-tech relationship
  • a private slack channel with multiple ScyllaDB’s engineers
  • video calls to introduce ourselves and discuss our progress later on
  • help in schema design and logic
  • fast answers to every question you have
  • detailed explanations on the internals of the technology
  • hardware sizing help and validation
  • funny comments and French jokes (ok, not suitable for everyone)










Lessons for a successful POC

As I said before, you’ve got to be serious in your approach to make sure your POC will be efficient and will lead to an unbiased and fair conclusion.

This is a list of the main things I consider important to have prepared before you start.

Have some background

Make sure to read some literature to have the key concepts and words in mind before you go. It is even more important if like me you do not come from the Cassandra world.

I found that the Cassandra: The Definitive Guide book at O’Reilly is a great read. Also, make sure to go around ScyllaDB’s documentation.

Work with a shared reference document

Make sure you share with the ScyllaDB guys a clear and detailed document explaining exactly what you’re trying to achieve and how you are doing it today (if you plan on migrating like we did).

I made a google document for this because it felt the easiest. This document will be updated as you go and will serve as a reference for everyone participating in the POC.

This shared reference document is very important, so if you don’t know how to construct it or what to put in it, here is how I structured it:

  1. Who’s participating at <your company>
    • photo + name + speciality
  2. Who’s participating at ScyllaDB
  3. POC hardware
    • if you have your own bare metal machines you want to run your POC on, give every detail about their number and specs
    • if not, explain how you plan to setup and run your scylla cluster
  4. Reference infrastructure
    • give every details on the technologies and on the hardware of the servers that are currently responsible for running your workflows
    • explain your clusters and their speciality
  5. Use case #1 : <name>
    • Context
      • give context about your use case by explaining it without tech words, think from the business / user point of view
    • Current implementations
      • that’s where you get technical
      • technology names and where they come into play in your current stack
      • insightful data volumes and cardinality
      • current schema models
    • Workload related to this use case
      • queries per second per data source / type
      • peek hours or no peek hours?
      • criticality
    • Questions we want to answer to
      • remember, the NoSQL world is lead by query-based-modeling schema design logic, cassandra/scylla is no exception
      • write down the real questions you want your data model(s) to be able to answer to
      • group them and rate them by importance
    • Validated models
      • this one comes during the POC when you have settled on the data models
      • write them down, explain them or relate them to the questions they answer to
      • copy/paste some code showcasing how to work with them
    • Code examples
      • depending on the complexity of your use case, you may have multiple constraints or ways to compare your current implementation with your POC
      • try to explain what you test and copy/paste the best code you came up with to validate each point

Have monitoring in place

ScyllaDB provides a monitoring platform based on Docker, Prometheus and Grafana that is efficient and simple to set up. I strongly recommend that you set it up, as it provides valuable insights almost immediately, and on an ongoing basis.

Also you should strive to give access to your monitoring to the ScyllaDB guys, if that’s possible for you. They will provide with a fixed IP which you can authorize to access your grafana dashboards so they can have a look at the performances of your POC cluster as you go. You’ll learn a great deal about ScyllaDB’s internals by sharing with them.

Know when to stop

The main trap in a POC is to work without boundaries. Since you’re looking for the best of what you can get out of a technology, you’ll get tempted to refine indefinitely.

So this is good to have at least an idea on the minimal figures you’d like to reach to get satisfied with your tests. You can always push a bit further but not for too long!

Plan some high availability tests

Even if you first came to ScyllaDB for its speed, make sure to test its high availability capabilities based on your experience.

Most importantly, make sure you test it within your code base and guidelines. How will your code react and handle a failure, partial and total? I was very surprised and saddened to discover so little literature on the subject in the Cassandra community.

POC != production

Remember that even when everything is right on paper, production load will have its share of surprises and unexpected behaviours. So keep a good deal of flexibility in your design and your capacity planning to absorb them.

Make time

Our POC lasted almost 5 months instead of estimated 3, mostly because of my agenda’s unwillingness to cooperate…

As you can imagine this interruption was not always optimal, for either me or the ScyllaDB guys, but they were kind not to complain about it. So depending on how thorough you plan to be, make sure you make time matching your degree of demands. The reference document is also helpful to get back to speed.

Feedback for the ScyllaDB guys

Here are the main points I noted during the POC that the guys from ScyllaDB could improve on.

They are subjective of course but it’s important to give feedback so here it goes. I’m fully aware that everyone is trying to improve, so I’m not pointing any fingers at all.

I shared those comments already with them and they acknowledged them very well.

More video meetings on start

When starting the POC, try to have some pre-scheduled video meetings to set it right in motion. This will provide a good pace as well as making sure that everyone is on the same page.

Make a POC kick starter questionnaire

Having a minimal plan to follow with some key points to set up just like the ones I explained before would help. Maybe also a minimal questionnaire to make sure that the key aspects and figures have been given some thought since the start. This will raise awareness on the real answers the POC aims to answer.

To put it simpler: some minimal formalism helps to check out the key aspects and questions.

Develop a higher client driver expertise

This one was the most painful to me, and is likely to be painful for anyone who, like me, is not coming from the Cassandra world.

Finding good and strong code examples and guidelines on the client side was hard and that’s where I felt the most alone. This was not pleasant because a technology is definitely validated through its usage which means on the client side.

Most of my tests were using python and the python-cassandra driver so I had tons of questions about it with no sticking answers. Same thing went with the spark-cassandra-connector when using scala where some key configuration options (not documented) can change the shape of your results drastically (more details on the next post).

High Availability guidelines and examples

This one still strikes me as the most awkward on the Cassandra community. I literally struggled with finding clear and detailed explanations about how to handle failure more or less gracefully with the python driver (or any other driver).

This is kind of a disappointment to me for a technology that position itself as highly available… I’ll get into more details about it on the next post.

A clearer sizing documentation

Even if there will never be a magic formula, there are some rules of thumb that exist for sizing your hardware for ScyllaDB. They should be written down more clearly in a maybe dedicated documentation (sizing guide is labeled as admin guide at time of writing).

Some examples:

  • RAM per core ? what is a core ? relation to shard ?
  • Disk / RAM maximal ratio ?
  • Multiple SSDs vs one NMVe ?
  • Hardware RAID vs software RAID ? need a RAID controller at all ?

Maybe even provide a bare metal complete example from two different vendors such as DELL and HP.

What’s next?

In the next post, I’ll get into more details on the POC itself and the technical learnings we found along the way. This will lead to the final conclusion and the next move we engaged ourselves with.

py3status v3.7

This important release has been long awaited as it focused on improving overall performance of py3status as well as dramatically decreasing its memory footprint!

I want once again to salute the impressive work of @lasers, our amazing contributors from the USA who has become top one contributor of 2017 in term of commits and PRs.

Thanks to him, this release brings a whole batch of improvements and QA clean ups on various modules. I encourage you to go through the changelog to see everything.


Deep rework of the usage and scheduling of threads to run modules has been done by @tobes.

  •  now py3status does not keep one thread per module running permanently but instead uses a queue to spawn a thread to execute the module only when its cache expires
  • this new scheduling and usage of threads allows py3status to run under asynchronous event loops and gevent will be supported on the upcoming 3.8
  • memory footprint of py3status got largely reduced thanks to the threads modifications and thanks to a nice hunt on ever growing and useless variables
  • modules error reporting is now more detailed

Milestone 3.8

The next release will bring some awesome new features such as gevent support, environment variable support in config file and per module persistent data storage as well as new modules!

Thanks contributors!

This release is their work, thanks a lot guys!

  • JohnAZoidberg
  • lasers
  • maximbaz
  • pcewing
  • tobes

Gentoo Linux on DELL XPS 13 9365 and 9360

Since I received some positive feedback about my previous DELL XPS 9350 post, I am writing this summary about my recent experience in installing Gentoo Linux on a DELL XPS 13 9365.

This installation notes goals:

  • UEFI boot using Grub
  • Provide you with a complete and working kernel configuration
  • Encrypted disk root partition using LUKS
  • Be able to type your LUKS passphrase to decrypt your partition using your local keyboard layout

Grub & UEFI & Luks installation

This installation is a fully UEFI one using grub and booting an encrypted root partition (and home partition). I was happy to see that since my previous post, everything got smoother. So even if you can have this installation working using MBR, I don’t really see a point in avoiding UEFI now.

BIOS configuration

Just like with its ancestor, you should:

  • Turn off Secure Boot
  • Set SATA Operation to AHCI

Live CD

Once again, go for the latest SystemRescueCD (it’s Gentoo based, you won’t be lost) as it’s quite more up to date and supports booting on UEFI. Make it a Live USB for example using unetbootin and the ISO on a vfat formatted USB stick.

NVME SSD disk partitioning

We’ll obviously use GPT with UEFI. I found that using gdisk was the easiest. The disk itself is found on /dev/nvme0n1. Here it is the partition table I used :

  • 10Mo UEFI BIOS partition (type EF02)
  • 500Mo UEFI boot partition (type EF00)
  • 2Go Swap partition
  • 475Go Linux root partition

The corresponding gdisk commands :

# gdisk /dev/nvme0n1

Command: o ↵
This option deletes all partitions and creates a new protective MBR.
Proceed? (Y/N): y ↵

Command: n ↵
Partition Number: 1 ↵
First sector: ↵
Last sector: +10M ↵
Hex Code: EF02 ↵

Command: n ↵
Partition Number: 2 ↵
First sector: ↵
Last sector: +500M ↵
Hex Code: EF00 ↵

Command: n ↵
Partition Number: 3 ↵
First sector: ↵
Last sector: +2G ↵
Hex Code: 8200 ↵

Command: n ↵
Partition Number: 4 ↵
First sector: ↵
Last sector: ↵ (for rest of disk)
Hex Code: ↵

Command: p ↵
Disk /dev/nvme0n1: 1000215216 sectors, 476.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): A73970B7-FF37-4BA7-92BE-2EADE6DDB66E
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1000215182
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048           22527   10.0 MiB    EF02  BIOS boot partition
   2           22528         1046527   500.0 MiB   EF00  EFI System
   3         1046528         5240831   2.0 GiB     8200  Linux swap
   4         5240832      1000215182   474.4 GiB   8300  Linux filesystem

Command: w ↵
Do you want to proceed? (Y/N): Y ↵

No WiFi on Live CD ? no panic

Once again on my (old?) SystemRescueCD stick, the integrated Intel 8265/8275 wifi card is not detected.

So I used my old trick with my Android phone connected to my local WiFi as a USB modem which was detected directly by the live CD.

  • get your Android phone connected on your local WiFi (unless you want to use your cellular data)
  • plug in your phone using USB to your XPS
  • on your phone, go to Settings / More / Tethering & portable hotspot
  • enable USB tethering

Running ip addr will show the network card enp0s20f0u2 (for me at least), then if no IP address is set on the card, just ask for one :

# dhcpcd enp0s20f0u2

You have now access to the internet.

Proceed with installation

The only thing to prepare is to format the UEFI boot partition as FAT32. Do not worry about the UEFI BIOS partition (/dev/nvme0n1p1), grub will take care of it later.

# mkfs.vfat -F 32 /dev/nvme0n1p2

Do not forget to use cryptsetup to encrypt your /dev/nvme0n1p4 partition! In the rest of the article, I’ll be using its device mapper representation.

# cryptsetup luksFormat -s 512 /dev/nvme0n1p4
# cryptsetup luksOpen /dev/nvme0n1p4 root
# mkfs.ext4 /dev/mapper/root

Then follow the Gentoo handbook as usual for the stage3 related next steps. Make sure you mount and bind the following to your /mnt/gentoo LiveCD installation folder (the /sys binding is important for grub UEFI):

# mount -t proc none /mnt/gentoo/proc
# mount -o bind /dev /mnt/gentoo/dev
# mount -o bind /sys /mnt/gentoo/sys

make.conf settings

I strongly recommend using at least the following on your /etc/portage/make.conf :

INPUT_DEVICES="evdev synaptics"
VIDEO_CARDS="intel i965"

USE="bindist cryptsetup"

The GRUB_PLATFORM one is important for later grub setup and the cryptsetup USE flag will help you along the way.

fstab for SSD

Don’t forget to make sure the noatime option is used on your fstab for / and /home.

/dev/nvme0n1p2    /boot    vfat    noauto,noatime    1 2
/dev/nvme0n1p3    none     swap    sw                0 0
/dev/mapper/root  /        ext4    noatime   0 1

Kernel configuration and compilation

I suggest you use a recent >=sys-kernel/gentoo-sources-4.13.0 along with genkernel.

  • You can download my kernel configuration file (iptables, docker, luks & stuff included)
  • Put the kernel configuration file into the /etc/kernels/ directory (with a training s)
  • Rename the configuration file with the exact version of your kernel

Then you’ll need to configure genkernel to add luks support, firmware files support and keymap support if your keyboard layout is not QWERTY.

In your /etc/genkernel.conf, change the following options:


Then run genkernel all to build your kernel and luks+firmware+keymap aware initramfs.

Grub UEFI bootloader with LUKS and custom keymap support

Now it’s time for the grub magic to happen so you can boot your wonderful Gentoo installation using UEFI and type your password using your favourite keyboard layout.

  • make sure your boot vfat partition is mounted on /boot
  • edit your /etc/default/grub configuration file with the following:
GRUB_CMDLINE_LINUX="crypt_root=/dev/nvme0n1p4 keymap=fr"

This will allow your initramfs to know it has to read the encrypted root partition from the given partition and to prompt for its password in the given keyboard layout (french here).

Now let’s install the grub UEFI boot files and setup the UEFI BIOS partition.

# grub-install --efi-directory=/boot --target=x86_64-efi /dev/nvme0n1
Installing for x86_64-efi platform.
Installation finished. No error reported

It should report no error, then we can generate the grub boot config:

# grub-mkconfig -o /boot/grub/grub.cfg

You’re all set!

You will get a gentoo UEFI boot option, you can disable the Microsoft Windows one from your BIOS to get straight to the point.

Hope this helped!

py3status v3.6

After four months of cool contributions and hard work on normalization and modules’ clean up, I’m glad to announce the release of py3status v3.6!

Milestone 3.6 was mainly focused about existing modules, from their documentation to their usage of the py3 helper to streamline their code base.

Other improvements were made about error reporting while some sneaky bugs got fixed along the way.


Not an extensive list, check the changelog.

  • LOTS of modules streamlining (mainly the hard work of @lasers)
  • error reporting improvements
  • py3-cmd performance improvements

New modules

  • i3blocks support (yes, py3status can now wrap i3blocks thanks to @tobes)
  • cmus module: to control your cmus music player, by @lasers
  • coin_market module: to display custom cryptocurrency data, by @lasers
  • moc module: to control your moc music player, by @lasers

Milestone 3.7

This milestone will give a serious kick into py3status performance. We’ll do lots of profiling and drastic work to reduce py3status CPU and memory footprints!

For now we’ve been relying a lot on threads, which is simple to operate but not that CPU/memory friendly. Since i3wm users rightly care for their efficiency we think it’s about time we address this kind of points in py3status.

Stay tuned, we have some nice ideas in stock 🙂

Thanks contributors!

This release is their work, thanks a lot guys!

  • aethelz
  • alexoneill
  • armandg
  • Cypher1
  • docwalter
  • enguerrand
  • fmorgner
  • guiniol
  • lasers
  • markrileybot
  • maximbaz
  • tablet-mode
  • paradoxisme
  • ritze
  • rixx
  • tobes
  • valdur55
  • vvoland
  • yabbes

ScyllaDB meets Gentoo Linux

I am happy to announce that my work on packaging ScyllaDB for Gentoo Linux is complete!

Happy or curious users are very welcome to share their thoughts and ping me to get it into portage (which will very likely happen).

Why Scylla?

Ever heard of the Cassandra NoSQL database and Java GC/Heap space problems?… if you do, you already get it 😉

I will not go into the details as their website does this way better than me but I got interested into Scylla because it fits the Gentoo Linux philosophy very well. If you remember my writing about packaging Rethinkdb for Gentoo Linux, I think that we have a great match with Scylla as well!

  • it is written in C++ so it plays very well with emerge
  • the code quality is so great that building it does not require heavy patching on the ebuild (feels good to be a packager)
  • the code relies on system libs instead of bundling them in the sources (hurrah!)
  • performance tuning is handled by smart scripting and automation, allowing the relationship between the project and the hardware is strong

I believe that these are good enough points to go further and that such a project can benefit from a source based distribution like Gentoo Linux. Of course compiling on multiple systems is a challenge for such a database but one does not improve by staying in their comfort zone.

Upstream & contributions

Packaging is a great excuse to get to know the source code of a project but more importantly the people behind it.

So here I got to my first contributions to Scylla to get Gentoo Linux as a detected and supported Linux distribution in the different scripts and tools used to automatically setup the machine it will run upon (fear not, I contributed bash & python, not C++)…

Even if I expected to contribute using Github PRs and got to change my habits to a git-patch+mailing list combo, I got warmly welcomed and received positive and genuine interest in the contributions. They got merged quickly and thanks to them you can install and experience Scylla in Gentoo Linux without heavy patching on our side.

Special shout out to Pekka, Avi and Vlad for their welcoming and insightful code reviews!

I’ve some open contributions about pushing further on the python code QA side to get the tools to a higher level of coding standards. Seeing how upstream is serious about this I have faith that it will get merged and a good base for other contributions.

Last note about reaching them is that I am a bit sad that they’re not using IRC freenode to communicate (I instinctively joined #scylla and found myself alone) but they’re on Slack (those “modern folks”) and pretty responsive to the mailing lists 😉

Java & Scylla

Even if scylla is a rewrite of Cassandra in C++, the project still relies on some external tools used by the Cassandra community which are written in Java.

When you install the scylla package on Gentoo, you will see that those two packages are Java based dependencies:

  • app-admin/scylla-tools
  • app-admin/scylla-jmx

It pained me a lot to package those (thanks to help of @monsieurp) but they are building and working as expected so this gets the packaging of the whole Scylla project pretty solid.

emerge dev-db/scylla

The scylla packages are located in the ultrabug overlay for now until I test them even more and ultimately put them in production. Then they’ll surely reach the portage tree with the approval of the Gentoo java team for the app-admin/ packages listed above.

I provide a live ebuild (scylla-9999 with no keywords) and ebuilds for the latest major version (2.0_rc1 at time of writing).

It’s as simple as:

$ sudo layman -a ultrabug
$ sudo emerge -a dev-db/scylla
$ sudo emerge --config dev-db/scylla

Try it out and tell me what you think, I hope you’ll start considering and using this awesome database!

Hardening SSH authentication using Yubikey (3/2)

In my previous blog post, I demonstrated how to use the PIV feature of a Yubikey to add a 2nd factor authentication to SSH.

Careful readers such as Grzegorz Kulewski pointed out that using the GPG capability of the Yubikey was also a great, more versatile and more secure option on the table (I love those community insights):

  • GPG keys and subkeys are indeed more flexible and can be used for case-specific operations (signing, encryption, authentication)
  • GPG is more widely used and one could use their Yubikey smartcard for SSH, VPN, HTTP auth and code signing
  • The Yubikey 4 GPG feature supports 4096 bit keys (limited to 2048 for PIV)

While I initially looked at the GPG feature, its apparent complexity got me to discard it for my direct use case (SSH). But I couldn’t resist the good points of Grzegorz and here I got back into testing it. Thank you again Grzegorz for the excuse you provided 😉

So let’s get through with the GPG feature of the Yubikey to authenticate our SSH connections. Just like the PIV method, this one has the  advantage to allow a 2nd factor authentication while using the public key authentication mechanism of OpenSSH and thus does not need any kind of setup on the servers.

Method 3 – SSH using Yubikey and GPG


The first choice you have to make is to decide whether you allow your master key to be stored on the Yubikey or not. This choice will be guided by how you plan to use and industrialize your usage of the GPG based SSH authentication.

Consider this to choose whether to store the master key on the Yubikey or not:

  • (con) it will not allow the usage of the same GPG key on multiple Yubikeys
  • (con) if you loose your Yubikey, you will have to revoke your entire GPG key and start from scratch (since the secret key is stored on the Yubikey)
  • (pro) by storing everything on the Yubikey, you won’t necessary have to have an offline copy of your master key (and all the process that comes with it)
  • (pro) it is easier to generate and store everything on the key and is then a good starting point for new comers or rare GPG users

Because I want to demonstrate and enforce the most straightforward way of using it, I will base this article on generating and storing everything on a Yubikey 4. You can find useful links at the end of the article pointing to reference on how to do it differently.

Tools installation

For this to work, we will need some tools on our local machine to setup our Yubikey correctly.

Gentoo users should install those packages:

emerge -av dev-libs/opensc sys-auth/ykpers app-crypt/ccid sys-apps/pcsc-tools app-crypt/gnupg

Gentoo users should also allow the pcscd service to be hotplugged (started automatically upon key insertion) by modifying their /etc/rc.conf and having:


Yubikey setup

The idea behind the Yubikey setup is to generate and store the GPG keys directly on our Yubikey and to secure them via a PIN code (and an admin PIN code).

  • default PIN code: 123456
  • default admin PIN code: 12345678

First, insert your Yubikey and let’s change its USB operating mode to OTP+U2F+CCID with MODE_FLAG_EJECT flag.

ykpersonalize -m86
Firmware version 4.3.4 Touch level 783 Program sequence 3

The USB mode will be set to: 0x86

Commit? (y/n) [n]: y

NOTE: if you have an older version of Yubikey (before Sept. 2014), use -m82 instead.

Then, we can generate a new GPG key on the Yubikey. Let’s open the smartcard for edition.

gpg --card-edit --expert

Reader ...........: Yubico Yubikey 4 OTP U2F CCID (0005435106) 00 00
Application ID ...: A7560001240102010006054351060000
Version ..........: 2.1
Manufacturer .....: Yubico
Serial number ....: 75435106
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa2048 rsa2048 rsa2048
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 0 3
Signature counter : 0
Signature key ....: [none]
Encryption key....: [none]
Authentication key: [none]
General key info..: [none]

Then switch to admin mode.

gpg/card> admin
Admin commands are allowed

We can start generating the Signature, Encryption and Authentication keys on the Yubikey. During the process, you will be prompted alternatively for the admin PIN and PIN.

gpg/card> generate 
Make off-card backup of encryption key? (Y/n) 

Please note that the factory settings of the PINs are
   PIN = '123456'     Admin PIN = '12345678'
You should change them using the command --change-pin

I advise you say Yes to the off-card backup of the encryption key.

Yubikey 4 users can choose a 4096 bits key, let’s go for it for every key type.

What keysize do you want for the Signature key? (2048) 4096
The card will now be re-configured to generate a key of 4096 bits
Note: There is no guarantee that the card supports the requested size.
      If the key generation does not succeed, please check the
      documentation of your card to see what sizes are allowed.
What keysize do you want for the Encryption key? (2048) 4096
The card will now be re-configured to generate a key of 4096 bits
What keysize do you want for the Authentication key? (2048) 4096
The card will now be re-configured to generate a key of 4096 bits

Then you’re asked for the expiration of your key. I choose 1 year but it’s up to you (leave 0 for no expiration).

Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 1y
Key expires at mer. 15 mai 2018 21:42:42 CEST
Is this correct? (y/N) y

Finally you give GnuPG details about your user ID and you will be prompted for a passphrase (make it strong).

GnuPG needs to construct a user ID to identify your key.

Real name: Ultrabug
Email address: ultrabug@nospam.com
You selected this USER-ID:
    "Ultrabug <ultrabug@nospam.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

If you chose to make an off-card backup of your key, you will also get notified of its location as well the revocation certificate.

gpg: Note: backup of card key saved to '/home/ultrabug/.gnupg/sk_8E407636C9C32C38.gpg'
gpg: key 22A73AED8E766F01 marked as ultimately trusted
gpg: revocation certificate stored as '/home/ultrabug/.gnupg/openpgp-revocs.d/A1580FD98C0486D94C1BE63B22A73AED8E766F01.rev'
public and secret key created and signed.

Make sure to store that backup in a secure and offline location.

You can verify that everything went good and take this chance to note the public key ID.

gpg/card> verify

Reader ...........: Yubico Yubikey 4 OTP U2F CCID (0001435106) 00 00
Application ID ...: A7560001240102010006054351060000
Version ..........: 2.1
Manufacturer .....: Yubico
Serial number ....: 75435106
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa4096 rsa4096 rsa4096
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 0 3
Signature counter : 4
Signature key ....: A158 0FD9 8C04 86D9 4C1B E63B 22A7 3AED 8E76 6F01
 created ....: 2017-05-16 20:43:17
Encryption key....: E1B6 7009 907D 1D94 B200 37D7 8E40 7636 C9C3 2C38
 created ....: 2017-05-16 20:43:17
Authentication key: AAED AB8E E055 41B2 EFFF 62A4 164F 873A 75D2 AD6B
 created ....: 2017-05-16 20:43:17
General key info..: pub rsa4096/22A73AED8E766F01 2017-05-16 Ultrabug <ultrabug@nospam.com>
sec> rsa4096/22A73AED8E766F01 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/164F873A75D2AD6B created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/8E407636C9C32C38 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106

You’ll find the public key ID on the “General key info” line (22A73AED8E766F01):

General key info..: pub rsa4096/22A73AED8E766F01 2017-05-16 Ultrabug <ultrabug@nospam.com>

Quit the card edition.

gpg/card> quit

It is then convenient to upload your public key to a key server, whether public or on your own web server (you can also keep it to be used and imported directly from an USB stick).

Export the public key:

gpg --armor --export 22A73AED8E766F01 > 22A73AED8E766F01.asc

Then upload it to your http server or a public server (needed if you want to be able to easily use the key on multiple machines):

# Upload it to your http server
scp 22A73AED8E766F01.asc user@server:public_html/static/22A73AED8E766F01.asc

# OR upload it to a public keyserver
gpg --keyserver hkps://hkps.pool.sks-keyservers.net --send-key 22A73AED8E766F01

Now we can finish up the Yubikey setup. Let’s edit the card again:

gpg --card-edit --expert

Reader ...........: Yubico Yubikey 4 OTP U2F CCID (0001435106) 00 00
Application ID ...: A7560001240102010006054351060000
Version ..........: 2.1
Manufacturer .....: Yubico
Serial number ....: 75435106
Name of cardholder: [not set]
Language prefs ...: [not set]
Sex ..............: unspecified
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa4096 rsa4096 rsa4096
Max. PIN lengths .: 127 127 127
PIN retry counter : 3 0 3
Signature counter : 4
Signature key ....: A158 0FD9 8C04 86D9 4C1B E63B 22A7 3AED 8E76 6F01
 created ....: 2017-05-16 20:43:17
Encryption key....: E1B6 7009 907D 1D94 B200 37D7 8E40 7636 C9C3 2C38
 created ....: 2017-05-16 20:43:17
Authentication key: AAED AB8E E055 41B2 EFFF 62A4 164F 873A 75D2 AD6B
 created ....: 2017-05-16 20:43:17
General key info..: pub rsa4096/22A73AED8E766F01 2017-05-16 Ultrabug <ultrabug@nospam.com>
sec> rsa4096/22A73AED8E766F01 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/164F873A75D2AD6B created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
ssb> rsa4096/8E407636C9C32C38 created: 2017-05-16 expires: 2018-05-16
 card-no: 0001 05435106
gpg/card> admin

Make sure that the Signature PIN is forced to request that your PIN is entered when your key is used. If it is listed as “not forced”, you can enforce it by entering the following command:

gpg/card> forcesig

It is also good practice to set a few more settings on your key.

gpg/card> login
Login data (account name): ultrabug

gpg/card> lang
Language preferences: en

gpg/card> name 
Cardholder's surname: Bug
Cardholder's given name: Ultra

Now we need to setup the PIN and admin PIN on the card.

gpg/card> passwd 
gpg: OpenPGP card no. A7560001240102010006054351060000 detected

1 - change PIN
2 - unblock PIN
3 - change Admin PIN
4 - set the Reset Code
Q - quit

Your selection? 1
PIN changed.

1 - change PIN
2 - unblock PIN
3 - change Admin PIN
4 - set the Reset Code
Q - quit

Your selection? 3
PIN changed.

1 - change PIN
2 - unblock PIN
3 - change Admin PIN
4 - set the Reset Code
Q - quit

Your selection? Q

If you uploaded your public key on your web server or a public server, configure it on the key:

gpg/card> url
URL to retrieve public key: http://ultrabug.fr/keyserver/22A73AED8E766F01.asc

gpg/card> quit

Now we can quit the gpg card edition, we’re done on the Yubikey side!

gpg/card> quit

SSH client setup

This is the setup on the machine(s) where you will be using the GPG key. The idea is to import your key from the card to your local keyring so you can use it on gpg-agent (and its ssh support).

You can skip the fetch/import part below if you generated the key on the same machine than you are using it. You should see it listed when executing gpg -k.

Plug-in your Yubikey and load the smartcard.

gpg --card-edit --expert

Then fetch the key from the URL to import it to your local keyring.

gpg/card> fetch

Then you’re done on this part, exit gpg and update/display& your card status.

gpg/card> quit

gpg --card-status

You can verify the presence of the key in your keyring:

gpg -K
sec>  rsa4096 2017-05-16 [SC] [expires: 2018-05-16]
      Card serial no. = 0001 05435106
uid           [ultimate] Ultrabug <ultrabug@nospam.com>
ssb>  rsa4096 2017-05-16 [A] [expires: 2018-05-16]
ssb>  rsa4096 2017-05-16 [E] [expires: 2018-05-16]

Note the “Card serial no.” showing that the key is actually stored on a smartcard.

Now we need to configure gpg-agent to enable ssh support, edit your ~/.gnupg/gpg-agent.conf configuration file and make sure that the enable-ssh-support is present:

default-cache-ttl 7200
max-cache-ttl 86400

Then you will need to update your ~/.bashrc file to automatically start gpg-agent and override ssh-agent’s environment variables. Add this at the end of your ~/.bashrc file (or equivalent).

# start gpg-agent if it's not running
# then override SSH authentication socket to use gpg-agent
pgrep -l gpg-agent &>/dev/null
if [[ "$?" != "0" ]]; then
 gpg-agent --daemon &>/dev/null
SSH_AUTH_SOCK=/run/user/$(id -u)/gnupg/S.gpg-agent.ssh

To simulate a clean slate, unplug your card then kill any running gpg-agent:

killall gpg-agent

Then plug back your card and source your ~/.bashrc file:

source ~/.bashrc

Your GPG key is now listed in you ssh identities!

ssh-add -l
4096 SHA256:a4vsJM6Sw1Rt8orvPnI8nvNUwHbRQ67ylnoTxruozK9 cardno:000105435106 (RSA)

You will now be able to get the SSH public key hash to copy to your remote servers using:

ssh-add -L
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCVDq24Ld/bOzc3yNnY6fF7FNfZnb6wRVdN2xMo1YiA5pz20y+2P1ynq0rb6l/wsSip0Owq4G6fzaJtT1pBUAkEJbuMvZBrbYokb2mZ78iFZyzAkCT+C9YQtvPClFxSnqSL38jBpunZuuiFfejM842dWdMNK3BTcjeTQdTbmY+VsVOw7ppepRh7BWslb52cVpg+bHhB/4H0BUUd/KHZ5sor0z6e1OkqIV8UTiLY2laCCL8iWepZcE6n7MH9KItqzX2I9HVIuLCzhIib35IRp6d3Whhw3hXlit4/irqkIw0g7s9jC8OwybEqXiaeQbmosLromY3X6H8++uLnk7eg9RtCwcWfDq0Qg2uNTEarVGgQ1HXFr8WsjWBIneC8iG09idnwZNbfV3ocY5+l1REZZDobo2BbhSZiS7hKRxzoSiwRvlWh9GuIv8RNCDss9yNFxNUiEIi7lyskSgQl3J8znDXHfNiOAA2X5kVH0s6AQx4hQP9Dl1X2Em4zOz+yJEPFnAvE+XvBys1yuUPq1c3WKMWzongZi8JNu51Yfj7Trm74hoFRn+CREUNpELD9JignxlvkoKAJpWVLdEu1bxJ7jh7kcMQfVEflLbfkEPLV4nZS4sC1FJR88DZwQvOudyS69wLrF3azC1Gc/fTgBiXVVQwuAXE7vozZk+K4hdrGq4u7Gw== cardno:000105435106

This is what ends up in ~/.ssh/authorized_keys on your servers.

When connecting to your remote server, you will be prompted for the PIN!


Using the GPG feature of your Yubikey is very convenient and versatile. Even if it is not that hard after all, it is interesting and fair to note that the PIV method is indeed more simple to implement.

When you need to maintain a large number of security keys in an organization and that their usage is limited to SSH, you will be inclined to stick with PIV if 2048 bits keys are acceptable for you.

However, for power users and developers, usage of GPG is definitely something you need to consider for its versatility and enhanced security.

Useful links

You may find those articles useful to setup your GPG key differently and avoid having the secret key tied to your Yubikey.

Hardening SSH authentication using Yubikey (2/2)

In my previous blog post, I demonstrated how to use a Yubikey to add a 2nd factor (2FA) authentication to SSH using pam_ssh and pam_yubico.

In this article, I will go further and demonstrate another method using Yubikey’s Personal Identity Verification (PIV) capability.

This one has the huge advantage to allow a 2nd factor authentication while using the public key authentication mechanism of OpenSSH and thus does not need any kind of setup on the servers.

Method 2 – SSH using Yubikey and PIV

Yubikey 4 and NEO also act as smartcards supporting the PIV standard which allows you to store a private key on your security key through PKCS#11. This is an amazing feature which is also very good for our use case.

Tools installation

For this to work, we will need some tools on our local machines to setup our Yubikey correctly.

Gentoo users should install those packages:

emerge dev-libs/opensc sys-auth/ykpers sys-auth/yubico-piv-tool sys-apps/pcsc-lite app-crypt/ccid sys-apps/pcsc-tools sys-auth/yubikey-personalization-gui

Gentoo users should also allow the pcscd service to be hotplugged (started automatically upon key insertion) by modifying their /etc/rc.conf and having:


Yubikey setup

The idea behind the Yubikey setup is to generate and store a private key in our Yubikey and to secure it via a PIN code.

First, insert your Yubikey and let’s change its USB operating mode to OTP+CCID.

ykpersonalize -m2
Firmware version 4.3.4 Touch level 783 Program sequence 3

The USB mode will be set to: 0x2

Commit? (y/n) [n]: y

Then, we will create a new management key:

key=`dd if=/dev/random bs=1 count=24 2>/dev/null | hexdump -v -e '/1 "%02X"'`
echo $key

Replace the default management key (if prompted, copy/paste the key printed above):

yubico-piv-tool -a set-mgm-key -n $key --key 010203040506070801020304050607080102030405060708

Then change the default PIN code and PUK code of your Yubikey

yubico-piv-tool -a change-pin -P 123456 -N <NEW PIN>

yubico-piv-tool -a change-puk -P 12345678 -N <NEW PUK>

Now that your Yubikey is secure, let’s proceed with the PCKS#11 certificate generation. You will be prompted for your management key that you generated before.

yubico-piv-tool -s 9a -a generate -o public.pem -k

Then create a self-signed certificate (only used for libpcks11) and import it in the Yubikey:

yubico-piv-tool -a verify-pin -a selfsign-certificate -s 9a -S "/CN=SSH key/" -i public.pem -o cert.pem
yubico-piv-tool -a import-certificate -s 9a -i cert.pem

Here you are! You can now export your public key to use with OpenSSH:

ssh-keygen -D opensc-pkcs11.so -e
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCWtqI37jwxYMJ9XLq9VwHgJlhZViPVAGIUfMm8SAlfs6cka4Cj570lkoGK04r8JAVJFy/iKfhGpL9N9XuartfIoq6Cg/6Qvg3REupuqs51V2cBaC/gnWIQ7qZqlzBulvcOvzNfHFD/lX42J58+E8tWnYg6GzIsImFZQVpmI6SxNfSmVQIqxIufInrbQaI+pKXntdTQC9wyNK5FAA8TXAdff5ZDnmetsOTVble9Ia5m6gqM7MnxNZ56uDpn+6lCxRZSW+Ln2PDE7sivVcST4qpfwY4P4Lrb3vrjCGODFg4xmGNKXsLi2+uZbs5rW7bg4HFO50kKDucPV1M+rBWA9999

Copy to your servers your SSH public key to your usual ~/.ssh/authorized_keys file in your $HOME.

Testing PIV secured SSH

Plug-in your Yubikey, and then SSH to your remote server using the opensc-pkcs11 library. You will be prompted for your PIN and then successfully logged in 🙂

ssh -I opensc-pkcs11.so cheetah
Enter PIN for 'PIV_II (PIV Card Holder pin)':

You can then configure SSH to use it by default for all your hosts in your ~/.ssh/config

PKCS11Provider /usr/lib/opensc-pkcs11.so

Using PIV with ssh-agent

You can also use ssh-agent to avoid typing your PIN every time.

When asked for the passphrase, enter your PIN:

ssh-add -s /usr/lib/opensc-pkcs11.so
Enter passphrase for PKCS#11: 
Card added: /usr/lib/opensc-pkcs11.so

You can verify that it worked by listing the available keys in your ssh agent:

ssh-add -L
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCWtqI37jwxYMJ9XLq9VwHgJlhZViPVAGIUfMm8SAlfs6cka4Cj570lkoGK04r8JAVJFy/iKfhGpL9N9XuartfIoq6Cg/6Qvg3REupuqs51V2cBaC/gnWIQ7qZqlzBulvcOvzNfHFD/lX42J58+E8tWnYg6GzIsImFZQVpmI6SxNfSmVQIqxIufInrbQaI+pKXntdTQC9wyNK5FAA8TXAdff5ZDnmetsOTVble9Ia5m6gqM7MnxNZ56uDpn+6lCxRZSW+Ln2PDE7sivVcST4qpfwY4P4Lrb3vrjCGODFg4xmGNKXsLi2+uZbs5rW7bg4HFO50kKDucPV1M+rBWA9999 /usr/lib64/opensc-pkcs11.so


Now you have a flexible yet robust way to authenticate your users which you can also extend by adding another type of authentication on your servers using PAM.

Hardening SSH authentication using Yubikey (1/2)

I recently worked a bit at how we could secure better our SSH connections to our servers at work.

So far we are using the OpenSSH public key only mechanism which means that there is no password set on the servers for our users. While this was satisfactory for a time we think that this still suffers some disadvantages such as:

  • we cannot enforce SSH private keys to have a passphrase on the user side
  • the security level of the whole system is based on the protection of the private key which means that it’s directly tied to the security level of the desktop of the users

This lead us to think about adding a 2nd factor authentication to SSH and about the usage of security keys.

Meet the Yubikey

Yubikeys are security keys made by Yubico. They can support multiple modes and work with the U2F open authentication standard which is why they got my attention.

I decided to try the Yubikey 4 because it can act as a smartcard while offering these interesting features:

  • Challenge-Response
  • OTP
  • GPG
  • PIV

Method 1 – SSH using pam_ssh + pam_yubico

The first method I found satisfactory was to combine pam_ssh authentication module along with pam_yubico as a 2nd factor. This allows server side passphrase enforcement on SSH and the usage of the security key to login.

TL;DR: two gotchas before we begin

ADVISE: keep a root SSH session to your servers while deploying/testing this so you can revert any change you make and avoid to lock yourself out of your servers.

Setup pam_ssh

Use pam_ssh on the servers to force usage of a passphrase on a private key. The idea behind pam_ssh is that the passphrase of your SSH key serves as your SSH password.

Generate your SSH key pair with a passphrase on your local machine.

ssh-keygen -f identity
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in identity.
Your public key has been saved in identity.pub.
The key fingerprint is:
SHA256:a2/HNCe28+bpMZ2dIf9bodnBwnmD7stO5sdBOV6teP8 alexys@yazd
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|                o|
|            . ++o|
|        S    BoOo|
|         .  B %+O|
|        o  + %+*=|
|       . .. @ .*+|
|         ....%B.E|

You then must copy your private key (named identity with no extension) to your servers under  the ~/.ssh/login-keys.d/ folder.

In your $HOME on the servers, you will get something like this:

├── known_hosts
└── login-keys.d
    └── identity

Then you can enable the pam_ssh authentication. Gentoo users should enable the pam_ssh USE flag for sys-auth/pambase and re-install.

Add this at the beginning of the file /etc/pam.d/ssh

auth    required    pam_ssh.so debug

The debug flag can be removed after you tested it correctly.

Disable public key authentication

Because it takes precedence over the PAM authentication mechanism, you have to disable OpenSSH PubkeyAuthentication authentication on /etc/ssh/sshd_config:

PubkeyAuthentication no

Enable PAM authentication on /etc/ssh/sshd_config

ChallengeResponseAuthentication no
PasswordAuthentication no
UsePAM yes

Test pam_ssh

Now you should be prompted for your SSH passphrase to login through SSH.

➜  ~ ssh cheetah
SSH passphrase:

Setup 2nd factor using pam_yubico

Now we will make use of our Yubikey security key to add a 2nd factor authentication to login through SSH on our servers.

Because the Yubikey is not physically plugged on the server, we cannot use an offline Challenge-Response mechanism, so we will have to use a third party to validate the challenge. Yubico gracefully provide an API for this and the pam_yubico module is meant to use it easily.

Preparing your account using your Yubikey (on your machine)

First of all, you need to get your Yubico API key ID from the following URL:

You will get a Client ID (this you will use) and Secret Key (this you will keep safe).

Then you will need to create an authorization mapping file which basically link your account to a Yubikey fingerprint (modhex). This is equivalent to saying “this Yubikey belongs to this user and can authenticate him”.

First, get your modhex:

Using this modhex, create your mapping file named authorized_yubikeys which will be copied to ~/.yubico/authorized_yubikeys on the servers (replace LOGIN_USERNAME with your actual account login name).


NOTE: this mapping file can be a centralized one (in /etc for example) to handle all the users from a server. See the the authfile option on the doc.

Setting up OpenSSH (on your servers)

You must install pam_yubico on the servers. For Gentoo, it’s as simple as:

emerge sys-auth/pam_yubico

Copy your authentication mapping file to your home under the .yubico folder on all servers. You should get this:

└── authorized_yubikey

Configure pam to use pam_yubico. Add this after the pam_ssh on the file /etc/pam.d/ssh which should look like this now:

auth    required    pam_ssh.so
auth    required    pam_yubico.so id=YOUR_API_ID debug debug_file=/var/log/auth.log

The debug and debug_file flags can be removed after you tested it correctly.

Testing pam_yubico

Now you should be prompted for your SSH passphrase and then for your Yubikey OTP to login through SSH.

➜  ~ ssh cheetah
SSH passphrase: 
YubiKey for `ultrabug':

About the Yubico API dependency

Careful readers will notice that using pam_yubico introduces a strong dependency on the Yubico API availability. If the API becomes unreachable or your internet connection goes down then your servers would be unable to authenticate you!

The solution I found to this problem is to instruct pam to ignore the Yubikey authentication when pam_yubico is unable to contact the API.

In this case, the module will return a AUTHINFO_UNAVAIL code to PAM which we can act upon using the following syntax. The /etc/pam.d/ssh first lines should be changed to this:

auth    required    pam_ssh.so
auth    [success=done authinfo_unavail=ignore new_authtok_reqd=done default=die]    pam_yubico.so id=YOUR_API_ID debug debug_file=/var/log/auth.log

Now you can be sure to be able to use your Yubikey even if the API is down or unreachable 😉