Categories
Linux

State of the event log architecture enhancements

Interesting things are happening in the event log (syslog) community, more precisely around extending the syslog format and structuring syslog data.

As of today there’s no real standard on how to format and structure data in a syslog message. Every project has its own log message structure and syntax (qmail and postfix don’t log a mail delivery failure the same way, for example), so we rely on parsers to extract any given piece of data from a log message because the syslog software has no way to do it for us. I for one have coded a postfix log parser and, believe me, it’s not a pleasant thing to write and maintain!

The main idea behind structuring syslog messages is to represent them as JSON alongside the current free-form strings, so that backward compatibility is not broken. To achieve this, the format needs to be normalized and extended so that syslog daemons such as rsyslog and syslog-ng can understand it directly. That’s where CEE-enhanced messages and Lumberjack kick in.

CEE-enhanced messages

The CEE project aims at defining a syntax which extends the current log message format while staying compatible with the widely used log frameworks and with the well-known glibc syslog() call. To achieve this, the main idea is to put what is called a cookie in front of the JSON representation of the data we want to pass to the syslog software.

To keep it simple, let’s take this postfix log line, which means that a queued mail has been removed from the queue (I stripped the date and other fields to focus on the message part):

CAA3B607DA: removed

The equivalent CEE-enhanced message could be represented as follows (the exact fields would be up to postfix):

@cee: {"id":"CAA3B607DA", "removed":"true"}
  • @cee: is the cookie; it tells the syslog software that this message uses the CEE-enhanced syntax

I guess you already see how handy this would be and how we could then rely on the syslog software to automagically use our favorite storage backend to store this structured data (think mongoDB).
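To make the cookie idea concrete, here is a minimal Python sketch of what a consumer could do with such a message: detect the @cee: cookie and decode the JSON that follows, falling back to "legacy free-form string" otherwise. The function name parse_cee is mine, and this is only an illustration of the principle, not how rsyslog or syslog-ng actually implement it.

```python
import json

CEE_COOKIE = "@cee:"


def parse_cee(message):
    """Return the structured payload of a CEE-enhanced message, or None.

    A message without the cookie (or with invalid JSON after it) is
    treated as a legacy free-form string.
    """
    text = message.lstrip()
    if not text.startswith(CEE_COOKIE):
        return None
    try:
        payload = json.loads(text[len(CEE_COOKIE):])
    except ValueError:
        return None
    # Only a JSON object makes sense as a set of named fields.
    return payload if isinstance(payload, dict) else None


print(parse_cee('@cee: {"id":"CAA3B607DA", "removed":"true"}'))
# {'id': 'CAA3B607DA', 'removed': 'true'}
print(parse_cee("CAA3B607DA: removed"))
# None
```

The resulting dictionary is exactly the kind of document you could hand over to a storage backend without writing a per-application parser.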

More information is available in the short video presentation by Rainer Gerhards and in his article about it.

The Lumberjack project

So now, how do we format the JSON part? Could other types such as booleans and integers be interpreted directly by the syslog software? Well, this needs definitions and standardization proposals, and that’s what project Lumberjack is for.
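As an illustration of the typing question, here is a small Python sketch that emits a CEE-style message with a real boolean and a real integer instead of strings. The helper format_cee and the size field are made up for the example; whether a given syslog daemon keeps these types is precisely the kind of thing that needs to be standardized.

```python
import json


def format_cee(fields):
    """Serialize a dict as a CEE-style message: the cookie followed by JSON."""
    return "@cee: " + json.dumps(fields, sort_keys=True)


# "size" is a hypothetical field, added only to show an integer value.
message = format_cee({"id": "CAA3B607DA", "removed": True, "size": 2048})
print(message)
# @cee: {"id": "CAA3B607DA", "removed": true, "size": 2048}

# Decoding the JSON part gives the native types back.
decoded = json.loads(message[len("@cee:"):])
print(type(decoded["removed"]).__name__, type(decoded["size"]).__name__)
# bool int
```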

For the origins of Lumberjack, have a read of Rainer Gerhards’s blog.


Clustering : corosync v1.4.3 & pacemaker v1.1.7 released

I’ve finally taken the time to take care of the corosync and pacemaker ebuilds. The new versions are now available in portage.

Corosync 1.4.3 (10/04/2012)

This is one of the last supported releases of the old stable branch of the Corosync Cluster Engine. FYI, I’ve also bumped the new corosync-2.0.0 version but it needs more testing before I hard-unmask it.

Pacemaker 1.1.7 (28/03/2012)

This is a bug fix release of Pacemaker. See the changelog for details.

Special thanks to my fellow Gentoo Linux developer Kacper Kowalik (xarthisius) for his help on these bumps.


uWSGI : new ebuild in portage

I started to rework the uwsgi ebuild on March 7th because I was not satisfied with the one available in portage. The current version was out of date and the package itself was not really suited for production deployment.

Luckily my fellow Gentoo Linux developer Tiziano Müller (dev-zero) was working on the same thing for his own needs, so we teamed up to achieve this goal. Our main goals were:

  • Bring emperor mode support
  • Ease and clarify the overall configuration
  • Code a more versatile init script and conf.d file
  • Add better support for the available plugins and python versions
  • Support PHP

I’m glad to announce that our reworked ebuild is now available in portage for all users. We hope it will come in handy for everyone who needs it.

Thanks again Tiziano, it’s always a pleasure to work with you !


pymongo : v2.2 released

The mongoDB python driver pymongo was bumped to v2.2 and is now in portage.

Changelog highlights :

  • Support for Python 3
  • Support for Gevent
  • Improved connection pooling

See the complete changelog.

 


mongoDB : v2.0.5 released

This is a bug fix release of mongoDB. It is now live in portage as well.

+*mongodb-2.0.5 (11 May 2012)
+
+  11 May 2012; Ultrabug <ultrabug@gentoo.org> -mongodb-2.0.3.ebuild,
+  -files/mongodb-2.0.3-fix-scons.patch, +mongodb-2.0.5.ebuild:
+  Version bump, generic mms-agent URL, drop old.
+

Bug fix highlights:

  • Inconsistent query results on large data and result sets
  • Race during static destruction of CommitJob object

See the complete changelog.


uWSGI : network spooling of messages between applications

One of the great new uWSGI v1.1 features is network spooling of messages between applications. This short article demonstrates how to use it between a front end django app and a back end python app.

I advise you to use a uWSGI emperor and simply drop the provided ini files in its folder. The example is simple enough but here is an explanation of how it works.

  1. The sender is a django app which you call from your browser: this is the front end.
  2. The sender app uses the marshal module, which lets us pass a type-rich message (a dictionary) through a string-only spooling mechanism (yes, it’s very handy).
  3. The sender sends the message over the network as a type 17 message (spool request message).
  4. The receiver app is a standalone spooling application written in standard python: this is the back end.
  5. The receiver just prints out what it received via the network spooling mechanism.
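The marshal trick from step 2 can be sketched on its own, without uWSGI. The snippet below only shows the packing and unpacking around the spooler; the actual uWSGI calls that send and receive the message are deliberately left out, and the payload key name is a hypothetical choice of mine.

```python
import marshal

PAYLOAD_KEY = "payload"  # hypothetical key name, chosen for this sketch


def pack(message):
    """Sender side: flatten a type-rich dict into a single byte string."""
    return {PAYLOAD_KEY: marshal.dumps(message)}


def unpack(spooled):
    """Receiver side: rebuild the original dict from the spooled value."""
    return marshal.loads(spooled[PAYLOAD_KEY])


original = {"user_id": 42, "tags": ["a", "b"], "ratio": 0.5, "active": True}
spooled = pack(original)   # what would travel through the spooler
restored = unpack(spooled)

print(isinstance(spooled[PAYLOAD_KEY], bytes))  # True
print(restored == original)                     # True
```

Keep in mind that marshal is not meant to be safe against untrusted input and that its format can change between Python versions, so both ends should run the same interpreter and trust each other.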

As I said, this is just an illustration of what can be done. You could look into uwsgidecorators and mix this with other stuff that suits your needs. Enjoy !


nginx : conditional uWSGI error handling

To ensure the best possible quality of service we want to make sure that we catch our uWSGI application failures on the nginx side and react accordingly. Our goal is to never serve an HTTP 500 error to visitors. I’ll show you how you can adapt nginx error handling behavior based on the URI called by the visitor.

nginx + uWSGI base configuration (nginx.conf)

Suppose we have the following configuration to handle our uWSGI apps. We have set our gateway timeouts to 10 seconds to make sure no request takes longer than that to be answered, no matter what our application does.

upstream uwsgi_app1  {
	server 127.0.0.1:1000;
}

location / {
	uwsgi_pass uwsgi_app1;
	include uwsgi_params;
	uwsgi_ignore_client_abort on;
	uwsgi_connect_timeout 10;
	uwsgi_read_timeout 10;
	uwsgi_send_timeout 10;
}

Static uWSGI error handling

Now we don’t want nginx to reply to clients with errors such as 500 (our app crashed) and 504 (the timeout has been triggered). At first, we’ll serve a simple 1×1 GIF pixel instead.

location = /px.gif {
	empty_gif;
}

upstream uwsgi_app1 {
	server 127.0.0.1:1000;
}

location / {
	uwsgi_pass uwsgi_app1;
	include uwsgi_params;
	uwsgi_ignore_client_abort on;
	uwsgi_connect_timeout 10;
	uwsgi_read_timeout 10;
	uwsgi_send_timeout 10;

	uwsgi_intercept_errors on;
	error_page 500 504 /px.gif;
}

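The uwsgi_intercept_errors directive tells nginx to handle the error responses coming from uWSGI. Then we just rely on the usual nginx error handling through the error_page directive, which in our case points to /px.gif and returns our 1×1 GIF pixel using the empty_gif nginx module. Note that error_page keeps the original status code by default; if you also want to change the code sent to the client, use the =response form, for example error_page 500 504 =200 /px.gif;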

Dynamic uWSGI error handling

Let’s go conditional. Suppose we have two types of URLs:

  1. http://www.mysite.com/APP1?query=bar
  2. http://www.mysite.com/APP1?query=foo&redir=http://www.ultrabug.fr&word=bar

For URL #1, we want to serve the 1×1 pixel whereas for URL #2, when we receive the redir parameter, we want to redirect the visitor to exactly that URI.

Remember, this is standard error_page handling, so let’s use the named location feature to process the request.

location @uwsgi_errors {
	rewrite_log on;
	if ($arg_redir ~* (.+)) {
		set $redir $1;
		rewrite ^ $redir? redirect;
	}
	rewrite ^ /px.gif? redirect;
}

location / {
	uwsgi_pass uwsgi_app1;
	include uwsgi_params;
	uwsgi_ignore_client_abort on;
	uwsgi_connect_timeout 10;
	uwsgi_read_timeout 10;
	uwsgi_send_timeout 10;

	uwsgi_intercept_errors on;
	error_page 500 504 @uwsgi_errors;
}

Upon an HTTP 500/504 error, nginx internally redirects the request to the @uwsgi_errors location. Let’s detail its processing:

  • rewrite_log on : turns rewrite logging on for debugging / monitoring purposes
  • $arg_redir ~* (.+) : $arg_PARAMETER is a neat way to get the value of the given GET parameter (redir in our case). The condition means that if the parameter is present and not empty, we enter the block and capture its value in $1.
  • rewrite ^ $redir? redirect : we call the rewrite module with the redirect flag to send an HTTP 302 to the client, using the previously set $redir variable which contains the URI of the redir parameter. The important part here is the question mark after the $redir variable, which makes sure that the original URI parameters are not appended to the redirection URI.
  • rewrite ^ /px.gif? redirect : if no redir parameter was received, we redirect to px.gif as before. The question mark has the same meaning as above.
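If you want to reason about (or unit test) the expected behavior outside of nginx, the decision taken in @uwsgi_errors can be modeled in a few lines of Python. This is only a model of the logic described above, with names of my own; one difference to keep in mind is that parse_qs percent-decodes values whereas nginx’s $arg_redir holds the raw value.

```python
from urllib.parse import parse_qs, urlsplit

PIXEL_URI = "/px.gif"


def error_redirect_target(request_uri):
    """Return where a failed request would be redirected (model only)."""
    query = urlsplit(request_uri).query
    # parse_qs drops empty values by default, which mirrors the (.+) test.
    redir = parse_qs(query).get("redir")
    if redir:
        return redir[0]
    return PIXEL_URI


print(error_redirect_target("/APP1?query=bar"))
# /px.gif
print(error_redirect_target("/APP1?query=foo&redir=http://www.ultrabug.fr&word=bar"))
# http://www.ultrabug.fr
```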

That’s it, we managed to handle our uWSGI errors based on certain conditions. Of course we could go further and use more named locations for different types of HTTP errors and use more nginx variables and conditions but that’s up to you now !


Portage internals

Now that we know what Portage is, let’s look at how it works. What happens when we install a package, and what does a package actually look like on Gentoo?

Ebuilds

The packages available in the portage tree are represented by files called ebuilds. An ebuild contains all the information portage needs to handle the package in question (where to download the sources, which license covers the software, the URL of the project, etc.).

For any action on a package, portage relies on the information found in the corresponding ebuilds. I say ebuilds, plural, because an ebuild also carries the version of the package it represents. There are therefore as many ebuild files as there are available versions of a package. Take the www-client/firefox package as an example:

$ ls /usr/portage/www-client/firefox

firefox-3.6.20.ebuild
firefox-3.6.22.ebuild
firefox-8.0.ebuild
firefox-9.0.ebuild

Versions 3.6.20, 3.6.22, 8.0 and 9.0 are therefore available in portage. If we wanted more information about one of these firefox versions, or wanted to install it, portage would only have to execute the instructions contained in the corresponding ebuild file, and voilà!
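The naming convention is regular enough to be split by a small script. Here is a Python sketch that extracts the package name and version from the file names listed above. The regular expression is a simplification of mine (it takes the version to start at the last dash followed by a digit, with an optional -rN revision) and does not claim to cover every version form that portage itself accepts.

```python
import re

# Simplified pattern: <name>-<version>[-r<revision>].ebuild
EBUILD_RE = re.compile(r"^(?P<name>.+)-(?P<version>\d[^-]*(?:-r\d+)?)\.ebuild$")


def parse_ebuild_filename(filename):
    """Split an ebuild file name into (package name, version)."""
    match = EBUILD_RE.match(filename)
    if match is None:
        raise ValueError("not an ebuild file name: %r" % filename)
    return match.group("name"), match.group("version")


listing = [
    "firefox-3.6.20.ebuild",
    "firefox-3.6.22.ebuild",
    "firefox-8.0.ebuild",
    "firefox-9.0.ebuild",
]
versions = [parse_ebuild_filename(name)[1] for name in listing]
print(versions)
# ['3.6.20', '3.6.22', '8.0', '9.0']
```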

When Mozilla releases firefox 10, a Gentoo developer or contributor will have to create the ebuild for that version so that it becomes available in portage. This is why it is crucial to keep the list of ebuilds on your system up to date.

Syncing portage

Updating portage therefore means updating the list of ebuilds available on your system!

# emerge --sync

The famous sync downloads the new ebuilds and removes the obsolete ones for us. This is how we will get the new firefox when it is released, and the same of course goes for every other package.

Gentoo developers and contributors jointly maintain a common portage tree, which is downloaded and replicated by servers called mirrors (the term mirror means that they hold an exact copy of the development tree). All users in turn replicate it to their local portage tree (by default in /usr/portage/) by connecting to one of these mirror servers during the sync.

As I write these lines, the portage tree contains 15459 packages representing 29931 ebuilds!


Portage basics

All Linux distributions come with what is called a package management system, or more simply a package manager.

  • A package manager is a set of programs and tools that automates installing, updating, configuring and uninstalling software on a system.

On Gentoo Linux, the package manager is called portage. It lets us manipulate the packages available on our Gentoo system.

  • A package represents a piece of software available through the package manager. Depending on the distribution it can take different forms, such as a compressed archive.

Portage

Portage is written in python and bash. It is undoubtedly one of the most flexible and powerful package managers, because it allows very fine-grained customization of the packages we want to install on our system.

The list of packages available for installation is organized in a directory tree, which is what we call the portage tree. The name refers to this tree of packages organized by category. It is stored by default in the /usr/portage/ directory, of which here is an excerpt:

/usr/portage/www-apache
/usr/portage/www-apps
/usr/portage/www-client
/usr/portage/www-misc

In each category there is one directory per available package:

/usr/portage/www-client/chromium
/usr/portage/www-client/firefox
/usr/portage/www-client/opera

We can see that firefox and opera belong to the www-client category, so their full package names on Gentoo are:

  • www-client/firefox
  • www-client/opera

Of course, we can also refer to them by their short name, but it is important to note that two packages can have the same name if they belong to different categories. It is therefore better to always refer to them by their full name.
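To see why the short name can be ambiguous, here is a tiny Python sketch that indexes full package names by short name and flags the ones found in more than one category. The example-category/opera entry is invented for the demonstration, and the helper names are mine.

```python
from collections import defaultdict


def categories_by_name(full_names):
    """Map each short package name to the set of categories providing it."""
    index = defaultdict(set)
    for full_name in full_names:
        category, name = full_name.split("/", 1)
        index[name].add(category)
    return index


def is_ambiguous(name, index):
    """A short name is ambiguous when several categories contain it."""
    return len(index.get(name, ())) > 1


packages = [
    "www-client/firefox",
    "www-client/opera",
    "example-category/opera",  # invented entry, for illustration only
]
index = categories_by_name(packages)
print(is_ambiguous("firefox", index))  # False
print(is_ambiguous("opera", index))    # True
```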

Commands

All of the following commands are part of portage and are used to operate it. The best known is without a doubt emerge.

/usr/bin/ebuild
/usr/bin/egencache
/usr/bin/emerge
/usr/bin/portageq
/usr/bin/quickpkg
/usr/bin/repoman

/usr/sbin/archive-conf
/usr/sbin/dispatch-conf
/usr/sbin/emaint
/usr/sbin/emerge-webrsync
/usr/sbin/env-update
/usr/sbin/etc-update
/usr/sbin/fixpackages
/usr/sbin/regenworld
/usr/sbin/update-env
/usr/sbin/update-etc

In a future post, I will cover how to use and configure the main portage commands.