Convert special characters to ASCII in python

I came across a recurrent problem at work which was to convert special characters such as the French-Latin accentuated letter “é” to ASCII “e” (this is called transliteration).

I wanted to avoid having to use an external library such as Unidecode (which is great obviously) so I ended up wandering around the unicodedata built-in library. Before I had to get too deep in the matter I found this StackOverflow topic which gives an interesting method to do so and works fine for me.

def strip_accents(s):
    """
    Sanitarize the given unicode string and remove all special/localized
    characters from it.

    Category "Mn" stands for Nonspacing_Mark
    """
    try:
        return ''.join(
            c for c in unicodedata.normalize('NFD', s)
            if unicodedata.category(c) != 'Mn'
        )
    except:
        return s

PS : thanks to @Flameeyes for his good remark on wording

2 thoughts on “Convert special characters to ASCII in python

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>