Creating a Custom Locale

GNU/Linux ships with a large selection of locales to choose from. Sadly, none of them is a perfect fit for me. For example, I prefer American English as the locale language, but I use the metric system. Slightly more exotic, I prefer the YYYY-mm-dd date format, which, as far as I know, is not used by any default locale. To get the settings I wanted, I had to create and build my own locale.

The system locale configuration can be found in the directory /usr/share/i18n/. The locale files themselves are stored in the folder /usr/share/i18n/locales/. To create and use your own locale, first create the folder /usr/local/share/i18n/ and its subfolder /usr/local/share/i18n/locales. Then copy the locale you wish to alter into the newly created folder under a new name. I started from the nl_NL locale and named my new locale en_NL:

sudo mkdir -p /usr/local/share/i18n/locales
sudo cp /usr/share/i18n/locales/nl_NL /usr/local/share/i18n/locales/en_NL

Now modify the locale to your liking. Information about the fields can be found in the man page, man 5 locale. To make the files easier to edit, you may want to use the following simple script. Call it as ./convert.py en_NL en_NL-converted to replace the Unicode code points (written as <U followed by four hex digits and >) with the characters they stand for. While editing, wrap any text that should be turned back into code points between <S and >. Then use ./convert.py --encode en_NL-converted en_NL to convert those marked strings back to their code points. Instead of a file name, - may also be given to denote standard input or standard output.

#!/usr/bin/env python3
import argparse
import re
import sys

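def code_point_replace(match):
    """Turn a code point such as <U0041> into the character it denotes."""
    # Strip the leading '<U' and the trailing '>' to get the hex digits.
    char = match.group()[2:-1]
    return chr(int(char, 16))

def string_replace(match):
    """Turn a marked string such as <Sabc> into a run of <Uxxxx> code points."""
    # Strip the leading '<S' and the trailing '>' to get the text.
    string = match.group()[2:-1]
    # Upper-case hex digits match the style used in the locale files.
    return ''.join('<U' + format(ord(char), '04X') + '>' for char in string)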

parser = argparse.ArgumentParser(
        description='Convert Unicode code points in a locale file')
parser.add_argument('infile', nargs='?', type=argparse.FileType('r'),
        default=sys.stdin)
parser.add_argument('outfile', nargs='?', type=argparse.FileType('w'),
        default=sys.stdout)
parser.add_argument('--encode', '-e', help='encode strings between <S and >',
        action='store_true')
args = parser.parse_args()

for line in args.infile:
    if args.encode:
        args.outfile.write(re.sub(r'<S[^>]+>', string_replace, line))
    else:
        args.outfile.write(re.sub(r'<U[0-9a-fA-F]{4}>', code_point_replace, line))
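To see what the two substitutions do, here is a minimal sketch that applies the same regular expressions to a single line. The function names decode_line and encode_line are hypothetical, and the d_fmt lines are made-up samples rather than lines copied from a real locale file.

```python
import re

# Same patterns as in the script: a four-digit code point and a marked string.
CODE_POINT = re.compile(r'<U([0-9a-fA-F]{4})>')
MARKED_STRING = re.compile(r'<S([^>]+)>')

def decode_line(line):
    """Replace every <Uxxxx> code point with the character it denotes."""
    return CODE_POINT.sub(lambda m: chr(int(m.group(1), 16)), line)

def encode_line(line):
    """Replace every <S...> marked string with a run of <Uxxxx> code points."""
    def to_code_points(match):
        return ''.join('<U{:04X}>'.format(ord(c)) for c in match.group(1))
    return MARKED_STRING.sub(to_code_points, line)

if __name__ == '__main__':
    # Sample input line (an assumption, not taken from nl_NL).
    original = 'd_fmt "<U0025><U0064><U002D><U0025><U006D><U002D><U0025><U0059>"'
    print(decode_line(original))   # d_fmt "%d-%m-%Y"

    # After editing, the new value is wrapped in <S and > so it gets encoded.
    edited = 'd_fmt "<S%Y-%m-%d>"'
    print(encode_line(edited))
    # d_fmt "<U0025><U0059><U002D><U0025><U006D><U002D><U0025><U0064>"
```

Note that decoding does not add the <S markers by itself, so only the strings you wrap by hand are encoded again.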

When you are done editing, create the file /usr/local/share/i18n/SUPPORTED with a line describing your locale, like this:

en_NL.UTF-8 UTF-8
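If you would rather not edit the file by hand, a small helper can append the line only when it is missing, so running it twice does no harm. This is just a sketch: ensure_line is a hypothetical name, not part of any locale tooling, and the demo writes to a temporary directory instead of the real path.

```python
import tempfile
from pathlib import Path

def ensure_line(path, line):
    """Append `line` to `path` unless an identical line is already there.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(path)
    text = path.read_text(encoding='utf-8') if path.exists() else ''
    if line in (entry.strip() for entry in text.splitlines()):
        return False
    # Start on a fresh line if the file does not end with a newline.
    separator = '' if text == '' or text.endswith('\n') else '\n'
    with path.open('a', encoding='utf-8') as handle:
        handle.write(separator + line + '\n')
    return True

if __name__ == '__main__':
    # Demo against a scratch file; the real target would need root access.
    with tempfile.TemporaryDirectory() as scratch:
        target = Path(scratch) / 'SUPPORTED'
        print(ensure_line(target, 'en_NL.UTF-8 UTF-8'))  # True
        print(ensure_line(target, 'en_NL.UTF-8 UTF-8'))  # False
```

Pointing it at /usr/local/share/i18n/SUPPORTED would require running it as root.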

Add the same line to the file /etc/locale.gen. Next, generate the locales and check that the new one shows up in the list (it is listed as en_NL.utf8).

sudo locale-gen
locale -a
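Besides looking at the list, you can check that the new locale actually formats dates the way you intended. The sketch below uses Python's standard locale module. The function name format_date_in_locale is hypothetical, and it assumes the locale was generated under the name en_NL.UTF-8.

```python
import datetime
import locale

def format_date_in_locale(locale_name, day):
    """Format `day` with the locale's date format (d_fmt).

    Returns None if the locale is not installed on this system.
    """
    previous = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
    except locale.Error:
        return None
    try:
        return day.strftime('%x')  # %x is the locale's date representation
    finally:
        # Restore whatever was active before the check.
        locale.setlocale(locale.LC_TIME, previous)

if __name__ == '__main__':
    result = format_date_in_locale('en_NL.UTF-8', datetime.date.today())
    if result is None:
        print('en_NL.UTF-8 is not available')
    else:
        print(result)
```

If d_fmt was changed to the YYYY-mm-dd format, the printed date should appear in that form.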

Finally, configure your desktop environment to use the new locale. In my case, that is GNOME.

gsettings set org.gnome.system.locale region 'en_NL.utf8'