Configuring Internationalization

This chapter explains how to configure the MME and its components to support different languages.

Language and locale settings

The MME gets its default language, which defines the strings it uses for display messages, as follows:

  1. From the active language set in the languages table.
  2. If there is no active language set in the languages table, from the value set by the <Locale> element in the MME configuration file.
  3. If no <Locale> element is found in the configuration file, from the value of CONFID_DEF_LOCALE. The default setting for CONFID_DEF_LOCALE is en for English.
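The lookup order above can be sketched in C as follows. This is only an illustration of the precedence, not the MME's implementation: the function name resolve_default_locale(), its parameters, and the builtin_default variable are hypothetical, and the built-in value simply mirrors the documented default of en.

```c
#include <stddef.h>

/* Stands in for the value of CONFID_DEF_LOCALE; the documented default is "en". */
static const char builtin_default[] = "en";

/*
 * Hypothetical helper illustrating the documented precedence.
 * active_lang: language code of the active row in the languages table,
 *              or NULL if no row is active
 * conf_locale: value of the <Locale> element in the configuration file,
 *              or NULL if the element is absent
 */
const char *resolve_default_locale(const char *active_lang,
                                   const char *conf_locale)
{
    if (active_lang != NULL && active_lang[0] != '\0') {
        return active_lang;            /* 1. active language in the table */
    }
    if (conf_locale != NULL && conf_locale[0] != '\0') {
        return conf_locale;            /* 2. <Locale> element */
    }
    return builtin_default;            /* 3. built-in default */
}
```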

The locale code is a 5-character string that combines a language code and a region code: a 2-character ISO 639-1 language code, followed by a “_” character, followed by a 2-character ISO 3166-1 alpha-2 region code (for example, en_US). For a list of language codes, see http://www.loc.gov/standards/iso639-2/php/code_list.php.

At present the MME uses only the first two characters of this code. The default language is English: en.
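A client that wants to know which part of a locale code is significant could extract the language code as in the sketch below. The helper name locale_language() is hypothetical, and accepting a bare two-character code such as en is an assumption made for convenience.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the two-character language code from a locale
 * string such as "fr_FR" (or a bare "fr") into lang, which must hold 3 bytes.
 * Returns 0 on success, -1 if the string is not a plausible locale code.
 */
int locale_language(const char *locale, char lang[3])
{
    if (locale == NULL ||
        !islower((unsigned char)locale[0]) ||
        !islower((unsigned char)locale[1])) {
        return -1;
    }
    /* Accept "xx" alone or "xx_..."; anything else is rejected. */
    if (locale[2] != '\0' && locale[2] != '_') {
        return -1;
    }
    lang[0] = locale[0];
    lang[1] = locale[1];
    lang[2] = '\0';
    return 0;
}
```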


Note: For information about how to manage access to DVDs based on region codes, see Managing DVD access in the MME Developer's Guide chapter Playing and Managing Video and DVDs.

Setting the language preferences

To use a language other than English as the default language, you must populate the MME's languages table with the appropriate strings; see Adding languages below for more information. The MME will maintain the language you set across system shutdowns and restarts.

To change the language the MME will use for strings that indicate unknown media metadata locale and language, call the function mme_setlocale().

Setting the preferred playback language

Neither the configuration file language setting nor the language set with mme_setlocale() affects the preferred language for media playback.

Use mme_media_set_def_lang() to set the preferred language for media output. Typically, you should set this attribute soon after you connect to the MME, but you can do it at any time to change the preferences. To get the current preferred language, call mme_media_get_def_lang().

Adding languages

This section describes how to add supported languages to the MME.

Customizing display messages

You can customize the MME to display static message strings in languages other than English by populating the languages table with translated strings, and then setting the active field to 1 for that row. For example, to make Japanese the active language after its strings have been added to the languages table:

UPDATE languages SET active=1 WHERE language='Japanese';

Different languages use different sorting rules, even if they use the same alphabet. After a language change, your client application needs to rebuild the library table and its support tables using the sorting rules for the new language:

BEGIN TRANSACTION;  REINDEX ***; COMMIT;

Note: Rebuilding the library table takes time. Do not undertake this task unnecessarily.

Use the function mme_setlocale() to set the language for media with unknown language in the metadata, and the function mme_media_set_def_lang() to set the default language for media playback.

Setting language strings to NULL

You can set language strings to NULL. Language strings set to NULL can be coalesced into other values using the SQL engine (DMS or QDB), or they can be received as NULL and handled by the client application. If you set language strings to NULL, your client application can detect when a language string has not been set up and populate it if necessary. To set a language string to NULL, insert NULL as the value for that column in the languages table. For example:

-- English
INSERT INTO languages(
    language,
    lang_code,
    unknown,
    unknown_artist,
    unknown_album,
    unknown_genre,
    unknown_category,
    unknown_composer,
    synchronizing,
    unknown_language,
    unknown_conductor,
    unknown_soloist,
    unknown_ensemble,
    unknown_opus
    )
values(
    "English",
    "en",
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL
    );
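A client application that receives NULL language strings might substitute its own text before display. The following is a minimal sketch under that assumption: the helper name language_string_or() is hypothetical, and the code that queries the languages table is not shown.

```c
#include <stddef.h>

/*
 * Hypothetical client-side helper. db_value is a column value from the
 * languages table (for example unknown_artist) as returned by a query, or
 * NULL when the column is NULL. fallback is text supplied by the client.
 */
const char *language_string_or(const char *db_value, const char *fallback)
{
    return (db_value != NULL) ? db_value : fallback;
}
```

For example, language_string_or(row_value, "Unknown Artist") yields the client's own text whenever the table has no string set for that column.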

Customizing the sort order

Different languages use different sort-order conventions. The MME includes support for setting the language for sorting data retrieved by SQL SELECT statements and sorted by the ORDER BY keyword. To enable this feature, you need to:

  1. Install the library libqdb_cldr.so along with the files for language sorting included with your MME release.
  2. Configure the QDB configuration file qdb.cfg.
  3. Start QDB with the -s option specifying the default sort order language.

If you do not specify a collation sequence based on a language and location, data will be sorted by using the default sorting method: the SQLite BINARY collation sequence.

Install libqdb_cldr.so and the files for language sorting definition

Install the library libqdb_cldr.so with the other system libraries (any location in the $LD_LIBRARY_PATH) on your target. You can install the files for language sorting definition to /etc/cldr; the libqdb_cldr.so library will search here by default. For example, your files might be installed as follows: lib/libqdb_cldr.so and /etc/cldr/language_files.

To install the language sorting definition files in a different location, put the cldr directory (containing the definition files) at the location you choose on your system, alongside your other configuration and definition files, and set the $QDB_CLDR_PATH environment variable to that path: QDB_CLDR_PATH=/my_config_file/cldr.
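A startup or diagnostic utility could work out which directory is expected to hold the definition files with a check like the one below. This is a sketch of the documented behavior only, not the library's code: the function name cldr_definition_dir() is hypothetical, and treating an empty variable as unset is an assumption.

```c
#include <stdlib.h>

/*
 * Hypothetical helper mirroring the documented lookup: use $QDB_CLDR_PATH
 * when it is set and non-empty, otherwise the default /etc/cldr.
 */
const char *cldr_definition_dir(void)
{
    const char *path = getenv("QDB_CLDR_PATH");
    return (path != NULL && path[0] != '\0') ? path : "/etc/cldr";
}
```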

Configure qdb.cfg

Add the line Collation = cldr@libqdb_cldr.so to each section in the QDB configuration file qdb.cfg, telling the QDB to use the library libqdb_cldr.so for collation. For example:

[mme_library]
Filename		=	/fs/tmpfs/mme_library
Base Schema     =	/home/jmammone/mme/alpha125/run/db/mme_library.sql
Backup Dir		=	/home/jmammone/mme/alpha125/run/db/bks1
Backup Dir		=	/home/jmammone/mme/alpha125/run/db/bks2
Compression		=	bzip
Collation		=	cldr@libqdb_cldr.so

[mme_temp]
Filename		=	/fs/tmpfs/mme_temp.db
Schema File		=	/home/jmammone/mme/alpha125/run/db/mme_temp.sql
Backup Dir		=	/home/jmammone/mme/alpha125/run/db/bks1
Backup Dir		=	/home/jmammone/mme/alpha125/run/db/bks2
Collation		=	cldr@libqdb_cldr.so
...

Start QDB

To select the sort language, use the -s option when you start QDB. This option sets the language sort order in the variable cldr. For example, to set US English as the sort order:

# qdb -vv -R set -c ./qdb.cfg -s en_US@cldr

To set French (France) as the default language sort order, replace en_US@cldr with fr_FR@cldr, as follows:

# qdb -vv -R set -c ./qdb.cfg -s fr_FR@cldr

Caution: If your QDB configuration file is set up to use the new collation library libqdb_cldr.so, you must specify the -s xx_xx@cldr option when starting QDB. If you do not use the -s option, QDB may produce unexpected results and compromise the integrity of your system.
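Because a missing or malformed -s value is risky, a launcher might validate the locale before it starts QDB. The sketch below builds the option value from a five-character locale code. The helper name qdb_sort_option() is hypothetical, and the strict lowercase-language, uppercase-region check is an assumption based on the examples above.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the value for QDB's -s option ("xx_XX@cldr")
 * from a five-character locale code. Returns 0 on success, -1 if the locale
 * is malformed or the buffer is too small.
 */
int qdb_sort_option(const char *locale, char *buf, size_t size)
{
    if (locale == NULL || strlen(locale) != 5 ||
        !islower((unsigned char)locale[0]) ||
        !islower((unsigned char)locale[1]) ||
        locale[2] != '_' ||
        !isupper((unsigned char)locale[3]) ||
        !isupper((unsigned char)locale[4])) {
        return -1;
    }
    int n = snprintf(buf, size, "%s@cldr", locale);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```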

Using the specified sort order language

To use the language sort order specified in cldr, add COLLATE cldr after the ORDER BY clause in your SQL statement. For example:

SELECT artist_id,artist
    FROM library_artists ORDER BY artist COLLATE cldr;

Setting a default sort order language in the schema files

The MME offers a method for setting a default sort order language, to avoid having to add COLLATE cldr to SELECT statements already in use in a client application. You can add a line to the CREATE TABLE statement for each table in the MME schema files, then regenerate your tables. For example, to use the language set in cldr in the library_artists table:

CREATE TABLE library_artists (
    artist_id    INTEGER PRIMARY KEY,
    artist       TEXT UNIQUE COLLATE cldr
    );

Note: You can also change the sort order for keywords returned by SELECT statements by writing a collation function for the QDB. For more information, see Writing User-Defined Functions in the QDB Developer's Guide.

Creating an external DLL to provide character encoding routines

This section describes what you need to do to create an external DLL that provides character encoding routines for the MME and io-media.

About character encoding and conversion in the MME

All MME interfaces use UTF-8 character strings. In the majority of cases, the MME and io-media can use the information they have about media formats to correctly convert the character strings in the media they process into UTF-8 character strings.

The mojibake problem

The MME and io-media can usually convert character strings into UTF-8. However, depending on the media files your implementation must handle, some character encodings may produce mojibake (unintelligible character strings) for the following reasons:

If your application is to support file formats whose specifications inadequately define encodings for character strings in your environment, the encodings that the MME and io-media propose by default may not be correct, and a custom DLL may be required to determine the actual encoding of the character strings.

Creating a character encoding conversion DLL

You can create an external DLL for the MME and io-media to detect and convert character encodings.

This DLL should include the functions convert_setup() and convert_to_utf8().

See Character encoding conversion function prototypes below for more information about these functions.

Loading the DLL

The DLL can be used by the MME and by io-media.

Informing the MME about character conversion routines

After you have built the DLL, put it somewhere in your LD_LIBRARY_PATH search path, then use the <CharacterEncodingConverter> element under the <Database> element in the mme.conf file to tell the MME the name of the DLL. For example:

<CharacterEncodingConverter dll="convert_utf8.so"/>

If you do not set the LD_LIBRARY_PATH environment variable, you must specify the full path to your custom DLL in the <CharacterEncodingConverter> element.

Informing io-media about character conversion routines

If you use a custom routine to perform character conversions, you must specify the conversion DLL in the io-media configuration file, under the mmf module options. For example:

module-options {
    module = "mmf"
    audio_writer = "mmipc_writer"
    keepdlls = "used"
    dlldir = "$MM_INIT"
    utf8hook = "convert_utf8.so"
}

For more information about mmf module options, see mmf options in the MME Utilities Reference chapter on io-media.

How the MME and io-media will use the conversion DLL

If you create a custom character encoding DLL, and configure the MME and io-media to use it, the DLL is used as follows:

Character encoding conversion function prototypes

The MME defines two prototypes for functions you can write to detect and convert character encodings:

convert_setup()

int convert_setup( const char *default_encoding,
                   int allow_detection )

Arguments

default_encoding
A pointer to a string parameter. You must define the format of this string, which can include any information of interest to your character conversion DLL.
allow_detection
A flag that determines if the MME and the character conversion DLL are permitted to perform encoding detection. If it is set to 1 detection is permitted; if it is set to 0, detection is not permitted.

Description

The function convert_setup() provides an interface for passing a string from the MME to custom routines to help them determine character encodings. It should accept this string, and a setting determining if character encoding detection is permitted. The string can be, for example, the name of an encoding, such as “8859-4” or “shift-JIS”, or a numeric value formatted as an ASCII string, or anything else that the HMI can provide to help the DLL determine what it should do with character strings.

The MME function mme_charconvert_setup() is designed to call convert_setup(). For more information see mme_charconvert_setup() in the MME API Reference.


Caution: The function convert_setup() may be called at any time. The MME provides no locking mechanism, so the DLL must maintain the integrity of its own variables.

If it is successful, this function should return 0 (zero). If this function is not successful, it should return -1.
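The following is a minimal sketch of a convert_setup() implementation that protects its own state, as the caution above requires. It assumes POSIX threads are available on the target. The 32-byte limit on the string, the variable names, and the accessor convert_get_setup() are all hypothetical choices, not part of the MME interface.

```c
#include <pthread.h>
#include <string.h>

/* State shared with the conversion routine; guarded by state_lock. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static char current_encoding[32] = "";
static int detection_allowed = 0;

int convert_setup(const char *default_encoding, int allow_detection)
{
    if (default_encoding == NULL) {
        return -1;
    }
    size_t len = strlen(default_encoding);
    if (len >= sizeof current_encoding) {
        return -1;                 /* reject rather than truncate */
    }

    /* Update both fields under the lock so readers never see a mix. */
    pthread_mutex_lock(&state_lock);
    memcpy(current_encoding, default_encoding, len + 1);
    detection_allowed = (allow_detection != 0);
    pthread_mutex_unlock(&state_lock);
    return 0;
}

/*
 * Hypothetical accessor for use by the conversion routine: copies the
 * state out under the lock. encoding must hold at least 32 bytes.
 */
void convert_get_setup(char encoding[32], int *allow_detection)
{
    pthread_mutex_lock(&state_lock);
    memcpy(encoding, current_encoding, sizeof current_encoding);
    *allow_detection = detection_allowed;
    pthread_mutex_unlock(&state_lock);
}
```

Copying the state out under the lock lets the conversion routine work on a consistent snapshot even if convert_setup() is called again while a conversion is in progress.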

convert_to_utf8()

int convert_to_utf8( const char *in_str,
                     char *out_str,
                     uint16_t in_size,
                     uint16_t out_size,
                     char *in_str_encoding_hint )

Arguments

in_str
A pointer to the input string in any encoding.
out_str
A pointer to the output string buffer to hold the resulting UTF-8 string.
in_size
The size of the input buffer, in bytes. If this parameter is 0, the buffer size is not known, and the DLL should call strlen() to find the length of the input string.
out_size
The output buffer size, in bytes.
in_str_encoding_hint
A hint at the encoding scheme. The hint is a string that will be null-terminated when called by the MME or io-media. See Character formats below for a list of defined input formats. For other input formats the MME and io-media pass NULL in in_str_encoding_hint.

Description

The function convert_to_utf8() should perform character conversions.

If it is successful, this function should return 0 (zero). If this function does not perform the requested conversion, it should return -1, in which case the MME and io-media will attempt to perform their default conversion.

You may choose to use this behavior to allow the MME to perform its default conversion with, for example, UTF-16. Assuming that UTF-16 strings are valid Unicode, your function can return a -1 when asked to convert these strings, thus passing on responsibility for the conversion to the MME and io-media.
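As an illustration of this pattern, the sketch below converts only ISO 8859-1 input and returns -1 for everything else, leaving those strings to the default conversion. The macro SKETCH_FORMAT_ISO8859_1 and its literal value are placeholders assumed for this example. A real DLL should compare against the definitions in mm/charconvert.h instead.

```c
#include <stdint.h>
#include <string.h>

/*
 * Placeholder for the ISO 8859-1 format string from mm/charconvert.h.
 * The literal below is an assumption made for this sketch.
 */
#define SKETCH_FORMAT_ISO8859_1 "iso8859-1"

int convert_to_utf8(const char *in_str, char *out_str,
                    uint16_t in_size, uint16_t out_size,
                    char *in_str_encoding_hint)
{
    if (in_str == NULL || out_str == NULL || out_size == 0) {
        return -1;
    }
    /* Decline anything other than ISO 8859-1 so the default conversion runs. */
    if (in_str_encoding_hint == NULL ||
        strcmp(in_str_encoding_hint, SKETCH_FORMAT_ISO8859_1) != 0) {
        return -1;
    }

    /* An in_size of 0 means the length is unknown: measure the string. */
    size_t in_len = (in_size != 0) ? in_size : strlen(in_str);
    size_t out = 0;

    for (size_t i = 0; i < in_len && in_str[i] != '\0'; i++) {
        unsigned char c = (unsigned char)in_str[i];
        size_t need = (c < 0x80) ? 1 : 2;
        if (out + need >= out_size) {      /* keep room for the terminator */
            return -1;
        }
        if (c < 0x80) {
            out_str[out++] = (char)c;      /* ASCII maps to itself */
        } else {
            /* 0x80..0xFF become a two-byte UTF-8 sequence. */
            out_str[out++] = (char)(0xC0 | (c >> 6));
            out_str[out++] = (char)(0x80 | (c & 0x3F));
        }
    }
    out_str[out] = '\0';
    return 0;
}
```

Note that this sketch may leave partial output in out_str when it returns -1 because the output buffer is too small, so callers should ignore the buffer contents on failure.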

Character formats

For convenience, mm/charconvert.h includes definitions for the following character formats:

These character formats are literal strings, so the character encoding conversion DLL should use strcmp() (not “==”) to compare them. They can be used to initialize character arrays. For example:

static const char format_iso[] = CHAR_FORMAT_ISO8859_1;