Using facets

Internationalization rules are known as facets. A locale object is a container of facets. You can test whether a locale has a specific facet using the has_facet function; if it does, you can get a const reference to the facet by calling the use_facet function. There are six types of facets, implemented by seven categories of class, summarized in the following table. Every facet class is a subclass of the locale::facet nested class.

Facet type	Description
`codecvt`, `ctype`	Converts from one encoding scheme to another (`codecvt`); classifies characters and converts them to upper or lowercase (`ctype`)
`collate`	Controls the ordering and grouping of characters in a string, including comparing and hashing of strings
`messages`	Retrieves localized messages from a catalog
`money`	Converts numbers representing currency to and from strings
`num`	Converts numbers to and from strings
`time`	Converts times and dates in numeric form to and from strings
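As a minimal sketch of the has_facet and use_facet calls described before the table, the following helper looks up the ctype facet in a locale and uses it to convert a character to uppercase. The function name to_upper_in is hypothetical and is introduced here only for illustration.

```cpp
#include <locale>

// Hypothetical helper: uppercase one character using the ctype facet of loc.
char to_upper_in(const std::locale& loc, char c)
{
    // Check that the locale actually holds the facet before asking for it;
    // use_facet throws std::bad_cast if the facet is missing.
    if (!std::has_facet<std::ctype<char>>(loc))
    {
        return c;
    }
    const std::ctype<char>& fac = std::use_facet<std::ctype<char>>(loc);
    return fac.toupper(c);
}
```

Calling to_upper_in(std::locale::classic(), 'a') goes through the C locale's ctype facet; passing a different locale object applies that locale's rules instead.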

The facet classes are used to convert data to and from strings, so they all have a template parameter for the character type used. The money, num, and time facets are each represented by more than one class: a class with the _get suffix handles parsing strings, while a class with the _put suffix handles formatting as strings. For the money and num facets there is also a class with the punct suffix (moneypunct and numpunct) that contains the rules and symbols for punctuation.
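To illustrate how a punct class supplies the punctuation that the _put class uses, here is a minimal sketch that derives from numpunct<char> and replaces the decimal separator. The names comma_numpunct and format_with_comma are hypothetical; the sketch starts from the classic locale so that it does not depend on any named locale being installed.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: the same rules as the default numpunct<char>,
// except that the decimal separator is a comma.
class comma_numpunct : public std::numpunct<char>
{
protected:
    char do_decimal_point() const override { return ','; }
};

// Format a double through a stream whose locale contains the custom facet.
std::string format_with_comma(double value)
{
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it when it is
    // no longer referenced.
    out.imbue(std::locale(std::locale::classic(), new comma_numpunct));
    out << value;
    return out.str();
}
```

The num_put facet does the actual conversion and asks the numpunct facet in the same locale which separator to write, so format_with_comma(1.5) yields 1,5.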

Since the _get facets are used to convert sequences of characters into numeric types, the classes have a template parameter that you can use to indicate the input iterator type that the get methods will use to represent a range of characters. Similarly, the _put facet classes have a template parameter that you can use to provide the output iterator type that the put methods will write the converted string to. Default types are provided for both iterator types.
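The iterator parameters can be seen in a short sketch that calls num_get and num_put directly with their default iterator types (istreambuf_iterator and ostreambuf_iterator). The helper names parse_long and format_long are hypothetical, and string streams in the classic locale are assumed as the character source and sink.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse a long from text using the num_get facet.
long parse_long(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    const std::num_get<char>& fac = std::use_facet<std::num_get<char>>(in.getloc());

    std::istreambuf_iterator<char> first(in); // start of the character range
    std::istreambuf_iterator<char> last;      // default-constructed = end of range
    std::ios_base::iostate err = std::ios_base::goodbit;
    long value = 0;
    fac.get(first, last, in, err, value);     // err reports failbit on bad input
    return value;
}

// Hypothetical helper: format a long as text using the num_put facet.
std::string format_long(long value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    const std::num_put<char>& fac = std::use_facet<std::num_put<char>>(out.getloc());
    // Arguments: output iterator, stream for format flags, fill character, value
    fac.put(std::ostreambuf_iterator<char>(out), out, ' ', value);
    return out.str();
}
```

This is essentially what the stream extraction and insertion operators do internally for numeric types.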

The messages facet is used for compatibility with POSIX code. The class is intended to allow you to provide localized strings for your application. The idea is that the strings in your user interface are indexed and at runtime you access the localized string using the index through the messages facet. However, Windows applications typically use message resource files compiled using the Message Compiler. It is perhaps for this reason that the messages facet provided as part of the Standard Library does not do anything, but the infrastructure is there, and you can derive your own messages facet class.
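As a sketch of what deriving your own messages facet might look like, the following class overrides the three protected virtual methods (do_open, do_get, do_close) and serves strings from an in-memory table. The class name table_messages, the helper lookup_message, the catalog name "demo", and the table contents are all hypothetical; a real application would load the strings from its own resources.

```cpp
#include <locale>
#include <map>
#include <string>

// Hypothetical facet: looks up messages by numeric id in a fixed table.
class table_messages : public std::messages<char>
{
protected:
    // The catalog name is ignored here; any non-negative value means success.
    catalog do_open(const std::string&, const std::locale&) const override
    {
        return 1;
    }

    std::string do_get(catalog, int /*set*/, int msgid,
                       const std::string& dfault) const override
    {
        static const std::map<int, std::string> table = {
            {1, "Bonjour"}, {2, "Au revoir"}};
        auto it = table.find(msgid);
        return it == table.end() ? dfault : it->second;
    }

    void do_close(catalog) const override {}
};

// Hypothetical helper: fetch one message through the public facet interface.
std::string lookup_message(int id)
{
    std::locale loc(std::locale::classic(), new table_messages);
    const std::messages<char>& fac = std::use_facet<std::messages<char>>(loc);
    std::messages<char>::catalog cat = fac.open("demo", loc);
    std::string text = fac.get(cat, 0, id, "(missing)");
    fac.close(cat);
    return text;
}
```

Because the derived class inherits the id of messages<char>, installing it in a locale replaces the standard facet, and callers continue to use use_facet<messages<char>> unchanged.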

The has_facet and use_facet functions are templated for the specific type of facet that you want. All facet classes are subclasses of the locale::facet class, but through this template parameter the compiler will instantiate a function that returns the specific type you request. So, for example, if you want to format time and date strings for the French locale, you can call this code:

    locale loc("french"); 
    const time_put<char>& fac = use_facet<time_put<char>>(loc);

Here, the french string identifies the locale, and this is the language string used by the C Runtime Library setlocale function. The second line obtains the facet for converting numeric times into strings, and hence the function template parameter is time_put<char>. This class has a method called put that you can call to perform the conversion:

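    time_t t = time(nullptr);               // current time, seconds since the epoch
    tm *td = gmtime(&t);                    // broken-down UTC time (static storage)
    ostreambuf_iterator<char> it(cout);     // the facet writes through this iterator
    fac.put(it, cout, ' ', td, 'x', '#');   // 'x' is the date, '#' the long-form modifier
    cout << "\n";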

The time function (via <ctime>) returns an integer with the current time and date, and this is converted to a tm structure using the gmtime function. The tm structure contains individual members for the year, month, day, hours, minutes, and seconds. The gmtime function returns the address of a structure that is statically allocated in the function, so you do not have to delete the memory it occupies.

The facet will format the data in the tm structure as a string through the output iterator passed as the first parameter. In this case, the output stream iterator is constructed from the cout object and so the facet will write the formatted string to the console (the second parameter is not used, but because it is a reference you have to pass something, so the cout object is used there too). The third parameter is the fill character (again, this is not used), and the fourth is the pointer to the tm structure to format. The fifth and (optional) sixth parameters indicate the formatting that you require. These are the same formatting characters as used in the C Runtime Library function strftime, passed as two single characters rather than the format string used by the C function. In this example, x is used to get the date and # is used as a modifier to get the long version of the string.
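The same put method also has an overload that takes a strftime-style pattern as a pair of character pointers. The following sketch uses that overload with a fixed date and the classic locale so that the result does not depend on the current time or on an installed named locale. The helper name format_iso_date is hypothetical, and the portable %Y-%m-%d pattern is used here in place of the # modifier from the example above.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format a calendar date through the time_put facet.
std::string format_iso_date(int year, int month, int day)
{
    std::tm when = {};            // zero-initialize every member
    when.tm_year = year - 1900;   // tm_year counts years since 1900
    when.tm_mon = month - 1;      // tm_mon is zero-based
    when.tm_mday = day;

    std::ostringstream out;
    out.imbue(std::locale::classic());
    const std::time_put<char>& fac =
        std::use_facet<std::time_put<char>>(out.getloc());

    const std::string pattern = "%Y-%m-%d";
    // Arguments: output iterator, stream, fill character, tm pointer,
    // then the start and end of the pattern.
    fac.put(std::ostreambuf_iterator<char>(out), out, ' ', &when,
            pattern.data(), pattern.data() + pattern.size());
    return out.str();
}
```

Writing into an ostringstream rather than cout makes it straightforward to capture the formatted text in a string.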

The code will give the following output:

    samedi 28 janvier 2017

Notice that the words are not capitalized and there is no punctuation. Also notice the order: weekday name, day number, month, then year.

If the locale object constructor parameter is changed to german then the output will be:

    Samstag, 28. Januar 2017

The items are in the same order as in French, but the words are capitalized and punctuation is used. If you use turkish then the result is:

    28 Ocak 2017 Cumartesi

In this case, the day of the week is at the end of the string.

Two countries divided by a common language will give two different strings, and the following are the results for american and english-uk:

    Saturday, January 28, 2017
    28 January 2017

Time is used as the example here because there is no stream insertion operator for the tm structure, so it is an unusual case. For other types, there are insertion operators that put them into a stream, and so the stream can use a locale to internationalize how it shows the type. For example, you can insert a double into the cout object and the value will be printed to the console. The default locale, American English, uses the period to separate whole numbers from the fractional part, but in other cultures a comma is used.

The imbue method changes the locale used by a stream, and the new locale stays in effect until imbue is called again:

    cout.imbue(locale("american"));
    cout << 1.1 << "\n";             // prints 1.1
    cout.imbue(locale("french"));
    cout << 1.1 << "\n";             // prints 1,1
    cout.imbue(locale::classic());   // back to the C locale

Here, the stream object is localized to US English and then the floating-point number 1.1 is printed on the console. Next, the localization is changed to French, and this time the console will show 1,1. In French, the decimal point is the comma. The last line resets the stream object by passing the locale returned from the static classic method. This returns the so-called C locale, which is the default in C and C++ and is American English.
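The imbue method also returns the locale that was previously in effect, which gives a convenient way to restore it afterwards. The following sketch assumes a hypothetical grouped_numpunct facet that turns on three-digit grouping, so that the change is visible without relying on a named locale; the helper name print_grouped_then_plain is likewise hypothetical.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: group digits in threes, using the default ',' separator.
class grouped_numpunct : public std::numpunct<char>
{
protected:
    std::string do_grouping() const override { return "\3"; }
};

// Print the value with grouping, restore the old locale, then print it again.
std::string print_grouped_then_plain(long value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());

    // imbue returns the locale that the stream was using before the call.
    std::locale previous =
        out.imbue(std::locale(std::locale::classic(), new grouped_numpunct));
    out << value << ' ';

    out.imbue(previous);   // back to the original formatting rules
    out << value;
    return out.str();
}
```

For the value 1234567 the first insertion is written with thousands separators and the second without, because each insertion uses whichever locale the stream holds at that moment.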

The static method global can be used to set the locale that will be used as the default by each stream object. When an object is created from a stream class, it takes a copy of the current global locale as its default. Because the stream has its own copy, it is independent of any locale subsequently set by calling the global method. Note that the cin and cout stream objects are created before the main function is called, and these objects will use the default C locale until you imbue another locale. It is important to point out that, once a stream has been created, the global method has no effect on the stream, and imbue is the only way to change the locale used by the stream.
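The difference between streams created before and after a call to global can be sketched as follows. The comma_decimal facet and the global_locale_demo function are hypothetical; the function sets the global locale to the classic locale first so that the result is deterministic, and restores the original global locale before returning.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical facet: use a comma as the decimal separator.
class comma_decimal : public std::numpunct<char>
{
protected:
    char do_decimal_point() const override { return ','; }
};

// Returns the text written by a stream created before the global locale
// changed (first) and by a stream created after the change (second).
std::pair<std::string, std::string> global_locale_demo()
{
    // global returns the previous global locale, so it can be restored later.
    std::locale original = std::locale::global(std::locale::classic());

    std::ostringstream before;   // copies the classic locale
    std::locale::global(std::locale(std::locale::classic(), new comma_decimal));
    std::ostringstream after;    // copies the new global locale

    std::locale::global(original);

    before << 1.5;               // unaffected by the later call to global
    after << 1.5;                // keeps its copy even though global was reset
    return {before.str(), after.str()};
}
```

Only the second stream writes a comma, which shows that each stream captures the global locale at construction and is not affected by later calls to global.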

The global method will also call the C setlocale function to change the locale used by the C Runtime Library functions. This is important because some of the C++ functions (for example to_string, stod, as explained in the following text) will use the C Runtime Library functions to convert values. However, the C Runtime Library knows nothing about the C++ Standard Library, so calling the C setlocale function to change the default locale will not affect subsequently created stream objects.

It is worth pointing out that the basic_string class compares strings using the character traits class indicated by a template parameter. The string class uses the char_traits class and its version of the compare method does a straight comparison of the corresponding characters in the two strings. This comparison does not take into account cultural rules for comparing characters. If you want to do a comparison that uses cultural rules, you can do this through the collate facet:

    int compare( 
       const string& lhs, const string& rhs, const locale& loc) 
    { 
        const collate<char>& fac = use_facet<collate<char>>(loc); 
        return fac.compare( 
            &lhs[0], &lhs[0] + lhs.size(), &rhs[0], &rhs[0] + rhs.size()); 
    }
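A locale object can also be used directly as a comparison predicate: its function call operator compares two strings using the collate facet it contains. The following sketch sorts a vector this way; the helper name sorted_for is hypothetical, and the classic locale is assumed in the usage described below because named locales are platform dependent.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: return the words ordered by the collate facet of loc.
std::vector<std::string> sorted_for(std::vector<std::string> words,
                                    const std::locale& loc)
{
    // std::locale::operator() returns true when its first argument collates
    // before the second, so the locale itself can act as the comparator.
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

With the classic locale the ordering is by character code, so Apple sorts before banana and pear; a locale with cultural collation rules may order the same words differently.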
