National language support
AIX Version 7.1 continues to extend the number of nations and regions supported by its national language support. In this chapter, details about the following features and facilities are provided:
10.1 Unicode 5.2 support
As part of the ongoing effort to adhere to the most recent industry standards, AIX V7.1 enhances the existing Unicode locales to bring them into compliance with the latest version of the Unicode standard, Version 5.2, as published by the Unicode Consortium.
Unicode is a standard character coding system for supporting the worldwide interchange, processing, and display of written text in the diverse languages used throughout the world. AIX V6.1 has supported Unicode 5.0, which defines standardized character positions for over 99,000 characters in total, since November 2007. More than 8,000 additional code points have been defined in Unicode 5.1 (1,624 code points, April 2008) and Unicode 5.2 (6,648 code points, October 2009). AIX V7.1 provides the necessary infrastructure to handle, store, and transfer all Unicode 5.2 characters.
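The figures above can be checked with a few lines of Python. This is only an illustrative sketch and does not use any AIX interface: it adds up the code point counts quoted in the text and round-trips one character through UTF-8. The sample character U+13000 comes from the Egyptian Hieroglyphs block, which was added in Unicode 5.2; choosing it as the example is an assumption made here, not something taken from the AIX documentation.

```python
import unicodedata

# Code point counts quoted in the text.
added_in_5_1 = 1624
added_in_5_2 = 6648
total_added = added_in_5_1 + added_in_5_2  # the "more than 8,000" figure

# U+13000 lies in the Egyptian Hieroglyphs block, added in Unicode 5.2.
# It is outside the Basic Multilingual Plane, so UTF-8 needs four bytes.
sample = "\U00013000"
encoded = sample.encode("utf-8")
decoded = encoded.decode("utf-8")

print(total_added)
print(unicodedata.name(sample), encoded.hex())
```

A system that can handle, store, and transfer such a character has to preserve all four bytes unchanged, which is what the round trip through `encode` and `decode` verifies.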
For in-depth information about Unicode 5.2, visit the official Unicode home page at:
10.2 Code set alias name support for iconv converters
National Language Support (NLS) provides a base for internationalization in which data often must be converted from one code set to another. AIX provides several standard converters for this purpose, and every AIX system offers the following conversion interfaces:
iconv command: Allows you to request a specific conversion by naming the FromCode and ToCode code sets.
libiconv functions: Allow applications to request converters by name.
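The idea behind both interfaces, naming a source code set and a target code set and letting a converter do the rest, can be sketched in Python. This is not the AIX iconv command or the libiconv API; Python's built-in codecs stand in for the AIX converters, and the function name `convert` and the codec name `iso8859_13` are choices made for this sketch.

```python
def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Convert data from the from_code code set to the to_code code set."""
    # Decode to abstract characters first, then re-encode in the target set.
    text = data.decode(from_code)
    return text.encode(to_code)

# "\u00f5un" is the Estonian word for "apple"; \u00f5 is o with tilde.
source_bytes = "\u00f5un".encode("iso8859_13")
utf8_bytes = convert(source_bytes, "iso8859_13", "utf-8")

# Print hex so the output does not depend on the terminal's code set.
print(source_bytes.hex(), "->", utf8_bytes.hex())
```

The single-byte ISO-8859-13 character becomes a two-byte sequence in UTF-8, which is why a converter is needed rather than a plain byte copy.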
AIX can transfer, store, and convert data in more than 130 different code sets. To meet market requirements and standards, vendors, organizations, and standards groups have increased the number of code sets dramatically in the past decade. However, many code sets are maintained and named in different ways, which raises the issue of code set alias names: a code set with a specific encoding scheme can have two or more different names on different platforms or in different applications.
For instance, ISO-8859-13 is an Internet Assigned Numbers Authority (IANA) registered code set for Estonian, a Baltic Latin language. The code set ISO-8859-13 is also known as IBM-921, CP921, ISO-IR-179, windows-28603, LATIN7, L7, 921, 8859_13, and 28603 on different platforms. For interoperability reasons, it is desirable to provide an alias name mapping function in the AIX /usr/lib/libiconv.a library that unambiguously identifies code sets to the AIX converters.
AIX 7 introduces an AIX code set mapping mechanism in libiconv.a that holds more than 1300 code set alias names based on code sets and alias names of different vendors, applications, and open source groups. Major contributions are based on code sets related to the International Components for Unicode (ICU), Java, Linux, WebSphere®, and many others.
For example, using the new alias name mapping function, iconv can now map ISO-8859-13, CP921, ISO-IR-179, windows-28603, LATIN7, L7, 921, 8859_13, or 28603 to IBM-921 (the AIX default) and convert the data properly. The code set alias name support for iconv converters is entirely transparent to the system, and no initialization or configuration is required on the part of the system administrator.
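The mapping described above can be pictured as a lookup table from alias names to one default name. The following Python sketch is a hypothetical illustration, not the actual table or code in libiconv.a: the names `ALIAS_MAP`, `resolve_code_set`, and `PYTHON_CODECS` are invented here, case-insensitive matching is an assumption, and Python's `iso8859_13` codec stands in for the AIX IBM-921 converter on the basis of the equivalence stated in the text.

```python
# Alias names listed in the text for the AIX default name IBM-921.
AIX_DEFAULT = "IBM-921"
ALIASES = ["ISO-8859-13", "CP921", "ISO-IR-179", "windows-28603",
           "LATIN7", "L7", "921", "8859_13", "28603"]

# Assumption: lookups ignore case, so keys are stored in lower case.
ALIAS_MAP = {alias.lower(): AIX_DEFAULT for alias in ALIASES}
ALIAS_MAP[AIX_DEFAULT.lower()] = AIX_DEFAULT

# Assumption: Python's iso8859_13 codec stands in for the AIX converter.
PYTHON_CODECS = {AIX_DEFAULT: "iso8859_13"}

def resolve_code_set(name: str) -> str:
    """Return the default name for a known alias, else the name unchanged."""
    return ALIAS_MAP.get(name.strip().lower(), name)

def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Resolve both names through the alias table, then convert."""
    src = resolve_code_set(from_code)
    dst = resolve_code_set(to_code)
    text = data.decode(PYTHON_CODECS.get(src, src))
    return text.encode(PYTHON_CODECS.get(dst, dst))

for alias in ("LATIN7", "28603", "UTF-8"):
    print(alias, "->", resolve_code_set(alias))
```

Because every alias resolves to the same default name before a converter is selected, a caller can pass `L7`, `CP921`, or `ISO-8859-13` interchangeably and obtain identical output.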
10.3 NEC selected characters support in IBM-eucJP
There are 83 Japanese characters known as NEC selected characters. The term refers to a proprietary encoding of Japanese characters historically established by the Nippon Electric Company (NEC). Previous AIX releases have supported NEC selected characters through the IBM-943 and UTF-8 code sets.
For improved interoperability and configuration flexibility, AIX V7.1 and the related AIX V6.1 TL 6100-06 release extend the NEC selected characters support to the IBM-eucJP code set used for the AIX ja_JP locale.
The corresponding AIX Japanese input method and the dictionary utilities were enhanced to accept NEC selected characters in the ja_JP locale, and all IBM-eucJP code set related iconv converters were updated to handle the newly added characters.
Table 10-1 shows the locale (language_territory designation) and code set combinations that now support NEC selected characters.
Table 10-1 Locales and code sets supporting NEC selected characters
Locale    Locale code set    Full locale name    Category
JA_JP     UTF-8              JA_JP.UTF-8         Unicode
ja_JP     IBM-eucJP          ja_JP.IBM-eucJP     Extended UNIX Code (EUC)
Ja_JP     IBM-943            Ja_JP.IBM-943       PC
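The full locale names in Table 10-1 combine a language_territory designation with a code set name, separated by a period. As a small illustration, the following Python sketch splits such a name into its parts. The function `parse_locale` and the `LocaleName` type are hypothetical helpers written for this example and are not part of any AIX interface.

```python
from typing import NamedTuple

class LocaleName(NamedTuple):
    language: str
    territory: str
    code_set: str

def parse_locale(full_name: str) -> LocaleName:
    """Split a name such as 'ja_JP.IBM-eucJP' into its three parts."""
    designation, _, code_set = full_name.partition(".")
    language, _, territory = designation.partition("_")
    return LocaleName(language, territory, code_set)

# The three full locale names from Table 10-1.
for name in ("JA_JP.UTF-8", "ja_JP.IBM-eucJP", "Ja_JP.IBM-943"):
    print(parse_locale(name))
```

Note that the three Japanese locales in the table differ only in the letter case of the language part, so any code that handles these names must compare them case-sensitively.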
Requirements and specifications for Japanese character sets can be found at the official website of the Japanese Industrial Standards Committee: