Support for UTF-8

Unicode Transformation Format-8 (UTF-8) is a character encoding that can represent every Unicode character, using one to four eight-bit bytes per character. It is the byte-oriented encoding form of Unicode. UTF-8 has been the predominant character encoding for web pages since 2009.

Here are some characteristics of UTF-8:

  • It can encode all 1,112,064 valid Unicode code points
  • It uses one to four eight-bit bytes
  • It accounts for nearly 90% of all web pages
  • It is backward compatible with ASCII
  • It is reversible: decoding the bytes yields the original characters
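
Several of these characteristics can be checked directly from Java. The following is a minimal sketch, not part of the platform API: the class name Utf8Demo and its helper methods are hypothetical, and only String.getBytes, the String(byte[], Charset) constructor, and StandardCharsets from the JDK are assumed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names here are for illustration only.
class Utf8Demo {

    // Number of bytes UTF-8 needs to encode the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes the text to UTF-8 bytes and decodes those bytes back to a String.
    static String roundTrip(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    // True when the UTF-8 bytes are identical to the US-ASCII bytes.
    static boolean sameAsAscii(String text) {
        return Arrays.equals(
                text.getBytes(StandardCharsets.UTF_8),
                text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",             // U+0041, encoded in 1 byte
            "\u00E9",        // U+00E9, encoded in 2 bytes
            "\u20AC",        // U+20AC, encoded in 3 bytes
            "\uD83D\uDE00"   // U+1F600 (a surrogate pair in Java), encoded in 4 bytes
        };
        for (String s : samples) {
            System.out.println(utf8Length(s) + " byte(s), round trip ok: "
                    + s.equals(roundTrip(s)));
        }
        System.out.println("ASCII compatible: " + sameAsAscii("Hello"));
    }
}
```

The round trip holds for well-formed text. A Java String containing an unpaired surrogate is not valid Unicode text and would not survive encoding unchanged.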

The pervasive use of UTF-8 underscores the importance of ensuring that the Java platform fully supports it. Java applications can specify properties files that are encoded in UTF-8, and the modern Java platform includes changes to the ResourceBundle API to support them.
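
One way to read UTF-8 encoded properties with an explicit charset is to pass a Reader to PropertyResourceBundle, a constructor that has been available since Java 6. The sketch below is illustrative only: the class name Utf8BundleDemo, the loadUtf8 helper, and the greeting key are hypothetical, and an in-memory byte array stands in for a properties file so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class; the names here are for illustration only.
class Utf8BundleDemo {

    // Builds a ResourceBundle from properties content, decoding the bytes as UTF-8.
    static ResourceBundle loadUtf8(byte[] utf8Bytes) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a UTF-8 encoded .properties file.
        byte[] fileContents =
                "greeting=Gr\u00FC\u00DF Gott\n".getBytes(StandardCharsets.UTF_8);

        ResourceBundle bundle = loadUtf8(fileContents);
        System.out.println(bundle.getString("greeting"));
    }
}
```

Because the charset is named explicitly on the InputStreamReader, this approach does not depend on which encoding a given Java release assumes when it loads properties files on its own.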

Let's take a look at the premodern Java (Java 8 and earlier) ResourceBundle class, followed by the changes made to this class in the modern Java platform.
