If you don’t understand notational systems and character mapping, you’ll be baffled when dealing with network configurations, color settings, drive sizing, and most other IT topics. This chapter covers CompTIA IT Fundamentals+ Objective 1.1: Compare and contrast notational systems: binary, hexadecimal, and decimal notational systems, and data representation including ASCII and Unicode character mapping.
Binary notation is the fundamental building block of all computer operations and data storage. Binary uses only two digits: 0 (off) and 1 (on). Binary notation is also known as base 2 notation. Each character in a binary number equals a bit (binary digit). To count in binary, use the example shown in Figure 2-1.
If you examine the binary equivalents for decimal 1, 2, 4, and 8, you will note that with each doubling of a value, an additional binary digit is used. This pattern continues with the binary equivalents for decimal 16, 32, 64, and so on.
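The doubling pattern can be checked with a short Python sketch. The function name `to_binary` is an illustrative choice rather than anything defined in this chapter; it converts a number by repeated division by 2, which is one common manual method.

```python
def to_binary(value: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(str(value % 2))  # the remainder is the next bit, least significant first
        value //= 2
    return "".join(reversed(digits))


# Each doubling of the decimal value adds one binary digit.
for n in (1, 2, 4, 8, 16, 32, 64):
    bits = to_binary(n)
    print(n, bits, len(bits))
```

Python's built-in `bin()` and `format(value, "b")` produce the same digits; the loop is written out here only to make the steps visible.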
As a shortcut to writing out pure binary values, which can become very long and hard to read, you can express values as powers of two written with exponents (for example, 2^10) instead. Table 2-1 lists decimal values from 2 to 1024, their binary equivalents, the equivalent power of two, and the same power written out as repeated multiplication.
| Decimal | Binary | Power of Two (Exponent) | Power of Two (Multiplication) |
|---|---|---|---|
| 2 | 10 | 2^1 | 2 |
| 4 | 100 | 2^2 | 2×2 |
| 8 | 1000 | 2^3 | 2×2×2 |
| 16 | 10000 | 2^4 | 2×2×2×2 |
| 32 | 100000 | 2^5 | 2×2×2×2×2 |
| 64 | 1000000 | 2^6 | 2×2×2×2×2×2 |
| 128 | 10000000 | 2^7 | 2×2×2×2×2×2×2 |
| 256 | 100000000 | 2^8 | 2×2×2×2×2×2×2×2 |
| 512 | 1000000000 | 2^9 | 2×2×2×2×2×2×2×2×2 |
| 1024 | 10000000000 | 2^10 | 2×2×2×2×2×2×2×2×2×2 |
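As a rough sketch, the rows of Table 2-1 can be generated rather than memorized. The helper name `power_of_two_row` is hypothetical and exists only for this illustration.

```python
def power_of_two_row(exponent: int) -> tuple:
    """Return (decimal, binary, exponent form, multiplication form) for 2 raised to exponent."""
    decimal = 2 ** exponent
    binary = format(decimal, "b")             # binary digits without the "0b" prefix
    exponent_form = f"2^{exponent}"
    multiplication = "×".join(["2"] * exponent)
    return decimal, binary, exponent_form, multiplication


# Print one row per exponent from 1 through 10.
for e in range(1, 11):
    print(*power_of_two_row(e), sep=" | ")
```

Note that every power of two is a 1 followed by as many zeros as the exponent, which is why the binary column grows by one digit per row.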
Note
Decimal values based on powers of two are used to describe the size of memory and storage devices. See Chapter 6, “Common Units of Measure: Storage, Throughput, and Speed,” for more information.
Hexadecimal (hex) notation, also known as base 16 notation, uses the following digits: 0–9 (equivalent to values 0–9 in decimal notation) and a–f or A–F (equivalent to values 10–15 in decimal notation), for a total of 16 digits. Figure 2-2 shows a representation of this.
Each single hex digit is equivalent to four bits (a nibble) in binary notation. Thus, a very long binary value can be represented by a much shorter hex value. Table 2-2 compares decimal, binary, and hex values.
| Decimal | Binary | Hexadecimal |
|---|---|---|
| 2 | 10 | 2 |
| 3 | 11 | 3 |
| 5 | 101 | 5 |
| 9 | 1001 | 9 |
| 15 | 1111 | F |
| 16 | 10000 | 10 |
| 32 | 100000 | 20 |
| 33 | 100001 | 21 |
| 64 | 1000000 | 40 |
| 65 | 1000001 | 41 |
| 128 | 10000000 | 80 |
| 129 | 10000001 | 81 |
| 255 | 11111111 | FF |
| 256 | 100000000 | 100 |
| 511 | 111111111 | 1FF |
| 1024 | 10000000000 | 400 |
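The relationship between hex digits and four-bit nibbles can be sketched in Python using only built-in formatting. The function names `describe` and `hex_to_nibbles` are assumptions made for this example.

```python
def describe(value: int) -> tuple:
    """Return the binary and uppercase hex representations of a decimal value."""
    return format(value, "b"), format(value, "X")


def hex_to_nibbles(hex_text: str) -> str:
    """Expand each hex digit into its four-bit binary equivalent."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_text)


for decimal in (15, 16, 255, 256, 511, 1024):
    binary, hexadecimal = describe(decimal)
    print(decimal, binary, hexadecimal)

print(hex_to_nibbles("1FF"))
```

Going the other way, `int("FF", 16)` and `int("11111111", 2)` both convert back to the same decimal value.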
There are many places in computer programming, applications, and networking in which hex values are used. Here are just a few of them:
Expressing color values in Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and X Window System
Internet Protocol version 6 (IPv6) addresses
Media Access Control (MAC) addresses for networking devices
Note
For a useful tool to practice converting decimal to binary and even hex, see https://www.rapidtables.com/convert/number/decimal-to-binary.html?x=2.
Display colors (RGB) are expressed in three groups of two hex digits. The first group represents red color values, the second green, and the third blue. For example, use the color code #FF0000 for red (maximum red, no green, no blue). Blue is #0000FF (no red, no green, maximum blue). White is #FFFFFF (maximum red, maximum green, and maximum blue); when all colors of light are mixed together, the result is white. Black is #000000 (no red, no blue, and no green). Orange is a mixture of red and green: #FFA500.
Note
For illustrations of these and many other color codes, see https://www.computerhope.com/htmcolor.htm.
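To see how a six-digit color code breaks down into its three channels, consider this minimal Python sketch. The function name `parse_hex_color` is hypothetical, and the sketch handles only the six-digit form described above.

```python
def parse_hex_color(code: str) -> tuple:
    """Split a six-digit hex color such as #FFA500 into (red, green, blue) decimal values."""
    text = code.lstrip("#")
    if len(text) != 6:
        raise ValueError("expected six hex digits, such as #FFA500")
    # Each pair of hex digits is one channel with a value from 0 to 255.
    red, green, blue = (int(text[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


for color in ("#FF0000", "#0000FF", "#FFFFFF", "#000000", "#FFA500"):
    print(color, parse_hex_color(color))
```

Two hex digits per channel give 256 possible levels (00 through FF) for each of red, green, and blue.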
Internet Protocol version 6 (IPv6), which is replacing the older Internet Protocol version 4 (IPv4), uses hexadecimal numbering for its IP addresses in place of much longer and harder-to-read binary values. An IPv6 address is 128 bits long, comprising eight 16-bit sections. Table 2-3 provides an example of what an IPv6 address would look like in binary, and Figure 2-3 shows the normal hexadecimal notation used.
| Block | Binary Value |
|---|---|
| Block 1 | 0010000000000001 |
| Block 2 | 0000000000000000 |
| Block 3 | 0011000110011000 |
| Block 4 | 1100111011110001 |
| Block 5 | 0000000001011001 |
| Block 6 | 0000000000000001 |
| Block 7 | 0000000000000000 |
| Block 8 | 1111101011111100 |
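The eight binary blocks in Table 2-3 can be converted to hexadecimal with a few lines of Python. This is a sketch only: the name `blocks_to_ipv6` is an assumption, and the standard library's `ipaddress` module is used just to confirm that the result parses as an IPv6 address.

```python
import ipaddress

# The eight 16-bit blocks from Table 2-3.
BINARY_BLOCKS = [
    "0010000000000001",
    "0000000000000000",
    "0011000110011000",
    "1100111011110001",
    "0000000001011001",
    "0000000000000001",
    "0000000000000000",
    "1111101011111100",
]


def blocks_to_ipv6(blocks: list) -> str:
    """Convert each 16-bit binary block to four hex digits and join them with colons."""
    return ":".join(format(int(block, 2), "04X") for block in blocks)


full_address = blocks_to_ipv6(BINARY_BLOCKS)
print(full_address)

# Parsing the text confirms it is a well-formed IPv6 address.
parsed = ipaddress.IPv6Address(full_address)
print(parsed.compressed)
```

The 128 binary digits collapse to 32 hex digits, which is the readability gain the text describes.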
The normal numbering system used in everyday life is decimal, also known as base 10. Base 10 uses the following digits: 0–9. As you have seen in earlier sections of this chapter, computers use decimal, binary, or hexadecimal numbering systems to identify or size different components.
One of the most common places to see decimal values is when viewing the size of a storage device in a management interface such as the Windows Properties sheet for a drive. In Figure 2-4, the sizes of a hard drive and a USB flash drive are listed in bytes using decimal numbering.
Note
In Figure 2-4, you might have noted that the GB (gigabyte) sizes listed appear to be smaller than the number of bytes divided by 1 billion. That’s because a binary gigabyte is 1,073,741,824 bytes (1024 × 1024 × 1024, a power of two) rather than 1,000,000,000 bytes (a power of ten). To learn more, see Chapter 6.
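The difference between decimal and binary gigabytes can be worked through with a small Python sketch. The 500,000,000,000-byte drive size is a made-up example value, not a figure taken from Figure 2-4, and the function name `bytes_to_gb` is hypothetical.

```python
def bytes_to_gb(byte_count: int) -> tuple:
    """Return the size in decimal GB (powers of ten) and binary GB (powers of two)."""
    decimal_gb = byte_count / 1000 ** 3   # 1,000,000,000 bytes per decimal GB
    binary_gb = byte_count / 1024 ** 3    # 1,073,741,824 bytes per binary GB
    return decimal_gb, binary_gb


# Hypothetical drive size chosen only for illustration.
drive_bytes = 500_000_000_000
decimal_gb, binary_gb = bytes_to_gb(drive_bytes)
print(f"{drive_bytes:,} bytes = {decimal_gb:.2f} GB (decimal) = {binary_gb:.2f} GB (binary)")
```

Because each binary gigabyte holds more bytes, the same drive always shows a smaller number when reported in binary GB.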
In addition to data storage, how data is represented is an important concept to understand. Text is stored as numeric codes, but these codes must be mapped to characters to make them understandable. There are two broad categories of character sets that have been used in computer storage:
ASCII
Unicode
The following sections compare and contrast the features of these character sets.
ASCII (American Standard Code for Information Interchange) is a 7-bit character set that includes 128 characters, of which 95 are printable (counting the space character). These include the uppercase and lowercase English alphabet, the digits 0–9, and punctuation marks.
Note
The remaining 33 characters were set aside as control codes for the teletype (TTY) machines that were used for transmitting and receiving data when ASCII was developed (early 1960s).
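The mapping between characters and ASCII codes can be explored with Python's built-in `ord()` and `chr()` functions. The wrapper `ascii_code` below is a hypothetical helper that rejects anything outside the 7-bit range.

```python
def ascii_code(character: str) -> int:
    """Return the ASCII code for a single character, or raise if it is not in the 7-bit set."""
    code = ord(character)
    if code > 127:
        raise ValueError(f"{character!r} is not in the 7-bit ASCII set")
    return code


# Show the decimal code and the 7-bit binary pattern for a few characters.
for ch in "AZaz09!$":
    code = ascii_code(ch)
    print(ch, code, format(code, "07b"))

# chr() goes the other way, from code to character.
print(chr(65), chr(97))
```

Every valid code fits in seven binary digits because the largest value, 127, is 1111111 in binary.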
To enable a broader range of characters to be displayed, the ASCII character set has been extended to 8 bits to include characters such as trademark, copyright, currency symbols, additional mathematical symbols, and foreign-language characters with accents. The 256-character extended ASCII character set is sometimes referred to as the ANSI character set. Figure 2-5 illustrates a few of these extended characters.
To enter an ASCII or ANSI character directly from the keyboard, press and hold down an Alt key and type the character’s number on the keypad (you cannot use the numbers at the top of the keyboard).
The problem with both standard and extended ASCII character sets is that they cannot display characters used by languages that don’t use the Latin alphabet (A–Z). To enable operating systems to work with non-Latin alphabets or with languages that use Latin alphabets with accents, operating system and printer vendors developed code pages, which are language-specific collections of characters mapped to codes.
When you install an operating system, you are asked to select your region and language. Based on your answer, the operating system selects the correct code page for your region and language.
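To illustrate why the choice of code page matters, the following Python sketch decodes the same stored byte under three different code pages. The code page names (`cp1252`, `cp1251`, `cp437`) are Python codec names chosen for this example; the chapter itself does not name specific code pages.

```python
def decode_with(byte_value: int, code_page: str) -> str:
    """Interpret a single stored byte using the given code page."""
    return bytes([byte_value]).decode(code_page)


# The byte value is identical in every case; only the mapping to a character changes.
stored_byte = 0xE9
for code_page in ("cp1252", "cp1251", "cp437"):
    print(code_page, decode_with(stored_byte, code_page))
```

A file written under one code page and opened under another therefore shows the wrong characters above code 127, even though no bytes have changed.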
Unicode has replaced ASCII and extended ASCII character sets because it enables operating systems and printers to display and print characters in any language. Unicode supports ASCII, extended ASCII, and both Latin and non-Latin alphabets and special characters.
Note
ASCII and Unicode character encoding enable computers to store and exchange data with other computers and programs. For example, applications such as Windows Notepad and Microsoft Office make use of ASCII for formatting purposes. For more information, see https://www.asciitable.com/.
How many more characters does Unicode support? Here’s an example: The standard Windows font Segoe includes 216 printable/displayable characters when using the Windows: Western character set (extended ASCII). However, when you select the Unicode character set, the same font includes 576 printable/displayable characters.
Unicode enables a single font to make both Latin and non-Latin characters available. For example, the OpenType Myriad Hebrew font includes characters in the following alphabets when the Unicode character set is used: Hebrew, Latin standard and accented, Hangul (Korean) characters, Katakana (simplified Japanese), Khmer, Buginese (used in Indonesia), Mongolian, Glagolitic (Slavic), CJK (unified Chinese, Japanese, Korean ideographs), Yi (related to Tibetan), and others. Figure 2-6 illustrates a portion of these characters available through the Windows Character Map utility.
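A brief Python sketch shows how Unicode assigns one code point to each character regardless of alphabet. The sample characters and the helper name `code_point` are assumptions made for illustration; `unicodedata` is part of Python's standard library.

```python
import unicodedata


def code_point(character: str) -> str:
    """Format a character's Unicode code point in the conventional U+XXXX style."""
    return f"U+{ord(character):04X}"


# One Latin letter, one extended character, and one Hebrew letter.
for ch in ("A", "¢", "א"):
    print(ch, code_point(ch), unicodedata.name(ch))
```

The code point for "A" is the same value as its ASCII code (65 decimal, 41 hex), which is how Unicode stays compatible with ASCII text.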
To add a character from an extended character set, you might use one or more of the following methods:
Press and hold the Alt key and enter the code for the character. For example, to add the cent sign symbol from the Verdana character set in Windows, press and hold Alt, then enter 0162 from the number pad.
Use a character-mapping utility to choose the character visually. Windows includes the Character Map utility shown in Figure 2-6. macOS includes the Character Viewer. Linux distros that include the GNOME desktop typically include Gucharmap, while distros using KDE typically include the KCharSelect utility. You can also use the Java character map.
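The Alt code method above can be mimicked in Python as a rough sketch. It assumes the Western Windows code page (`cp1252` in Python's codec naming), which depends on the system's region settings, and the function name `alt_code_to_char` is hypothetical.

```python
def alt_code_to_char(code: str, code_page: str = "cp1252") -> str:
    """Look up the character for a zero-prefixed Alt code in the assumed code page."""
    return bytes([int(code)]).decode(code_page)


# 0162 is the code used for the cent sign in the example above.
print(alt_code_to_char("0162"))
```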
Review the most important topics in this chapter, noted with the Key Topics icon. Table 2-4 lists these key topics.

| Key Topic Element | Description |
|---|---|
| | Decimal values from 0–15 and their binary equivalents |
| | Decimal values from 0–25 and their hexadecimal equivalents |
| | Decimal 2 to 1024 in binary and hexadecimal |
| | Capacities for a hard drive and USB flash drive are given in bytes (decimal) and binary GB by Windows 10’s drive Properties sheets. |
| Paragraph | ASCII |
| Paragraph | Unicode |
Print a copy of Appendix A, “Memory Tables,” or at least the section from this chapter, and complete the tables and lists from memory. Appendix B, “Memory Tables Answer Key,” includes completed tables and lists to check your work.
Define the following key terms from this chapter and check your answers in the glossary:
Internet Protocol version 6 (IPv6)
1. Which of the following characters is not in the standard ASCII character set?
Z
!
¢
$
2. Which of the following statements is true about the relationship between ASCII and Unicode?
Unicode contains fewer characters than ASCII.
All fonts contain the same number of Unicode characters.
You can only use Unicode if you are not using a Latin alphabet.
Unicode contains all ASCII characters.
3. Decimal 15 is equivalent to which of the following in binary?
01
1101
15
1111
4. 2^5 equals which of the following in decimal?
16
32
64
24
5. Choose the smallest hex value from the following list.
CA
3D
AC
D3
6. Choose the largest hex value from the following list.
AF
DA
FA
AD
7. Which of the following represents a pure blue color in hex?
#00FF00
#FFFF00
#0000FF
#00F0FF
8. An IPv6 address contains how many bits?
32
128
64
256
9. Which of the following statements is true about the value 10 in decimal, hex, and binary?
10 hex is equal to 10 decimal.
10 binary is equal to 10 hex.
10 binary is larger than 10 decimal.
10 decimal is larger than 10 hex.
10. Which of the following is closest to 350,000,000,000?
350GB (binary)
326GB (binary)
350MB (binary)
365GB (binary)
11. Which of the following statements is true about the relationship of ASCII to Unicode?
ASCII and Unicode are identical.
Unicode is used only for non-ASCII characters.
Unicode is used for ASCII and non-ASCII characters.
Unicode contains only characters for one language.
12. To add a character that is not on the keyboard using Windows, which of the following methods can you use?
Use KCharSelect.
Hold down the Ctrl key and enter the code from the number pad.
Use Character Viewer.
Use Character Map.
13. The cent sign character is not available in which of the following character sets?
ANSI character set
ASCII character set
Unicode
Character Map
14. You get a call from a novice web programmer who wants to know how to specify RGB colors on a web page. Which of the following is correct?
Use hexadecimal values for red, green, and blue.
Use percentages for red, green, and blue.
Use binary values for red, green, and blue.
Use hexadecimal values for cyan, magenta, yellow, and black.
15. You are creating a web page and need to specify white as the font color in RGB. Which of the following is correct?
#FFFFFF
#000000
#WHITE
#AABBCC
16. Which of the following statements is correct about IPv6 addresses?
They are 64 bits long, with eight groups of eight bits each.
They are 128 bits long, with 16 groups of eight bits each.
They are 128 bits long, with eight groups of 16 bits each.
They are normally expressed in decimal octets.
17. Which of the following is a correct statement about base 10 numbering?
Base 10 uses decimal values (0–9).
Base 10 uses hexadecimal values (0–9, A–F).
Base 10 uses binary values (1 and 0).
Base 10 uses decimal values (A–I).
18. Which of the following is the correct binary value equal to 5 decimal?
111
010
011
101
19. If 2^4 is equal to 16, what is 2^7 equal to?
512
128
1024
64
20. Which of the following characters is found in an ANSI or Unicode character set, but not an ASCII character set?
~
|
^
±
What’s next? If you have a knack for converting between number systems, you might find programming a good fit. Before making a decision, check out the chapters on programming in this book.
For Java programming certifications (Oracle), see the following resources:
Java SE: http://education.oracle.com/pls/web_prod-plq-dad/ou_product_category.getPageCert?p_cat_id=267
Java EE and Web Services: http://education.oracle.com/pls/web_prod-plq-dad/ou_product_category.getPage?p_cat_id=264