Chapter 2

How Computers Store Data: Notational Systems

If you don’t understand notational systems and character mapping, you’ll be baffled when dealing with network configurations, color settings, drive sizing, and most other IT topics. This chapter covers CompTIA IT Fundamentals+ Objective 1.1: Compare and contrast notational systems: binary, hexadecimal, and decimal notational systems, and data representation including ASCII and Unicode character mapping.

Foundation Topics

Binary

Binary notation is the fundamental building block of all computer operations and data storage. Binary uses only two digits: 0 (off) and 1 (on). Binary notation is also known as base 2 notation. Each digit in a binary number is one bit (binary digit). Figure 2-1 shows how to count in binary.

Binary equivalents for decimal values 0 to 15 are shown.
Figure 2-1 Decimal Values from 0 to 15 and Their Binary Equivalents
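
The counting pattern in Figure 2-1 can be reproduced with a few lines of Python. This is only a minimal sketch; the helper name `to_binary` and the four-digit padding are assumptions made for illustration.

```python
def to_binary(value: int, width: int = 4) -> str:
    """Return value as a binary string, zero-padded to width digits."""
    return format(value, f"0{width}b")


# Count from 0 to 15, showing each decimal value beside its binary form.
for n in range(16):
    print(f"{n:>2}  {to_binary(n)}")
```

Each time the count reaches a power of two (1, 2, 4, 8), a new leading 1 appears and the digits to its right reset to 0.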

Powers of Two

If you examine the binary equivalents for decimal 1, 2, 4, and 8, you will note that with each doubling of a value, an additional binary digit is used. This pattern continues with the binary equivalents for decimal 16, 32, 64, and so on.

As a shortcut to writing out pure binary values, which can become very long and hard to read, you can use powers of two written with exponents (2ⁿ) instead. Table 2-1 lists decimal values from 2 to 1024, their binary equivalents, the equivalent power of two, and the multiplication that the power of two represents.

Table 2-1 Decimal 2 to 1024 in Binary and Power of Two

| Decimal | Binary | Power of Two (Exponent) | Power of Two (Multiplication) |
|---|---|---|---|
| 2 | 10 | 2¹ | 2 |
| 4 | 100 | 2² | 2×2 |
| 8 | 1000 | 2³ | 2×2×2 |
| 16 | 10000 | 2⁴ | 2×2×2×2 |
| 32 | 100000 | 2⁵ | 2×2×2×2×2 |
| 64 | 1000000 | 2⁶ | 2×2×2×2×2×2 |
| 128 | 10000000 | 2⁷ | 2×2×2×2×2×2×2 |
| 256 | 100000000 | 2⁸ | 2×2×2×2×2×2×2×2 |
| 512 | 1000000000 | 2⁹ | 2×2×2×2×2×2×2×2×2 |
| 1024 | 10000000000 | 2¹⁰ | 2×2×2×2×2×2×2×2×2×2 |
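
The rows of Table 2-1 can be generated rather than memorized. The sketch below is one way to do it, assuming Python 3.9 or later; the function name `power_of_two_row` is hypothetical.

```python
def power_of_two_row(exponent: int) -> tuple[int, str, str]:
    """Return (decimal, binary, multiplication form) for 2 raised to exponent."""
    decimal = 2 ** exponent
    binary = bin(decimal)[2:]              # strip the "0b" prefix
    product = "×".join(["2"] * exponent)   # e.g. "2×2×2" for exponent 3
    return decimal, binary, product


for e in range(1, 11):
    decimal, binary, product = power_of_two_row(e)
    print(f"2^{e:<2} = {decimal:>4} = {binary:>11} = {product}")
```

Note that the binary form of 2ⁿ is always a 1 followed by n zeros, which is why each doubling adds one binary digit.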

Note

Decimal values based on powers of two are used to describe the size of memory and storage devices. See Chapter 6, “Common Units of Measure: Storage, Throughput, and Speed,” for more information.

Hexadecimal

Hexadecimal (hex) notation, also known as base 16 notation, uses the following digits: 0–9 (equivalent to values 0–9 in decimal notation) and a–f or A–F (equivalent to values 10–15 in decimal notation), for a total of 16 digits. Figure 2-2 shows a representation of this.

Image
Hexadecimal equivalents for decimal values 0 to 15 are shown.
Figure 2-2 Decimal Values from 0 to 15 and Their Hexadecimal Equivalents

Each single hex digit is equivalent to four bits (a nibble) in binary notation. Thus, a very long binary value can be represented by a much shorter hex value. Table 2-2 compares decimal, binary, and hex values.


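Table 2-2 Decimal 2 to 1024 in Binary and Hexadecimal

| Decimal | Binary | Hexadecimal |
|---|---|---|
| 2 | 10 | 2 |
| 3 | 11 | 3 |
| 5 | 101 | 5 |
| 9 | 1001 | 9 |
| 15 | 1111 | F |
| 16 | 10000 | 10 |
| 32 | 100000 | 20 |
| 33 | 100001 | 21 |
| 64 | 1000000 | 40 |
| 65 | 1000001 | 41 |
| 128 | 10000000 | 80 |
| 129 | 10000001 | 81 |
| 255 | 11111111 | FF |
| 256 | 100000000 | 100 |
| 511 | 111111111 | 1FF |
| 1024 | 10000000000 | 400 |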
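
To see why one hex digit stands for exactly four bits, it can help to expand a hex value digit by digit. The following is a minimal sketch using Python's built-in `int`, `hex`, and `format`; the helper name `hex_to_binary_nibbles` is an assumption for illustration.

```python
def hex_to_binary_nibbles(hex_value: str) -> str:
    """Expand each hex digit into its 4-bit (nibble) binary form."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_value)


print(hex(255))                       # 0xff
print(int("1FF", 16))                 # 511
print(hex_to_binary_nibbles("FF"))    # 1111 1111
print(hex_to_binary_nibbles("400"))   # 0100 0000 0000
```

Dropping the leading zero from the last result gives 10000000000, the binary value listed for decimal 1024.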

There are many places in computer programming, applications, and networking in which hex values are used. Here are just a few of them:

  • Expressing color values in Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and X Window System

  • Internet Protocol version 6 (IPv6) addresses

  • Media Access Control (MAC) addresses for networking devices

Note

For a useful tool to practice converting decimal to binary and even hex, see https://www.rapidtables.com/convert/number/decimal-to-binary.html?x=2.

Hex Color Values

Display colors (RGB) are expressed in three groups of two hex digits. The first group represents red color values, the second green, and the third blue. For example, use the color code #FF0000 for red (maximum red, no green, no blue). Blue is #0000FF (no red, no green, maximum blue). White is #FFFFFF (maximum red, maximum green, and maximum blue); when all colors of light are mixed together, the result is white. Black is #000000 (no red, no blue, and no green). Orange is a mixture of red and green: #FFA500.
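
The three two-digit groups can be pulled apart and converted to decimal to show how much of each color is present. This is a sketch only; the function name `parse_hex_color` is hypothetical, and it handles just the six-digit form described above.

```python
def parse_hex_color(code: str) -> tuple[int, int, int]:
    """Split a #RRGGBB color code into decimal (red, green, blue) values."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hex digits, got {code!r}")
    # Each pair of hex digits is one byte, so each value ranges from 0 to 255.
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


for code in ("#FF0000", "#0000FF", "#FFFFFF", "#000000", "#FFA500"):
    print(code, parse_hex_color(code))
```

For orange, the result is (255, 165, 0): maximum red, a partial amount of green (hex A5), and no blue.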

Note

For illustrations of these and many other color codes, see https://www.computerhope.com/htmcolor.htm.

IPv6 Addresses

Internet Protocol version 6 (IPv6), which is replacing the older Internet Protocol version 4 (IPv4), uses hexadecimal numbering for its IP addresses in place of much longer and harder-to-read binary values. An IPv6 address is 128 bits long, comprising eight 16-bit sections. Table 2-3 provides an example of what an IPv6 address would look like in binary, and Figure 2-3 shows the normal hexadecimal notation used.

Table 2-3 IPv6 Address (Binary)

| Block | Binary |
|---|---|
| Block 1 | 0010000000000001 |
| Block 2 | 0000000000000000 |
| Block 3 | 0011000110011000 |
| Block 4 | 1100111011110001 |
| Block 5 | 0000000001011001 |
| Block 6 | 0000000000000001 |
| Block 7 | 0000000000000000 |
| Block 8 | 1111101011111100 |

A figure shows the hexadecimal representation of the IPv6 address.
Figure 2-3 The Hexadecimal (Default) Representation of the IPv6 Address from Table 2-3
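
Since the figure is not reproduced here, the hexadecimal form can be worked out from the eight binary blocks in Table 2-3. The sketch below converts each 16-bit block to four hex digits and joins them with colons; the names `BLOCKS` and `blocks_to_ipv6` are assumptions for illustration, and the standard-library `ipaddress` module is used only to confirm that the result is a valid address.

```python
import ipaddress

# The eight 16-bit blocks from Table 2-3, in order.
BLOCKS = [
    "0010000000000001", "0000000000000000",
    "0011000110011000", "1100111011110001",
    "0000000001011001", "0000000000000001",
    "0000000000000000", "1111101011111100",
]


def blocks_to_ipv6(blocks: list[str]) -> str:
    """Convert 16-bit binary blocks to colon-separated four-digit hex groups."""
    return ":".join(format(int(block, 2), "04x") for block in blocks)


full = blocks_to_ipv6(BLOCKS)
print(full)  # 2001:0000:3198:cef1:0059:0001:0000:fafc

# Parsing raises ValueError if the text is not a valid IPv6 address.
address = ipaddress.IPv6Address(full)
print(address.compressed)  # shorter form with leading zeros dropped
```

The 128 binary digits shrink to 32 hex digits, which is the reason hexadecimal is the usual notation for these addresses.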

Decimal

The normal numbering system used in everyday life is decimal, also known as base 10. Base 10 uses the following digits: 0–9. As you have seen in earlier sections of this chapter, computers use decimal, binary, or hexadecimal numbering systems to identify or size different components.
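
The same written digits mean different amounts depending on the base, which is a common source of confusion. As a small illustration (the helper name `value_of` is hypothetical), the digits "10" can be read in each of the three notational systems:

```python
def value_of(digits: str, base: int) -> int:
    """Interpret a digit string in the given base and return its decimal value."""
    return int(digits, base)


for name, base in (("binary", 2), ("decimal", 10), ("hex", 16)):
    print(f"10 in {name} = {value_of('10', base)} decimal")
```

In every base, "10" means one group of the base plus zero ones, so it equals 2 in binary, 10 in decimal, and 16 in hex.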

One of the most common places to see decimal values is in the size of a storage device as shown by a management interface such as the Windows Properties sheet for a drive. In Figure 2-4, the sizes of a hard drive and a USB flash drive are listed in bytes using decimal numbering.

Two screenshots of the Windows Properties Sheet for a hard drive and a USB flash drive are shown.
Figure 2-4 Capacities for a Hard Drive (Left) and USB Flash Drive (Right), Given in Bytes (Decimal) and Binary GB by Windows 10’s Drive Properties Sheets

Note

In Figure 2-4, you might have noted that the GB (gigabyte) sizes listed appear to be smaller than the number of bytes divided by 1 billion. That’s because binary gigabytes are based on multiples of 1024 (powers of 2) rather than 1000 (powers of ten). To learn more, see Chapter 6.
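
The gap between the two figures is easy to reproduce. The sketch below uses a made-up capacity of 500,000,000,000 bytes, not a value taken from Figure 2-4, and the function name `bytes_to_gb` is an assumption.

```python
def bytes_to_gb(num_bytes: int) -> tuple[float, float]:
    """Return (decimal GB, binary GB) for a size given in bytes."""
    decimal_gb = num_bytes / 1000 ** 3   # 1 GB = 1,000,000,000 bytes
    binary_gb = num_bytes / 1024 ** 3    # 1 binary GB = 1,073,741,824 bytes
    return decimal_gb, binary_gb


# Hypothetical drive capacity chosen only for illustration.
decimal_gb, binary_gb = bytes_to_gb(500_000_000_000)
print(f"{decimal_gb:.2f} GB (decimal), {binary_gb:.2f} GB (binary)")
```

The same drive is reported as 500.00 GB in decimal units but about 465.66 GB in binary units.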

Data Representation

In addition to data storage, how data is represented is an important concept to understand. Text is stored as numeric codes, but these codes must be mapped to characters to make them understandable. There are two broad categories of character sets that have been used in computer storage:

  • ASCII

  • Unicode

The following sections compare and contrast the features of these character sets.

ASCII


ASCII (American Standard Code for Information Interchange) is a 7-bit character set that includes 128 characters, of which 95 are printable (counting the space character). These include the uppercase and lowercase English alphabet, the digits 0–9, and punctuation marks.

Note

The remaining 33 characters are control codes, set aside for device control sequences for the teletype (TTY) machines that were used for transmitting and receiving data when ASCII was developed (early 1960s).
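
The mapping between characters and numeric codes can be inspected directly with Python's built-in `ord` and `chr`. This sketch counts codes 32 through 126 (space through tilde) as printable and treats codes 0 through 31 plus 127 as control codes; that split is the counting convention assumed here.

```python
# Codes 32-126 are printable (space through tilde); the rest are control codes.
printable = [chr(code) for code in range(128) if 32 <= code <= 126]
control = [code for code in range(128) if code < 32 or code == 127]

print(len(printable), len(control))   # 95 33
print(ord("A"), ord("a"), ord("0"))   # 65 97 48
print(chr(36))                        # $
print(format(ord("A"), "07b"))        # 1000001 (fits in 7 bits)
```

Every ASCII code fits in seven binary digits because 2⁷ equals 128.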

ANSI

To enable a broader range of characters to be displayed, the ASCII character set has been extended to include characters such as trademark, copyright, and currency symbols, additional mathematical symbols, and accented characters used in other languages. The 256-character extended ASCII character set is sometimes referred to as the ANSI character set. Figure 2-5 illustrates a few of these extended characters.

A listing of a few extended ASCII (ANSI) characters and the keystrokes used to enter them is shown.
Figure 2-5 A Few of the Extended ASCII (ANSI) Characters and the Keystrokes That Can Be Used to Enter Them

To enter an ASCII or ANSI character directly from the keyboard, press and hold down an Alt key and type the character’s number on the keypad (you cannot use the numbers at the top of the keyboard).

Code Pages

The problem with both standard and extended ASCII character sets is that they cannot display characters used by languages that don’t use the Latin alphabet (A–Z). To enable operating systems to work with non-Latin alphabets or with languages that use Latin alphabets with accents, operating system and printer vendors developed code pages, which are language-specific collections of characters mapped to codes.

When you install an operating system, you are asked to select your region and language. Based on your answer, the operating system selects the correct code page for your region and language.
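
The effect of a code page can be demonstrated by decoding the same stored byte in two different ways. The sketch below uses Python's `cp1252` (Windows Western) and `cp1251` (Windows Cyrillic) codecs; the choice of byte 0xE0 and the helper name `decode_with` are assumptions made for illustration.

```python
def decode_with(code_page: str, raw: bytes) -> str:
    """Decode raw bytes using the named code page."""
    return raw.decode(code_page)


RAW_BYTE = bytes([0xE0])  # one stored byte, decimal value 224

print(decode_with("cp1252", RAW_BYTE))  # à (Latin small a with grave)
print(decode_with("cp1251", RAW_BYTE))  # а (Cyrillic small a)
```

The byte on disk is identical in both cases; only the code page chosen to interpret it changes, which is why text saved under one code page can look garbled when opened under another.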

Unicode


Unicode has replaced ASCII and extended ASCII character sets because it enables operating systems and printers to display and print characters in any language. Unicode supports ASCII, extended ASCII, and both Latin and non-Latin alphabets and special characters.

Note

ASCII and Unicode character encoding enable computers to store and exchange data with other computers and programs. For example, applications such as Windows Notepad and Microsoft Office make use of ASCII for formatting purposes. For more information, see https://www.asciitable.com/.

How many more characters does Unicode support? Here’s an example: The standard Windows font Segoe includes 216 printable/displayable characters when using the Windows: Western character set (extended ASCII). However, when you select the Unicode character set, the same font includes 576 printable/displayable characters.

Unicode enables a single font to make both Latin and non-Latin characters available. For example, the OpenType Myriad Hebrew font includes characters in the following alphabets when the Unicode character set is used: Hebrew, Latin standard and accented, Hangul (Korean) characters, Katakana (simplified Japanese), Khmer, Buginese (used in Indonesia), Mongolian, Glagolitic (Slavic), CJK (unified Chinese, Japanese, Korean ideographs), Yi (related to Tibetan), and others. Figure 2-6 illustrates a portion of these characters available through the Windows Character Map utility.

A screenshot of the Character Map Utility window is depicted.
Figure 2-6 A Small Portion of the Characters Available in the OpenType Myriad Hebrew Unicode Character Set. The Highlighted Character Is a CJK Ideograph.
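
Every Unicode character, whatever its alphabet, has a single numeric code point and an official name. The following sketch uses Python's standard `unicodedata` module to show both; the helper name `describe` and the four sample characters are assumptions chosen for illustration.

```python
import unicodedata


def describe(char: str) -> str:
    """Return a character's code point in U+XXXX form plus its Unicode name."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


# One ASCII letter, one extended character, one Hebrew letter, one CJK ideograph.
for ch in ("A", "¢", "א", "中"):
    print(ch, describe(ch))
```

The letter A keeps the same value it has in ASCII (hex 41, decimal 65), which illustrates how Unicode includes the ASCII characters rather than replacing their codes.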

To add a character from an extended character set, you might use one or more of the following methods:

  • Press and hold the Alt key and enter the code for the character. For example, to add the cent sign symbol from the Verdana character set in Windows, press and hold Alt, then enter 0162 from the number pad.

  • Use a character-mapping utility to choose the character visually. Windows includes the Character Map utility shown in Figure 2-6. macOS includes the Character Viewer. Linux distros that include the GNOME desktop typically include Gucharmap, while distros using KDE typically include the KCharSelect utility. You can also use the Java character map.

Exam Preparation Tasks

Review All Key Topics

Review the most important topics in this chapter, noted with the Key Topics icon in the outer margin of the page. Table 2-4 lists these key topics and the page number on which each is found.


Table 2-4 Key Topics for Chapter 2

| Key Topic Element | Description | Page Number |
|---|---|---|
| Figure 2-1 | Decimal values from 0–15 and their binary equivalents | 13 |
| Figure 2-2 | Decimal values from 0–15 and their hexadecimal equivalents | 15 |
| Table 2-2 | Decimal 2 to 1024 in binary and hexadecimal | 15 |
| Figure 2-4 | Capacities for a hard drive and USB flash drive, given in bytes (decimal) and binary GB by Windows 10’s drive Properties sheets | 17 |
| Paragraph | ASCII | 18 |
| Paragraph | Unicode | 19 |

Complete the Tables and Lists from Memory

Print a copy of Appendix A, “Memory Tables,” or at least the section from this chapter, and complete the tables and lists from memory. Appendix B, “Memory Tables Answer Key,” includes completed tables and lists to check your work.

Define Key Terms

Define the following key terms from this chapter and check your answers in the glossary:

ASCII

binary

decimal

hexadecimal

Internet Protocol version 6 (IPv6)

Unicode

Practice Questions for Objective 1.1

1. Which of the following characters is not in the standard ASCII character set?

  1. Z

  2. !

  3. ¢

  4. $

2. Which of the following statements is true about the relationship between ASCII and Unicode?

  1. Unicode contains fewer characters than ASCII.

  2. All fonts contain the same number of Unicode characters.

  3. You can only use Unicode if you are not using a Latin alphabet.

  4. Unicode contains all ASCII characters.

3. Decimal 15 is equivalent to which of the following in binary?

  1. 01

  2. 1101

  3. 15

  4. 1111

4. 2⁵ equals which of the following in decimal?

  1. 16

  2. 32

  3. 64

  4. 24

5. Choose the smallest hex value from the following list.

  1. CA

  2. 3D

  3. AC

  4. D3

6. Choose the largest hex value from the following list.

  1. AF

  2. DA

  3. FA

  4. AD

7. Which of the following represents a pure blue color in hex?

  1. #00FF00

  2. #FFFF00

  3. #0000FF

  4. #00F0FF

8. An IPv6 address contains how many bits?

  1. 32

  2. 128

  3. 64

  4. 256

9. Which of the following statements is true about the value 10 in decimal, hex, and binary?

  1. 10 hex is equal to 10 decimal.

  2. 10 binary is equal to 10 hex.

  3. 10 binary is larger than 10 decimal.

  4. 10 decimal is larger than 10 hex.

10. Which of the following is closest to 350,000,000,000?

  1. 350GB (binary)

  2. 326GB (binary)

  3. 350MB (binary)

  4. 365GB (binary)

11. Which of the following statements is true about the relationship of ASCII to Unicode?

  1. ASCII and Unicode are identical.

  2. Unicode is used only for non-ASCII characters.

  3. Unicode is used for ASCII and non-ASCII characters.

  4. Unicode contains only characters for one language.

12. To add a character that is not on the keyboard using Windows, which of the following methods can you use?

  1. Use KCharSelect.

  2. Hold down the Ctrl key and enter the code from the number pad.

  3. Use Character Viewer.

  4. Use Character Map.

13. The cent sign character is not available in which of the following character sets?

  1. ANSI character set

  2. ASCII character set

  3. Unicode

  4. Character Map

14. You get a call from a novice web programmer who wants to know how to specify RGB colors on a web page. Which of the following is correct?

  1. Use hexadecimal values for red, green, and blue.

  2. Use percentages for red, green, and blue.

  3. Use binary values for red, green, and blue.

  4. Use hexadecimal values for cyan, magenta, yellow, and black.

15. You are creating a web page and need to specify white as the font color in RGB. Which of the following is correct?

  1. #FFFFFF

  2. #000000

  3. #WHITE

  4. #AABBCC

16. Which of the following statements is correct about IPv6 addresses?

  1. They are 64 bits long, with eight groups of eight bits each.

  2. They are 128 bits long, with 16 groups of eight bits each.

  3. They are 128 bits long, with eight groups of 16 bits each.

  4. They are normally expressed in decimal octets.

17. Which of the following is a correct statement about base 10 numbering?

  1. Base 10 uses decimal values (0–9).

  2. Base 10 uses hexadecimal values (0–9, A–F).

  3. Base 10 uses binary values (1 and 0).

  4. Base 10 uses decimal values (A–I).

18. Which of the following is the correct binary value equal to 5 decimal?

  1. 111

  2. 010

  3. 011

  4. 101

19. If 2⁴ is equal to 16, what is 2⁷ equal to?

  1. 512

  2. 128

  3. 1024

  4. 64

20. Which of the following characters is found in an ANSI or Unicode character set, but not an ASCII character set?

  1. ~

  2. |

  3. ^

  4. ±

Your Next Steps

What’s next? If you have a knack for converting between number systems, you might find programming a good fit. Before making a decision, check out the chapters on programming in this book.

For Java programming certifications (Oracle), see the following resources:
