String usage abounds in just about all types of applications. The System.String type does not derive from System.ValueType and is therefore considered a reference type. The string alias is built into C# and can be used instead of the full name.
The FCL does not stop with just the string class; there is also a System.Text.StringBuilder class for performing string manipulations and the System.Text.RegularExpressions namespace for searching strings. This chapter will cover the string class and the System.Text.StringBuilder class.
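As a quick check of the two claims above (that string is an alias for System.String, and that the type is a reference type), a minimal sketch follows. The class name StringTypeFacts and its method names are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical helper class, used only for this illustration.
public static class StringTypeFacts
{
    // The C# keyword string and System.String name the same type.
    public static bool AliasIsSameType()
    {
        return typeof(string) == typeof(System.String);
    }

    // System.String is a class, so it is not a value type.
    public static bool StringIsValueType()
    {
        return typeof(string).IsValueType;
    }
}
```

AliasIsSameType returns true and StringIsValueType returns false.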
The System.Text.StringBuilder class provides an easy, performance-friendly way to manipulate string data. This class duplicates much of the functionality of the string class. However, because a StringBuilder modifies an internal buffer in place instead of creating a new string object for every change, the duplicated functionality manipulates strings more efficiently than is possible with the string class.
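To make the efficiency point concrete, here is a minimal sketch that builds the same comma-separated text two ways. The class name StringBuilderSketch and both method names are hypothetical; only StringBuilder.Append and StringBuilder.ToString come from the FCL.

```csharp
using System;
using System.Text;

// Hypothetical helper class, used only for this illustration.
public static class StringBuilderSketch
{
    // Builds "0,1,2,..." by appending to a single StringBuilder buffer.
    public static string JoinNumbers(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
            {
                sb.Append(',');
            }
            sb.Append(i);
        }
        return sb.ToString();
    }

    // Builds the same text with string concatenation. Strings are
    // immutable, so every += creates a new string object.
    public static string JoinNumbersWithConcat(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
            {
                result += ",";
            }
            result += i;
        }
        return result;
    }
}
```

Both methods return the same text, for example "0,1,2,3" for a count of 4. The difference is in how many intermediate string objects are created along the way, which tends to matter only when many modifications are made.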
You have a variable of type char and wish to determine the kind of character it contains: a letter, digit, number, punctuation character, control character, separator character, symbol, whitespace, or surrogate character. Similarly, you have a string variable and want to determine the kind of character in one or more positions within this string.
Use the built-in static methods on the System.Char structure shown here:
Char.IsControl
Char.IsDigit
Char.IsLetter
Char.IsNumber
Char.IsPunctuation
Char.IsSeparator
Char.IsSurrogate
Char.IsSymbol
Char.IsWhiteSpace
The following examples demonstrate how to use the methods shown in the Solution section in a function to return the kind of a character. First, create an enumeration to define the various types of characters:
public enum CharKind
{
    Control,
    Digit,
    Letter,
    Number,
    Punctuation,
    Separator,
    Surrogate,
    Symbol,
    Whitespace,
    Unknown
}
Next, create a method that contains the logic to determine the type of a character and to return a CharKind enumeration value indicating that type:
public static CharKind GetCharKind(char theChar)
{
    // The tests run in order and the first match wins. For example, a space
    // is reported as Separator and a tab as Control, not as Whitespace.
    if (Char.IsControl(theChar)) { return CharKind.Control; }
    else if (Char.IsDigit(theChar)) { return CharKind.Digit; }
    else if (Char.IsLetter(theChar)) { return CharKind.Letter; }
    else if (Char.IsNumber(theChar)) { return CharKind.Number; }
    else if (Char.IsPunctuation(theChar)) { return CharKind.Punctuation; }
    else if (Char.IsSeparator(theChar)) { return CharKind.Separator; }
    else if (Char.IsSurrogate(theChar)) { return CharKind.Surrogate; }
    else if (Char.IsSymbol(theChar)) { return CharKind.Symbol; }
    else if (Char.IsWhiteSpace(theChar)) { return CharKind.Whitespace; }
    else { return CharKind.Unknown; }
}
If, however, a character in a string needs to be evaluated, use the overloaded static methods on the Char structure. The following code modifies the GetCharKind method to accept a string variable and a character position in that string. The character position determines which character in the string is evaluated:
public static CharKind GetCharKindInString(string theString, int charPosition)
{
    if (Char.IsControl(theString, charPosition)) { return CharKind.Control; }
    else if (Char.IsDigit(theString, charPosition)) { return CharKind.Digit; }
    else if (Char.IsLetter(theString, charPosition)) { return CharKind.Letter; }
    else if (Char.IsNumber(theString, charPosition)) { return CharKind.Number; }
    else if (Char.IsPunctuation(theString, charPosition)) { return CharKind.Punctuation; }
    else if (Char.IsSeparator(theString, charPosition)) { return CharKind.Separator; }
    else if (Char.IsSurrogate(theString, charPosition)) { return CharKind.Surrogate; }
    else if (Char.IsSymbol(theString, charPosition)) { return CharKind.Symbol; }
    else if (Char.IsWhiteSpace(theString, charPosition)) { return CharKind.Whitespace; }
    else { return CharKind.Unknown; }
}
The GetCharKind method accepts a character as a parameter and performs a series of tests on that character using the Char type's built-in static methods. The CharKind enumeration defines the different kinds of characters, and GetCharKind returns the value that corresponds to the first test that succeeds.
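Table 2-1 describes each of the static Char methods.
Table 2-1. Char methods
Char method | Description |
---|---|
Char.IsControl | A control code: U+007F, or a code in the ranges U+0000-U+001F and U+0080-U+009F. |
Char.IsDigit | Any Unicode decimal digit, such as 0-9. |
Char.IsLetter | Any alphabetic letter. |
Char.IsNumber | Any Unicode number character: a decimal digit, a letter number (such as a Roman numeral character), or another numeric character (such as a fraction or superscript digit). The hexadecimal letters A-F are not included. |
Char.IsPunctuation | Any punctuation character. |
Char.IsSeparator | A space separating words, a line separator, or a paragraph separator. |
Char.IsSurrogate | Any surrogate character in the range U+D800-U+DFFF. |
Char.IsSymbol | Any mathematical, currency, or other symbol character. Includes characters that modify surrounding characters. |
Char.IsWhiteSpace | Any space character and the following characters: U+0009, U+000A, U+000B, U+000C, U+000D, U+0085, U+2028, U+2029. |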
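Several of the categories in Table 2-1 overlap, which is easiest to see by running more than one test on the same character. The following is a minimal sketch; the CharMethodChecks class and its Describe method are hypothetical names introduced here, while the Char methods themselves are the ones listed in the table.

```csharp
using System;

// Hypothetical helper class, used only for this illustration.
public static class CharMethodChecks
{
    // Reports how several of the Char tests classify a single character.
    public static string Describe(char c)
    {
        return String.Format(
            "U+{0:X4} digit={1} number={2} separator={3} whitespace={4} control={5}",
            (int)c,
            Char.IsDigit(c),
            Char.IsNumber(c),
            Char.IsSeparator(c),
            Char.IsWhiteSpace(c),
            Char.IsControl(c));
    }
}
```

For example, Describe('\u00BD') (the one-half fraction character) reports number=True but digit=False, Describe(' ') reports both separator=True and whitespace=True, and Describe('\t') reports both whitespace=True and control=True. This overlap is why the order of the tests in a method such as GetCharKind affects the result.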
The following code example determines whether the fifth character (the charPosition parameter is zero-based) in the string is a digit:
if (GetCharKindInString("abcdefg", 4) == CharKind.Digit) {...}
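The string overloads of the Char methods throw an ArgumentNullException when the string is null and an ArgumentOutOfRangeException when the position lies outside the string, so callers often validate the arguments first. The sketch below shows one possible approach; the StringScanSketch class and its CountDigits and IsDigitAt methods are hypothetical names, not part of the FCL.

```csharp
using System;

// Hypothetical helper class, used only for this illustration.
public static class StringScanSketch
{
    // Counts the positions in text for which Char.IsDigit(string, int) is true.
    public static int CountDigits(string text)
    {
        if (text == null)
        {
            throw new ArgumentNullException("text");
        }

        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (Char.IsDigit(text, i))
            {
                count++;
            }
        }
        return count;
    }

    // Returns false instead of throwing when text is null or index is
    // outside the string; otherwise defers to Char.IsDigit(string, int).
    public static bool IsDigitAt(string text, int index)
    {
        if (text == null || index < 0 || index >= text.Length)
        {
            return false;
        }
        return Char.IsDigit(text, index);
    }
}
```

With these helpers, IsDigitAt("abcdefg", 4) returns false because the fifth character is the letter e, and CountDigits("a1b22") returns 3.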