The String class provides a host of methods for comparing, searching, and manipulating strings, the most important of which are shown in Table 15-1.
Table 15-1. String class properties and methods
Method or property | Explanation |
---|---|
Chars | Property that returns the string indexer |
Compare( ) | Overloaded public static method that compares two strings |
Copy( ) | Public static method that creates a new string by copying another |
Equals( ) | Overloaded public static and instance method that determines if two strings have the same value |
Format( ) | Overloaded public static method that formats a string using a format specification |
Length | Property that returns the number of characters in the instance |
PadLeft( ) | Right-aligns the characters in the string, padding to the left with spaces or a specified character |
PadRight( ) | Left-aligns the characters in the string, padding to the right with spaces or a specified character |
Remove( ) | Deletes the specified number of characters |
Split( ) | Divides a string, returning the substrings delimited by the specified characters |
StartsWith( ) | Indicates if the string starts with the specified characters |
Substring( ) | Retrieves a substring |
ToCharArray( ) | Copies the characters from the string to a character array |
ToLower( ) | Returns a copy of the string in lowercase |
ToUpper( ) | Returns a copy of the string in uppercase |
Trim( ) | Removes all occurrences of a set of specified characters from beginning and end of the string |
TrimEnd( ) | Behaves like Trim( ), but only at the end of the string |
TrimStart( ) | Behaves like Trim( ), but only at the start of the string |
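A few of the members in Table 15-1 can be tried out in a short sketch. The class name TableMembersSketch and its method names are made up for illustration; only the String members they call (PadLeft, Trim, ToUpper, StartsWith) come from the table.

```csharp
using System;

// Hypothetical helper class; only the String members it calls come from Table 15-1.
static class TableMembersSketch
{
    // Right-aligns text in a field of the given width, padding on the left with dots.
    public static string RightAlign(string text, int width)
    {
        return text.PadLeft(width, '.');
    }

    // Strips surrounding whitespace and converts to uppercase.
    // Each call returns a new string; the original is never changed.
    public static string Normalize(string text)
    {
        return text.Trim().ToUpper();
    }

    // True if text begins with prefix, compared character by character.
    public static bool HasPrefix(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }
}
```

Calling TableMembersSketch.RightAlign("abc", 6) from Main( ) pads the three characters out to a width of six.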
The Compare( ) method of String is overloaded. The first version takes two strings and returns a negative number if the first string is alphabetically before the second, a positive number if the first string is alphabetically after the second, and zero if they are equal. The second version works just like the first but takes a third, Boolean parameter that lets you make the comparison case-insensitive. Example 15-1 illustrates the use of Compare( ).
Example 15-1. Compare( ) method
    using System;

    namespace StringManipulation
    {
        class Tester
        {
            public void Run( )
            {
                // create some strings to work with
                string s1 = "abcd";
                string s2 = "ABCD";
                int result; // hold the results of comparisons

                // compare two strings, case sensitive
                result = string.Compare( s1, s2 );
                Console.WriteLine(
                    "compare s1: {0}, s2: {1}, result: {2}\n",
                    s1, s2, result );

                // overloaded compare, takes boolean "ignore case"
                // (true = ignore case)
                result = string.Compare( s1, s2, true );
                Console.WriteLine(
                    "Compare insensitive. result: {0}\n", result );
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }

The output looks like this:

    compare s1: abcd, s2: ABCD, result: -1

    Compare insensitive. result: 0
Example 15-1 begins by declaring two strings, s1 and s2, and initializing them with string literals:

    string s1 = "abcd";
    string s2 = "ABCD";
Compare( ) is used with many types. A negative return value indicates that the first parameter is less than the second, a positive result indicates the first parameter is greater than the second, and a zero indicates they are equal. By default, string.Compare( ) performs a culture-sensitive comparison rather than comparing raw character codes. In Unicode (as in ASCII), an uppercase letter actually has a smaller numeric value than its lowercase counterpart, but under the culture's sorting rules, with strings identical except for case, lowercase comes first alphabetically. Thus, the output properly indicates that s1 (abcd) is “less than” s2 (ABCD):
compare s1: abcd, s2: ABCD, result: -1
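The difference between sorting order and raw character codes can be seen side by side in a minimal sketch. The class CompareSketch and its method names are hypothetical; the sketch pins the culture to the invariant culture (an assumption, so that the result does not depend on regional settings) and uses Math.Sign to reduce each result to -1, 0, or 1.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class contrasting two kinds of string comparison.
static class CompareSketch
{
    // Culture-sensitive comparison, pinned to the invariant culture.
    // Lowercase sorts before uppercase when the strings differ only by case.
    public static int ByCulture(string a, string b)
    {
        return Math.Sign(string.Compare(a, b, false, CultureInfo.InvariantCulture));
    }

    // Ordinal comparison: compares the numeric value of each character,
    // so 'A' (65) comes before 'a' (97).
    public static int ByCharacterCode(string a, string b)
    {
        return Math.Sign(string.CompareOrdinal(a, b));
    }
}
```

For "abcd" and "ABCD", ByCulture reports that the lowercase string comes first, while ByCharacterCode reports the opposite.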
The second comparison uses an overloaded version of Compare( ), which takes a third Boolean parameter, the value of which determines whether case should be ignored in the comparison. If the value of this “ignore case” parameter is true, the comparison is made without regard to case. This time the result is 0, indicating that the two strings are identical:
Compare insensitive. result: 0
There are a couple of ways to concatenate strings in C#. You can use the Concat( ) method, which is a static public method of the String class:
string s3 = string.Concat(s1,s2);
or you can simply use the overloaded concatenation (+) operator:
string s4 = s1 + s2;
Example 15-2 demonstrates both of these methods.
Example 15-2. Concatenation
    using System;

    namespace StringManipulation
    {
        class Tester
        {
            public void Run( )
            {
                string s1 = "abcd";
                string s2 = "ABCD";

                // concatenation method
                string s3 = string.Concat( s1, s2 );
                Console.WriteLine(
                    "s3 concatenated from s1 and s2: {0}", s3 );

                // use the overloaded operator
                string s4 = s1 + s2;
                Console.WriteLine(
                    "s4 concatenated from s1 + s2: {0}", s4 );
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }

The output looks like this:

    s3 concatenated from s1 and s2: abcdABCD
    s4 concatenated from s1 + s2: abcdABCD
In Example 15-2, the new string s3 is created by calling the static Concat( ) method and passing in s1 and s2, while the string s4 is created by using the overloaded concatenation operator (+) that concatenates two strings and returns a string as a result.
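The two forms are interchangeable, including in how they treat a null operand, which both regard as an empty string. A small sketch under those assumptions follows; the class and method names are invented for illustration.

```csharp
using System;

// Hypothetical helper class showing that both forms of concatenation agree.
static class ConcatSketch
{
    // Static method form.
    public static string WithMethod(string a, string b)
    {
        return string.Concat(a, b);
    }

    // Operator form; a null operand is treated as an empty string.
    public static string WithOperator(string a, string b)
    {
        return a + b;
    }
}
```

Either method returns a new string and leaves both arguments untouched.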
There are two ways to copy strings. 99.9 percent of the time you will just write:
oneString = theOtherString;
and not worry about what is going on in memory.
There is a second, somewhat awkward way to copy strings:
myString = String.Copy(yourString);
and this actually does something subtly different. The difference is somewhat advanced, but here it is in a nutshell.
When you use the assignment operator (=), you create a second reference to the same object in memory, but when you use Copy, you create a reference to a new string that is initialized with the value of the first string.
“Huh?” I hear you cry. An example will make it clear (see Example 15-3).
Example 15-3. Copying strings
    using System;

    namespace StringManipulation
    {
        class Tester
        {
            public void Run()
            {
                string s1 = "abcd";
                Console.WriteLine( "string s1: {0}", s1 );

                // assignment: s2 refers to the same object as s1
                Console.WriteLine( "\nstring s2 = s1;" );
                string s2 = s1;
                Console.WriteLine( "s1: {0} s2: {1}", s1, s2 );
                Console.WriteLine( "s1 == s2? {0}", s1 == s2 );
                Console.WriteLine( "ReferenceEquals(s1,s2): {0}",
                    ReferenceEquals( s1, s2 ) );

                // Copy: s3 refers to a new object with the same value
                Console.WriteLine( "\nstring s3 = string.Copy( s1 );" );
                string s3 = string.Copy( s1 );
                Console.WriteLine( "s1: {0} s3: {1}", s1, s3 );
                Console.WriteLine( "s1 == s3? {0}", s1 == s3 );
                Console.WriteLine( "ReferenceEquals(s1,s3): {0}",
                    ReferenceEquals( s1, s3 ) );

                // reassigning s1 leaves s2 pointing at the original string
                Console.WriteLine( "\ns1 = \"Hello\";" );
                s1 = "Hello";
                Console.WriteLine( "s1: {0} s2: {1}", s1, s2 );
                Console.WriteLine( "s1 == s2? {0}", s1 == s2 );
                Console.WriteLine( "ReferenceEquals(s1,s2): {0}",
                    ReferenceEquals( s1, s2 ) );
            }

            static void Main()
            {
                Tester t = new Tester();
                t.Run();
            }
        }
    }

The output looks like this:

    string s1: abcd

    string s2 = s1;
    s1: abcd s2: abcd
    s1 == s2? True
    ReferenceEquals(s1,s2): True

    string s3 = string.Copy( s1 );
    s1: abcd s3: abcd
    s1 == s3? True
    ReferenceEquals(s1,s3): False

    s1 = "Hello";
    s1: Hello s2: abcd
    s1 == s2? False
    ReferenceEquals(s1,s2): False
In Example 15-3, you start by initializing one string:
string s1 = "abcd";
You then assign the value of s1 to s2 using the assignment operator:
string s2 = s1;
You print their values, as shown in the first section of results, and find that not only do the two string references have the same value, as indicated by using the equality operator (==), but they actually point to the same object in memory, which is why ReferenceEquals returns true.
On the other hand, if you create s3 and assign its value using String.Copy(s1), while the two values are equal (as shown by using the equality operator), they refer to different objects in memory (as shown by the fact that ReferenceEquals returns false).
Now, returning to s1 and s2, which refer to the same object, if you change either one, for example, when you write:
s1 = "Hello";
s2 goes on referring to the original string, but s1 now refers to a brand new string.
If you later write:
s2 = "Goodbye";
(not shown in the example), the original string that s1 and s2 once referred to will no longer have any references to it, and it will be mercifully and painlessly destroyed by the Garbage Collector.
The .NET String class provides three ways to test for the equality of two strings. First, you can use the overloaded Equals( ) method and ask one string (say, s6) directly whether another string (s5) is of equal value:
Console.WriteLine( "\nDoes s6.Equals(s5)?: {0}", s6.Equals(s5));
You can also pass both strings to String’s static method Equals( ):
Console.WriteLine( "Does Equals(s6,s5)?: {0}", string.Equals(s6,s5));
Or you can use the String class’s overloaded equality operator (==):
Console.WriteLine( "Does s6==s5?: {0}", s6 == s5);
In each of these cases, the returned result is a Boolean value (true for equal and false for unequal). Example 15-4 demonstrates these techniques.
Example 15-4. Are all strings created equal?
    using System;

    namespace StringManipulation
    {
        class Tester
        {
            public void Run( )
            {
                string s1 = "abcd";
                string s2 = "ABCD";

                // the string copy method
                string s5 = string.Copy( s2 );
                Console.WriteLine( "s5 copied from s2: {0}", s5 );

                string s6 = s5;
                Console.WriteLine( "s6 = s5: {0}", s6 );

                // member method
                Console.WriteLine(
                    "\nDoes s6.Equals(s5)?: {0}", s6.Equals( s5 ) );

                // static method
                Console.WriteLine(
                    "Does Equals(s6,s5)?: {0}", string.Equals( s6, s5 ) );

                // overloaded operator
                Console.WriteLine( "Does s6==s5?: {0}", s6 == s5 );
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }

The output looks like this:

    s5 copied from s2: ABCD
    s6 = s5: ABCD

    Does s6.Equals(s5)?: True
    Does Equals(s6,s5)?: True
    Does s6==s5?: True
The equality operator is the most natural of the three methods to use when you have two string objects.
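One caveat is worth a sketch: the == operator compares values only when both operands are typed as string. If the same two strings are held in variables of type object, == falls back to comparing references. The class and method names below are invented for illustration, and the case-insensitive overload of Equals( ) shown last is an addition that this section does not otherwise cover.

```csharp
using System;

// Hypothetical helper class showing how the static type affects ==.
static class EqualitySketch
{
    // Both operands are strings: the overloaded operator compares values.
    public static bool AsStrings(string a, string b)
    {
        return a == b;
    }

    // Both operands are objects: == compares references, not values.
    public static bool AsObjects(object a, object b)
    {
        return a == b;
    }

    // Value comparison that ignores case, using character codes.
    public static bool IgnoringCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

Passing two separately constructed strings with the same characters, such as two results of new string('a', 3), gives true from AsStrings and false from AsObjects.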
The String class includes a number of useful methods and properties for finding specific characters or substrings within a string, as well as for manipulating the contents of the string. Example 15-5 demonstrates a few methods, such as locating substrings, finding the index of a substring, and inserting text from one string into another. Following the output is a complete analysis.
Example 15-5. Useful methods of the String class
    using System;

    namespace StringManipulation
    {
        class Tester
        {
            public void Run( )
            {
                string s1 = "abcd";
                string s2 = "ABCD";
                string s3 = @"Liberty Associates, Inc. provides custom .NET development, on-site Training and Consulting";

                // the string copy method
                string s5 = string.Copy( s2 );
                Console.WriteLine( "s5 copied from s2: {0}", s5 );

                // Two useful properties: the index and the length
                Console.WriteLine(
                    "\nString s5 is {0} characters long.", s5.Length );
                Console.WriteLine( "The 5th character is {0}\n", s3[4] );

                // test whether a string ends with a set of characters
                Console.WriteLine( "s3:{0} Ends with Training?: {1}\n",
                    s3, s3.EndsWith( "Training" ) );
                Console.WriteLine( "Ends with Consulting?: {0}",
                    s3.EndsWith( "Consulting" ) );

                // return the index of the substring
                Console.WriteLine( "\nThe first occurrence of Training " );
                Console.WriteLine( "in s3 is {0}\n", s3.IndexOf( "Training" ) );

                // insert the word excellent before "Training",
                // at the offset reported by IndexOf above
                string s10 = s3.Insert( 73, "excellent " );
                Console.WriteLine( "s10: {0}\n", s10 );

                // you can combine the two as follows:
                string s11 = s3.Insert( s3.IndexOf( "Training" ), "excellent " );
                Console.WriteLine( "s11: {0}\n", s11 );
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }

The output looks like this:

    s5 copied from s2: ABCD

    String s5 is 4 characters long.
    The 5th character is r

    s3:Liberty Associates, Inc. provides custom .NET development, on-site Training and Consulting Ends with Training?: False

    Ends with Consulting?: True

    The first occurrence of Training
    in s3 is 73

    s10: Liberty Associates, Inc. provides custom .NET development, on-site excellent Training and Consulting

    s11: Liberty Associates, Inc. provides custom .NET development, on-site excellent Training and Consulting
The Length property returns the length of the entire string, and the index operator ([]) is used to access a particular character within a string. In this example, Length is read from s5, the four-character copy of s2, while the indexer is applied to s3. Indexes are zero-based, so s3[4] is the fifth character:

    Console.WriteLine( "\nString s5 is {0} characters long.", s5.Length );
    Console.WriteLine( "The 5th character is {0}\n", s3[4] );

Here’s the output:

    String s5 is 4 characters long.
    The 5th character is r
The EndsWith( ) method asks a string whether a substring is found at the end of the string. Thus, you might first ask if s3 ends with “Training” (which it does not), and then if it ends with “Consulting” (which it does):

    Console.WriteLine( "s3:{0} Ends with Training?: {1}\n",
        s3, s3.EndsWith( "Training" ) );
    Console.WriteLine( "Ends with Consulting?: {0}",
        s3.EndsWith( "Consulting" ) );

The output reflects that the first test fails and the second succeeds:

    Ends with Training?: False

    Ends with Consulting?: True
The IndexOf( ) method locates a substring within a string, and the Insert( ) method inserts a new substring into a copy of the original string. The following code locates the first occurrence of “Training” in s3:

    Console.WriteLine( "\nThe first occurrence of Training " );
    Console.WriteLine( "in s3 is {0}\n", s3.IndexOf( "Training" ) );

The output indicates that the offset is 73 (offsets are zero-based and count every character before the match, including any whitespace embedded in the literal):

    The first occurrence of Training
    in s3 is 73
Then use that value to insert the word “excellent,” followed by a space, into that string. Actually the insertion is into a copy of the string returned by the Insert( ) method and assigned to s10:

    string s10 = s3.Insert( 73, "excellent " );
    Console.WriteLine( "s10: {0}\n", s10 );

Here’s the output:
s10: Liberty Associates, Inc. provides custom .NET development, on-site excellent Training and Consulting
Finally, you can combine these operations into a single insertion statement that does not depend on a hard-coded offset:

    string s11 = s3.Insert( s3.IndexOf( "Training" ), "excellent " );
    Console.WriteLine( "s11: {0}\n", s11 );

with the identical result:
s11: Liberty Associates, Inc. provides custom .NET development, on-site excellent Training and Consulting
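One thing to watch when combining the two calls: IndexOf( ) returns -1 when the substring is not found, and passing -1 to Insert( ) throws an ArgumentOutOfRangeException. A minimal sketch of a guarded version follows; InsertSketch and InsertBefore are hypothetical names, and returning the source unchanged when the marker is missing is just one possible policy.

```csharp
using System;

// Hypothetical helper that combines IndexOf and Insert safely.
static class InsertSketch
{
    // Inserts addition immediately before the first occurrence of marker.
    // If marker is not found, the source string is returned unchanged.
    public static string InsertBefore(string source, string marker, string addition)
    {
        int offset = source.IndexOf(marker, StringComparison.Ordinal);
        if (offset < 0)
        {
            return source; // nothing to insert before
        }
        return source.Insert(offset, addition);
    }
}
```

For example, InsertSketch.InsertBefore(s3, "Training", "excellent ") produces the same text as s11 in Example 15-5.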
The String class has methods for finding and extracting substrings. For example, the IndexOf( ) method returns the index of the first occurrence of a string or character within a target string (the related IndexOfAny( ) method does the same for any character in an array of characters). For example, given the definition of the string s1 as:
string s1 = "One Two Three Four";
you can find the first instance of the characters “hre” by writing:
int index = s1.IndexOf("hre");
This code sets the int variable index to 9, which is the offset of the letters “hre” in the string s1.
Similarly, the LastIndexOf( ) method returns the index of the last occurrence of a string or substring. While the following code:
s1.IndexOf("o");
returns the value 6 (the first occurrence of the lowercase letter “o” is at the end of the word “Two”), the method call:
s1.LastIndexOf("o");
returns the value 15 (the last occurrence of “o” is in the word “Four”).
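The searches described above can be gathered into one sketch. The class SearchSketch and its method names are hypothetical; the calls pass StringComparison.Ordinal (an assumption made here so that results do not vary with culture settings), and IndexOfAny( ) is included as the variant that accepts an array of characters.

```csharp
using System;

// Hypothetical helper class wrapping the search methods discussed above.
static class SearchSketch
{
    // Offset of the first occurrence of value, or -1 if it is absent.
    public static int First(string text, string value)
    {
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    // Offset of the last occurrence of value, or -1 if it is absent.
    public static int Last(string text, string value)
    {
        return text.LastIndexOf(value, StringComparison.Ordinal);
    }

    // Offset of the first character that matches any of the candidates.
    public static int FirstOfAny(string text, char[] candidates)
    {
        return text.IndexOfAny(candidates);
    }
}
```

With the string "One Two Three Four", First and Last reproduce the values 9, 6, and 15 given above, and a search for text that is not present returns -1.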
The Substring( ) method returns a series of characters. You can ask it for all the characters starting at a particular offset and continuing either to the end of the string or for a length (a number of characters) that you optionally provide. Example 15-6 illustrates the Substring( ) method.
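Because the second argument to Substring( ) is a length rather than an ending offset, it is easy to be off by a few characters. The sketch below makes the distinction explicit; SubstringSketch and its method names are invented for illustration.

```csharp
using System;

// Hypothetical helper class contrasting the two Substring overloads.
static class SubstringSketch
{
    // Everything from start to the end of the string.
    public static string Tail(string text, int start)
    {
        return text.Substring(start);
    }

    // The characters from start up to, but not including, endExclusive.
    // Substring takes a length, so the end offset must be converted.
    public static string Between(string text, int start, int endExclusive)
    {
        return text.Substring(start, endExclusive - start);
    }
}
```

When start is 0, the length and the ending offset are the same number, which is why Example 15-6 can pass an index directly as the second argument.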
Example 15-6. Finding substrings by index
    using System;

    namespace StringSearch
    {
        class Tester
        {
            public void Run( )
            {
                // create some strings to work with
                string s1 = "One Two Three Four";
                int index;

                // get the index of the last space
                index = s1.LastIndexOf( " " );

                // get the last word.
                string s2 = s1.Substring( index + 1 );

                // set s1 to the substring starting at 0
                // and ending at index (the start of the last word)
                // thus s1 has "one two three"
                s1 = s1.Substring( 0, index );

                // find the last space in s1 (after two)
                index = s1.LastIndexOf( " " );

                // set s3 to the substring starting at
                // index, the space after "two" plus one more
                // thus s3 = "three"
                string s3 = s1.Substring( index + 1 );

                // reset s1 to the substring starting at 0
                // and ending at index, thus the string "one two"
                s1 = s1.Substring( 0, index );

                // reset index to the space between
                // "one" and "two"
                index = s1.LastIndexOf( " " );

                // set s4 to the substring starting one
                // space after index, thus the substring "two"
                string s4 = s1.Substring( index + 1 );

                // reset s1 to the substring starting at 0
                // and ending at index, thus "one"
                s1 = s1.Substring( 0, index );

                // set index to the last space, but there is
                // none so index now = -1
                index = s1.LastIndexOf( " " );

                // set s5 to the substring at one past
                // the last space. there was no last space
                // so this sets s5 to the substring starting
                // at zero
                string s5 = s1.Substring( index + 1 );

                Console.WriteLine( "s2: {0} s3: {1}", s2, s3 );
                Console.WriteLine( "s4: {0} s5: {1}\n", s4, s5 );
                Console.WriteLine( "s1: {0}\n", s1 );
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }

The output looks like this:

    s2: Four s3: Three
    s4: Two s5: One

    s1: One
Example 15-6 is not the most elegant solution possible to the problem of extracting words from a string, but it is a good first approximation, and it illustrates a useful technique. The example begins by creating a string, s1:
string s1 = "One Two Three Four";
The local variable index is assigned the index of the last space in the string (which comes before the word “Four”):
index=s1.LastIndexOf(" ");
The substring that begins one position later is assigned to the new string, s2:
string s2 = s1.Substring(index+1);
This extracts the characters from index + 1 to the end of the string (the word “Four”) and assigns the value “Four” to s2.
The next step is to remove the word “Four” from s1; assign to s1 the substring of s1 that begins at 0 and is index characters long, so that it stops just before the space:
s1 = s1.Substring(0,index);
After this line executes, the variable s1 will point to a new string object that will contain the appropriate substring of the string that s1 used to point to. That original string will eventually be destroyed by the garbage collector because no variable now references it.
You reassign index to the last (remaining) space, which sits just before the word “Three.” You then extract the word “Three” into string s3. Continue like this until you’ve populated s4 and s5. Finally, display the results:
s2: Four s3: Three s4: Two s5: One s1: One
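The same LastIndexOf( ) and Substring( ) steps repeat for every word, so they can be folded into a loop. The following is a sketch of that idea rather than a replacement for the example; the class WordPeeler and the method WordsFromEnd are hypothetical names, and the sketch assumes words are separated by single spaces.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that repeats the steps of Example 15-6 in a loop.
static class WordPeeler
{
    // Returns the words of text, last word first.
    public static List<string> WordsFromEnd(string text)
    {
        List<string> words = new List<string>();
        int index;

        // While a space remains, take the word after the last space,
        // then shorten text so that it ends just before that space.
        while ((index = text.LastIndexOf(' ')) >= 0)
        {
            words.Add(text.Substring(index + 1));
            text = text.Substring(0, index);
        }

        // No space is left, so what remains is the first word.
        words.Add(text);
        return words;
    }
}
```

Called with "One Two Three Four", the method returns the four words in the order Four, Three, Two, One, matching s2 through s5 in the example.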
A more effective solution to the problem illustrated in Example 15-6 would be to use the String class’s Split( ) method, which parses a string into substrings. To use Split( ), pass in an array of delimiters (characters that indicate where to divide the words). The method returns an array of substrings (which Example 15-7 illustrates). The complete analysis follows the code.
Example 15-7. The Split( ) method
    using System;

    namespace StringSearch
    {
        class Tester
        {
            public void Run( )
            {
                // create some strings to work with
                string s1 = "One,Two,Three Liberty Associates, Inc.";

                // constants for the space and comma characters
                const char Space = ' ';
                const char Comma = ',';

                // array of delimiters to split the sentence with
                char[] delimiters = new char[] { Space, Comma };

                int ctr = 1;

                // split the string and then iterate over the
                // resulting array of strings
                String[] resultArray = s1.Split( delimiters );
                foreach ( String subString in resultArray )
                {
                    Console.WriteLine( ctr++ + ": " + subString );
                }
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }

The output looks like this:

    1: One
    2: Two
    3: Three
    4: Liberty
    5: Associates
    6:
    7: Inc.
Example 15-7 starts by creating a string to parse:
string s1 = "One,Two,Three Liberty Associates, Inc.";
The delimiters are set to the space and comma characters. Then call Split( ) on the string, passing in the delimiters:
String[] resultArray = s1.Split(delimiters);
Split( ) returns an array of the substrings that you can then iterate over using the foreach loop, as explained in Chapter 10:
foreach (String subString in resultArray)
You can, of course, combine the call to split with the iteration, as in the following:
foreach (string subString in s1.Split(delimiters))
C# programmers are fond of combining statements like this. The advantage of splitting the statement into two, however, and of using an interim variable like resultArray is that you can examine the contents of resultArray in the debugger.
The loop in Example 15-7 writes each line as soon as it is built. Suppose instead that you collect the whole listing in a single string: initialize output to an empty string before the foreach loop, and then build up the output string in four steps inside it. Start by concatenating the incremented value of ctr to the output string, using the += operator:
output += ctr++;
Next add the colon, then the substring returned by Split( ), and then the newline:

    output += ": ";
    output += subString;
    output += "\n";

With each concatenation, a new copy of the string is made, and all four steps are repeated for each substring found by Split( ).
This repeated copying of strings is terribly inefficient. The problem is that the string type is not designed for this kind of operation. What you want is to create a new string by appending a formatted string each time through the loop. The class you need is StringBuilder.
You can use the System.Text.StringBuilder class for creating and modifying strings. Table 15-2 summarizes the important members of StringBuilder.
Table 15-2. StringBuilder members
Method or property | Explanation |
---|---|
Append( ) | Overloaded public method that appends a typed object to the end of the current StringBuilder |
AppendFormat( ) | Overloaded public method that replaces format specifiers with the formatted value of an object |
EnsureCapacity( ) | Ensures that the current StringBuilder has a capacity at least as large as the specified value |
Capacity | Property that retrieves or assigns the number of characters the StringBuilder is capable of holding |
Insert( ) | Overloaded public method that inserts an object at the specified position |
Length | Property that retrieves or assigns the length of the StringBuilder |
MaxCapacity | Property that retrieves the maximum capacity of the StringBuilder |
Remove( ) | Removes the specified range of characters |
Replace( ) | Overloaded public method that replaces all instances of the specified characters with new characters |
Unlike String, StringBuilder is mutable; when you modify an instance of the StringBuilder class, you modify the actual string, not a copy.
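Mutability is easiest to see when two variables refer to the same StringBuilder: a change made through one is visible through the other. The following sketch exercises several members from Table 15-2; the class MutabilitySketch and its Demo method are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper showing that StringBuilder is modified in place.
static class MutabilitySketch
{
    public static string Demo()
    {
        StringBuilder builder = new StringBuilder("abcd");
        StringBuilder alias = builder;       // second reference, same object

        alias.Append("EFGH");                // buffer now holds abcdEFGH
        alias.Replace("cd", "CD");           // abCDEFGH
        alias.Insert(0, "> ");               // > abCDEFGH
        alias.Remove(alias.Length - 1, 1);   // drops the final H

        // Every change made through alias is visible through builder.
        return builder.ToString();
    }
}
```

Demo( ) returns "> abCDEFG", even though every modification was made through the alias variable.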
Example 15-8 reworks Example 15-7 so that its output is built with a StringBuilder object.
Example 15-8. The StringBuilder class
    using System;
    using System.Text;

    namespace StringSearch
    {
        class Tester
        {
            public void Run( )
            {
                // create some strings to work with
                string s1 = "One,Two,Three Liberty Associates, Inc.";

                // constants for the space and comma characters
                const char Space = ' ';
                const char Comma = ',';

                // array of delimiters to split the sentence with
                char[] delimiters = new char[] { Space, Comma };

                // use a StringBuilder class to build the
                // output string
                StringBuilder output = new StringBuilder( );
                int ctr = 1;

                // split the string and then iterate over the
                // resulting array of strings
                foreach ( string subString in s1.Split( delimiters ) )
                {
                    // AppendFormat appends a formatted string
                    output.AppendFormat( "{0}: {1}\n", ctr++, subString );
                }
                Console.WriteLine( output );
            }

            static void Main( )
            {
                Tester t = new Tester( );
                t.Run( );
            }
        }
    }
Only the last part of the program is modified. Rather than using the concatenation operator to modify the string, use the AppendFormat( ) method of StringBuilder to append new formatted strings as you create them. This is much easier and far more efficient. The output is identical:

    1: One
    2: Two
    3: Three
    4: Liberty
    5: Associates
    6:
    7: Inc.
Because you passed in delimiters of both comma and space, the comma and the space between “Associates” and “Inc.” are treated as two separate delimiters, and the empty string between them is returned as a word, numbered 6 in the previous output. That is not what you want. To eliminate this, you need to tell Split( ) to match a comma (as between One, Two, and Three), a space (as between Liberty and Associates), or a comma followed by a space. It is that last bit that is tricky, and it is a natural job for a regular expression.
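As a preview, here is a minimal sketch of two ways to get rid of the empty entry. The class SplitSketch and its method names are invented for illustration. The first method uses the StringSplitOptions.RemoveEmptyEntries option of Split( ), which this section does not cover and which is shown here as an alternative; the second uses Regex.Split( ) with a pattern that tries a comma followed by a space before trying either character alone.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class showing two ways to avoid the empty substring.
static class SplitSketch
{
    // Splits on spaces and commas, discarding any empty results.
    public static string[] SkipEmpty(string text)
    {
        char[] delimiters = new char[] { ' ', ',' };
        return text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    // Splits on ", " first, then on a lone comma, then on a lone space.
    // The alternatives are tried left to right at each position.
    public static string[] WithRegex(string text)
    {
        return Regex.Split(text, ", |,| ");
    }
}
```

For the string used in Example 15-7, both methods return six substrings, with no empty entry between “Associates” and “Inc.”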