You are creating an XML string. Before adding a tag containing a text element, you want to check it to determine whether the string contains any of the following invalid characters:
<
>
"
'
&
If any of these characters are encountered, you want them to be replaced with their escaped form:
&lt;
&gt;
&quot;
&apos;
&amp;
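To make the mapping concrete, here is a minimal sketch of doing the replacement by hand. The class name XmlEscapeSketch and the helper EscapeXmlText are hypothetical names introduced for this illustration only; the approaches described in the rest of this section let the framework do this work for you.

```csharp
using System;
using System.Text;

public static class XmlEscapeSketch
{
    // Hypothetical helper: replaces each of the five characters
    // listed above with its predefined XML entity.
    public static string EscapeXmlText(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            switch (c)
            {
                case '<':  sb.Append("&lt;");   break;
                case '>':  sb.Append("&gt;");   break;
                case '"':  sb.Append("&quot;"); break;
                case '\'': sb.Append("&apos;"); break;
                case '&':  sb.Append("&amp;");  break;
                default:   sb.Append(c);        break;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(EscapeXmlText("<>\"'&"));
        // prints: &lt;&gt;&quot;&apos;&amp;
    }
}
```

Processing the string one character at a time avoids the double-escaping that can happen with chained Replace calls if the ampersand is not handled first.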
There are different ways to accomplish this, depending on which XML creation approach you are using. If you are using XmlTextWriter, the WriteCData and WriteElementString methods take care of this for you. If you are using XmlDocument and XmlElement, the XmlElement.InnerXml property (combined with a CDATA section) and the XmlElement.InnerText property will handle these characters.
There are two ways to handle this with an XmlTextWriter. The WriteCData method wraps the text containing the invalid characters in a CDATA section, as shown in the creation of the InvalidChars1 element in the example that follows. The WriteElementString method automatically escapes the text for you, as shown in the creation of the InvalidChars2 element:
    // set up a string with our invalid chars
    string invalidChars = @"<>&'";
    XmlTextWriter writer = new XmlTextWriter(Console.Out);
    writer.WriteStartElement("Root");
    writer.WriteStartElement("InvalidChars1");
    writer.WriteCData(invalidChars);
    writer.WriteEndElement();
    writer.WriteElementString("InvalidChars2", invalidChars);
    writer.WriteEndElement();
    writer.Close();
The output from this is (shown here with line breaks added for readability):
    <Root>
        <InvalidChars1><![CDATA[<>&']]></InvalidChars1>
        <InvalidChars2>&lt;&gt;&amp;'</InvalidChars2>
    </Root>
There are also two ways to handle this problem with XmlDocument and XmlElement. The first is to surround the text you are adding to the XML element with a CDATA section and assign it to the InnerXml property of the XmlElement, like this:
    // set up a string with our invalid chars
    string invalidChars = @"<>&'";
    XmlElement invalidElement1 = xmlDoc.CreateElement("InvalidChars1");
    invalidElement1.InnerXml = "<![CDATA[" + invalidChars + "]]>";
The second way is to let the XmlElement class escape the data for you by assigning the text directly to the InnerText property, like this:
    // set up a string with our invalid chars
    string invalidChars = @"<>&'";
    XmlElement invalidElement2 = xmlDoc.CreateElement("InvalidChars2");
    invalidElement2.InnerText = invalidChars;
The whole XmlDocument, including both XmlElement nodes, is created in this code:
    public static void HandlingInvalidChars()
    {
        // set up a string with our invalid chars
        string invalidChars = @"<>&'";
        XmlDocument xmlDoc = new XmlDocument();
        // create a root node for the document
        XmlElement root = xmlDoc.CreateElement("Root");
        xmlDoc.AppendChild(root);

        // create the first invalid character node
        XmlElement invalidElement1 = xmlDoc.CreateElement("InvalidChars1");
        // wrap the invalid chars in a CDATA section and use the
        // InnerXml property to assign the value, as it doesn't
        // escape the values, just passes in the text provided
        invalidElement1.InnerXml = "<![CDATA[" + invalidChars + "]]>";
        // append the element to the root node
        root.AppendChild(invalidElement1);

        // create the second invalid character node
        XmlElement invalidElement2 = xmlDoc.CreateElement("InvalidChars2");
        // add the invalid chars directly using the InnerText
        // property to assign the value, as it will automatically
        // escape the values
        invalidElement2.InnerText = invalidChars;
        // append the element to the root node
        root.AppendChild(invalidElement2);

        Console.WriteLine("Generated XML with Invalid Chars: {0}", xmlDoc.OuterXml);
        Console.WriteLine();
    }
The XML created by this procedure (and output to the console) looks like this, shown here with line breaks added for readability:
    <Root>
        <InvalidChars1><![CDATA[<>&']]></InvalidChars1>
        <InvalidChars2>&lt;&gt;&amp;'</InvalidChars2>
    </Root>
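As a possible alternative to concatenating the CDATA delimiters into a string for InnerXml, the CDATA node can be built with XmlDocument.CreateCDataSection and appended to the element. The following is only a sketch of that variation, not the approach used in the listing above; the class name CDataNodeSketch and the method BuildElement are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Xml;

public static class CDataNodeSketch
{
    // Hypothetical helper: creates an InvalidChars1 element whose
    // content is a CDATA node holding the supplied text.
    public static string BuildElement(string text)
    {
        XmlDocument xmlDoc = new XmlDocument();
        XmlElement element = xmlDoc.CreateElement("InvalidChars1");
        // CreateCDataSection builds the CDATA node directly, so no
        // markup string has to be assembled by hand
        element.AppendChild(xmlDoc.CreateCDataSection(text));
        return element.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(BuildElement("<>&'"));
        // prints: <InvalidChars1><![CDATA[<>&']]></InvalidChars1>
    }
}
```

With either technique, keep in mind that a CDATA section cannot itself contain the sequence ]]>, since that sequence marks the end of the section.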
One of the more interesting types of nodes is the CDATA node. A CDATA node lets you represent the items in the text section as character data rather than as escaped XML, for ease of entry. Normally these characters would need to be in their escaped format (&lt; for < and so on), but the CDATA section allows you to enter them as regular text.
When a CDATA section is used in conjunction with the InnerXml property of the XmlElement class, you can submit characters that would normally need to be escaped first. The XmlElement class also has an InnerText property that automatically escapes any markup found in the string assigned to it. This allows you to add these characters without having to worry about them.
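One way to convince yourself that the two techniques are equivalent from a reader's point of view is to reload the generated XML and compare the text of each element with the original string. The sketch below assumes the same element names as the example above; the class name RoundTripSketch and the method RoundTrips are hypothetical.

```csharp
using System;
using System.Xml;

public static class RoundTripSketch
{
    // Hypothetical helper: writes the text once through a CDATA section
    // and once through InnerText, reloads the XML, and reports whether
    // both elements read back as the original text.
    public static bool RoundTrips(string text)
    {
        XmlDocument xmlDoc = new XmlDocument();
        XmlElement root = xmlDoc.CreateElement("Root");
        xmlDoc.AppendChild(root);

        XmlElement viaCData = xmlDoc.CreateElement("InvalidChars1");
        viaCData.InnerXml = "<![CDATA[" + text + "]]>";
        root.AppendChild(viaCData);

        XmlElement viaEscaping = xmlDoc.CreateElement("InvalidChars2");
        viaEscaping.InnerText = text;
        root.AppendChild(viaEscaping);

        // parse the serialized form again, as a consumer of the XML would
        XmlDocument reloaded = new XmlDocument();
        reloaded.LoadXml(xmlDoc.OuterXml);

        foreach (XmlNode node in reloaded.DocumentElement.ChildNodes)
        {
            if (node.InnerText != text)
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrips("<>&'"));
    }
}
```

Both elements should yield the original string when read through InnerText, whether the text was stored in a CDATA section or as escaped character data.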