This chapter introduces serialization and deserialization, the mechanism by which objects can be represented in a flat-text or binary form. Unless otherwise stated, the types in this chapter all exist in the following namespaces:
System.Runtime.Serialization
System.Xml.Serialization
System.Text.Json
We cover the data contract serializer in an online supplement.
Serialization is the act of taking an in-memory object or object graph (set of objects that reference one another) and flattening it into a stream of bytes, XML, JSON, or a similar representation that can be stored or transmitted. Deserialization works in reverse, taking a data stream and reconstituting it into an in-memory object or object graph.
Serialization and deserialization are typically used to do the following:
Transmit objects across a network or application boundary
Store representations of objects within a file or database
Another, less common use is to deep-clone objects. You also can use the data contract and XML serialization engines as general-purpose tools for loading and saving XML files of a known structure, whereas the JSON serializer can do the same for JSON files.
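As a rough illustration of the deep-cloning use just mentioned, the following sketch round-trips an object through JsonSerializer. The DeepClone helper and the Customer and Address types are hypothetical names invented for this example. The copy contains only what the chosen serializer can see (public read/write properties, in the case of JsonSerializer).

```csharp
using System;
using System.Text.Json;

public class Address
{
    public string Street { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public Address Home { get; set; }
}

public static class CloneDemo
{
    // Hypothetical helper: serializes to JSON and immediately deserializes,
    // producing a new object graph with the same public property values.
    public static T DeepClone<T>(T source) =>
        JsonSerializer.Deserialize<T>(JsonSerializer.Serialize(source));

    public static void Main()
    {
        var original = new Customer
        {
            Name = "Stacey",
            Home = new Address { Street = "Odo St" }
        };
        Customer copy = DeepClone(original);

        Console.WriteLine(copy.Home.Street);                           // Odo St
        Console.WriteLine(ReferenceEquals(original.Home, copy.Home));  // False
    }
}
```

Because the clone is rebuilt from serialized data, anything the serializer skips (fields, nonpublic members) is not copied.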
.NET Core supports serialization and deserialization both from the perspective of clients wanting to serialize and deserialize objects, and from the perspective of types wanting some control over how they are serialized.
There are four serialization engines in .NET Core:
XmlSerializer (XML)
JsonSerializer (JSON)
The (somewhat redundant) data contract serializer (XML and JSON)
The binary serializer (binary)
If you’re serializing to XML, you can choose between XmlSerializer and the data contract serializer. XmlSerializer offers greater flexibility on how the XML is structured, whereas the data contract serializer has the unique ability to preserve shared object references.
If you’re serializing to JSON, you also have a choice. JsonSerializer offers the best performance, whereas the data contract serializer has a few extra features due to its longer heritage. However, if you need extra features, a better choice is likely to be the third-party Json.NET library.
If you need to interoperate with legacy SOAP-based web services, the data contract serializer is the best choice.
And if you don’t care about the format, the binary serialization engine is the most powerful and easiest to use. The output, however, is not human-readable and it’s less version-tolerant than the other serializers.
Table 17-1 compares each of the engines. More stars equate to a better score.
Feature | XmlSerializer | JsonSerializer | Data contract serializer | Binary serializer |
---|---|---|---|---|
Level of automation | **** | ***** | *** | ***** |
Output | XML | JSON | XML or JSON | Binary |
Type coupling | Loose | Loose | Loose | Tight |
Version tolerance | ***** | ***** | ***** | *** |
Can deserialize subtypes | With help | No | With help | Yes |
Preserves object references | No | No | With XML | Yes |
Can serialize nonpublic fields | No | No | Yes | Yes |
Suitable for interoperable messaging | Yes | Yes | Yes | No |
Flexibility in output format | **** | *** | ** | - |
Compact output | ** | *** | ** | **** |
Performance | * to *** | **** | *** | *** |
Note that the XML serialization engine requires that you recycle the same XmlSerializer object for good performance.
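One way to follow the advice about recycling the serializer is to hold a single XmlSerializer in a static field and reuse it for every call. The sketch below assumes a simple Person type; the PersonXml class and its ToXml and FromXml methods are hypothetical names for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Person
{
    public string Name;
    public int Age;
}

public static class PersonXml
{
    // Created once and reused by every call below.
    static readonly XmlSerializer _serializer = new XmlSerializer(typeof(Person));

    public static string ToXml(Person p)
    {
        using (var writer = new StringWriter())
        {
            _serializer.Serialize(writer, p);
            return writer.ToString();
        }
    }

    public static Person FromXml(string xml)
    {
        using (var reader = new StringReader(xml))
            return (Person)_serializer.Deserialize(reader);
    }

    public static void Main()
    {
        string xml = ToXml(new Person { Name = "Stacey", Age = 30 });
        Person p2 = FromXml(xml);
        Console.WriteLine(p2.Name + " " + p2.Age);   // Stacey 30
    }
}
```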
The reason for there being four engines is partly historical. The .NET Framework originally started out with two distinct goals in serialization:
Serializing .NET object graphs with full type and reference fidelity
Interoperating with XML and SOAP messaging standards
The first led to the binary serializer (which was used by .NET Remoting); the second led to the XmlSerializer (which was used by ASMX web services).
With the release of Windows Communication Foundation (WCF) in 2006, a new serialization engine was required—the data contract serializer—and it was hoped that the new engine could largely replace the older two. However, because its design focused heavily on features relevant to interoperable messaging, it never fully achieved this goal, and the two older engines remained useful.
WCF was designed to be format-neutral, but in practice it was shaped by the needs of complex SOAP protocols, which later lost popularity in favor of REST and JSON. This led, at first, to Microsoft adding JSON support to the data contract serializer, but eventually to the demise of WCF and its exclusion from .NET Core 3. The data contract serializer remains in .NET Core, although the exclusion of WCF has diminished its role, as has Microsoft’s addition of JsonSerializer to .NET Core 3. It’s expected that JsonSerializer will be enhanced in future .NET Core releases, further reducing the role of the data contract serializer.
The XML serialization engine can produce only XML, and it is less powerful than the binary and data contract serializers in saving and restoring a complex object graph (it cannot restore shared object references). It’s the most flexible of the four, however, in following an arbitrary output structure. For instance, you can choose whether properties are serialized to elements or attributes and the handling of a collection’s outer element. The XML engine also provides excellent version tolerance. XmlSerializer was used by the legacy ASMX web services.
The JSON serializer is fast and efficient, and was introduced relatively recently to .NET Core. It also offers good version tolerance and allows the use of custom converters for flexibility. JsonSerializer is used by ASP.NET Core 3, removing the dependency on Json.NET, though it is straightforward to opt back in to Json.NET should its features be required.
The data contract serializer supports a data contract model that helps you decouple the low-level details of the types you want to serialize from the structure of the serialized data. This provides excellent version tolerance, meaning you can deserialize data that was serialized from an earlier or later version of a type. You can even deserialize types that have been renamed or moved to a different assembly.
The data contract serializer can cope with most object graphs, although it can require more assistance than the binary serializer. You also can use it as a general-purpose tool for reading/writing XML files, if you’re flexible on how the XML is structured. (If you need to store data in attributes or cope with XML elements presenting in an arbitrary order, you cannot use the data contract serializer.)
The binary serialization engine is easy to use, highly automatic, and well supported throughout .NET Core 3 (and even more so in .NET Framework). Quite often, a single attribute is all that’s required to make a complex type fully serializable. The binary serializer is also faster than the data contract serializer when full type fidelity is needed. However, it tightly couples a type’s internal structure to the format of the serialized data, resulting in poor version tolerance (although it can tolerate the simple addition of a field). The binary engine emits only binary data; it cannot produce XML or JSON in .NET Core. (In .NET Framework, there’s a formatter for SOAP-based messaging that provides limited XML support.)
For complex XML serialization tasks, you can implement IXmlSerializable and do the serialization yourself with an XmlReader and XmlWriter. The IXmlSerializable interface is recognized both by XmlSerializer and by the data contract serializer, so you can use it selectively to handle the more complicated types. We describe XmlReader and XmlWriter in detail in Chapter 11.
The output of the data contract and binary serializers is shaped by a pluggable formatter. The role of a formatter is the same with both serialization engines, although they use completely different classes to do the job.
A formatter shapes the final presentation to suit a particular medium or context of serialization. In .NET Core, the data contract serializer lets you choose between XML and JSON formatters, and in .NET Framework you can also choose a binary formatter. A binary formatter is designed to work in a context for which an arbitrary stream of bytes will do—typically a file/stream or proprietary messaging packet. Binary output is usually smaller than XML or JSON.
The binary serializer offers only a binary formatter in .NET Core (in .NET Framework, there’s also a SOAP formatter for XML-based messaging).
Serialization and deserialization can be initiated in two ways.
The first is explicitly, by requesting that a particular object be serialized or deserialized. When you serialize or deserialize explicitly, you choose both the serialization engine and the formatter.
In contrast, implicit serialization is initiated by .NET. This happens when:
A serializer recursively serializes a child object.
You use a feature that relies on serialization, such as Web API.
Web API can work with either XML or JSON serialization.
Implicit serialization is less prevalent in .NET Core than in .NET Framework, which includes WCF (implicitly using the data contract serializer), Remoting (implicitly using the binary serialization engine), and ASMX Web Services (implicitly using XmlSerializer).
The XmlSerializer class in the System.Xml.Serialization namespace serializes and deserializes based on attributes in your classes.
To use XmlSerializer, you instantiate it and call Serialize or Deserialize with a Stream and object instance. To illustrate, suppose we define the following class:
public class Person { public string Name; public int Age; }
The following saves a Person to an XML file and then restores it:
Person p = new Person(); p.Name = "Stacey"; p.Age = 30; var xs = new XmlSerializer (typeof (Person)); using (Stream s = File.Create ("person.xml")) xs.Serialize (s, p); Person p2; using (Stream s = File.OpenRead ("person.xml")) p2 = (Person) xs.Deserialize (s); Console.WriteLine (p2.Name + " " + p2.Age); // Stacey 30
Serialize and Deserialize can work with a Stream, XmlWriter/XmlReader, or TextWriter/TextReader. Here’s the resultant XML:
<?xml version="1.0"?> <Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Name>Stacey</Name> <Age>30</Age> </Person>
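To illustrate the XmlWriter overload mentioned above, here is a minimal sketch that serializes to an in-memory string rather than a file. The XmlWriterDemo class and its ToIndentedXml method are hypothetical; the XmlWriterSettings shown (indentation, no XML declaration) are just one reasonable choice.

```csharp
using System;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class Person
{
    public string Name;
    public int Age;
}

public static class XmlWriterDemo
{
    public static string ToIndentedXml(Person p)
    {
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };
        var sb = new StringBuilder();

        // The writer is disposed (and therefore flushed) before sb is read.
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
            new XmlSerializer(typeof(Person)).Serialize(writer, p);

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToIndentedXml(new Person { Name = "Stacey", Age = 30 }));
    }
}
```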
XmlSerializer can serialize types without any attributes, such as our Person type. By default, it serializes all public fields and properties on a type. You can exclude members that you don’t want serialized by applying the XmlIgnore attribute:
public class Person { ... [XmlIgnore] public DateTime DateOfBirth; }
XmlSerializer relies on a parameterless constructor for deserialization, throwing an exception if one is not present. (In our example, Person has an implicit parameterless constructor.) This also means that field initializers execute prior to deserialization:
public class Person
{
  public bool Valid = true;    // Executes before deserialization
}
Although XmlSerializer can serialize almost any type, it recognizes the following types and treats them specially:
The primitive types, DateTime, TimeSpan, Guid, and nullable versions
byte[] (which is converted to base 64)
An XmlAttribute or XmlElement (whose contents are injected into the stream)
Any type implementing IXmlSerializable
Any collection type
The deserializer is version tolerant: it doesn’t complain if elements or attributes are missing or if superfluous data is present.
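A small sketch can make this tolerance concrete. The XML below omits Age and adds a Nickname element that has no counterpart in Person; the ToleranceDemo class, its Load method, and the Nickname element are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Person
{
    public string Name;
    public int Age;
}

public static class ToleranceDemo
{
    public static Person Load(string xml)
    {
        var xs = new XmlSerializer(typeof(Person));
        using (var reader = new StringReader(xml))
            return (Person)xs.Deserialize(reader);
    }

    public static void Main()
    {
        // <Age> is missing, and <Nickname> has no matching member in Person.
        string xml = "<Person><Name>Stacey</Name><Nickname>Stace</Nickname></Person>";
        Person p = Load(xml);

        Console.WriteLine(p.Name);   // Stacey
        Console.WriteLine(p.Age);    // 0 (the default, because no value was supplied)
    }
}
```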
By default, fields and properties serialize to an XML element. You can request an XML attribute be used, instead, as follows:
[XmlAttribute] public int Age;
You can control an element or attribute’s name as follows:
public class Person { [XmlElement ("FirstName")] public string Name; [XmlAttribute ("RoughAge")] public int Age; }
Here’s the result:
<Person RoughAge="30" ...> <FirstName>Stacey</FirstName> </Person>
The default XML namespace is blank. To specify an XML namespace, [XmlElement] and [XmlAttribute] both accept a Namespace argument. You can also assign a name and namespace to the type itself with [XmlRoot]:
[XmlRoot ("Candidate", Namespace = "http://mynamespace/test/")] public class Person { ... }
This names the Person element “Candidate” as well as assigning a namespace to this element and its children.
XmlSerializer writes elements in the order in which they’re defined in the class. You can change this by specifying an Order in the XmlElement attribute:
public class Person { [XmlElement (Order = 2)] public string Name; [XmlElement (Order = 1)] public int Age; }
If you use Order at all, you must use it throughout.
The deserializer is not fussy about the order of elements—they can appear in any sequence and the type will properly deserialize.
Suppose that your root type has two subclasses, as follows:
public class Person { public string Name; } public class Student : Person { } public class Teacher : Person { }
and you want to write a reusable method to serialize the root type:
public void SerializePerson (Person p, string path) { XmlSerializer xs = new XmlSerializer (typeof (Person)); using (Stream s = File.Create (path)) xs.Serialize (s, p); }
To make this method work with a Student or Teacher, you must inform XmlSerializer about the subclasses. There are two ways to do this. The first is to register each subclass by applying the XmlInclude attribute:
[XmlInclude (typeof (Student))] [XmlInclude (typeof (Teacher))] public class Person { public string Name; }
The second is to specify each of the subtypes when constructing XmlSerializer:
XmlSerializer xs = new XmlSerializer (typeof (Person), new Type[] { typeof (Student), typeof (Teacher) } );
In either case, the serializer responds by recording the subtype in the type attribute:
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="Student"> <Name>Stacey</Name> </Person>
The deserializer then knows from this attribute to instantiate a Student and not a Person.
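The behavior described above can be checked with a short round trip. This sketch uses the [XmlInclude] approach with an in-memory stream; the SubclassDemo class and its RoundTrip method are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlInclude(typeof(Student))]
[XmlInclude(typeof(Teacher))]
public class Person { public string Name; }
public class Student : Person { }
public class Teacher : Person { }

public static class SubclassDemo
{
    // Serializes through the base type and reads the result back.
    public static Person RoundTrip(Person p)
    {
        var xs = new XmlSerializer(typeof(Person));
        using (var ms = new MemoryStream())
        {
            xs.Serialize(ms, p);
            ms.Position = 0;                       // rewind before reading
            return (Person)xs.Deserialize(ms);
        }
    }

    public static void Main()
    {
        Person restored = RoundTrip(new Student { Name = "Stacey" });
        Console.WriteLine(restored.GetType().Name);   // Student
    }
}
```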
You can control the name that appears in the XML type attribute by applying [XmlType] to the subclass:
[XmlType ("Candidate")] public class Student : Person { }
Here’s the result:
<Person xmlns:xsi="..." xsi:type="Candidate">
XmlSerializer automatically recurses object references such as the HomeAddress field in Person:
public class Person { public string Name; public Address HomeAddress = new Address(); } public class Address { public string Street, PostCode; }
To demonstrate:
Person p = new Person { Name = "Stacey" }; p.HomeAddress.Street = "Odo St"; p.HomeAddress.PostCode = "6020";
Here’s the XML to which this serializes:
<Person ... > <Name>Stacey</Name> <HomeAddress> <Street>Odo St</Street> <PostCode>6020</PostCode> </HomeAddress> </Person>
If you have two fields or properties that refer to the same object, that object is serialized twice. If you need to preserve referential equality, you must use another serialization engine.
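The loss of referential equality is easy to observe. In the following sketch, a hypothetical WorkAddress field is added to Person and pointed at the same Address instance as HomeAddress; after a round trip, the two fields refer to separate objects. The SharedReferenceDemo class is likewise invented for this example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Address { public string Street, PostCode; }

public class Person
{
    public string Name;
    public Address HomeAddress;
    public Address WorkAddress;   // hypothetical second reference
}

public static class SharedReferenceDemo
{
    public static Person RoundTrip(Person p)
    {
        var xs = new XmlSerializer(typeof(Person));
        using (var ms = new MemoryStream())
        {
            xs.Serialize(ms, p);
            ms.Position = 0;
            return (Person)xs.Deserialize(ms);
        }
    }

    public static void Main()
    {
        var shared = new Address { Street = "Odo St", PostCode = "6020" };
        var p = new Person { Name = "Stacey", HomeAddress = shared, WorkAddress = shared };

        Person p2 = RoundTrip(p);
        Console.WriteLine(ReferenceEquals(p.HomeAddress, p.WorkAddress));     // True
        Console.WriteLine(ReferenceEquals(p2.HomeAddress, p2.WorkAddress));   // False
    }
}
```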
Suppose that you need to serialize a Person that can reference subclasses of Address, as follows:
public class Address { public string Street, PostCode; } public class USAddress : Address { } public class AUAddress : Address { } public class Person { public string Name; public Address HomeAddress = new USAddress(); }
There are two distinct ways to proceed, depending on how you want the XML structured. If you want the element name always to match the field or property name with the subtype recorded in a type attribute:
<Person ...> ... <HomeAddress xsi:type="USAddress"> ... </HomeAddress> </Person>
you use [XmlInclude] to register each of the subclasses with Address, as follows:
[XmlInclude (typeof (AUAddress))] [XmlInclude (typeof (USAddress))] public class Address { public string Street, PostCode; }
If, on the other hand, you want the element name to reflect the name of the subtype, to the following effect:
<Person ...> ... <USAddress> ... </USAddress> </Person>
you instead stack multiple [XmlElement] attributes onto the field or property in the parent type:
public class Person { public string Name; [XmlElement ("Address", typeof (Address))] [XmlElement ("AUAddress", typeof (AUAddress))] [XmlElement ("USAddress", typeof (USAddress))] public Address HomeAddress = new USAddress(); }
Each XmlElement maps an element name to a type. If you take this approach, you don’t require the [XmlInclude] attributes on the Address type (although their presence doesn’t break serialization).
XmlSerializer recognizes and serializes concrete collection types without intervention:
public class Person { public string Name; public List<Address> Addresses = new List<Address>(); } public class Address { public string Street, PostCode; }
Here’s the XML to which this serializes:
<Person ... > <Name>...</Name> <Addresses> <Address> <Street>...</Street> <PostCode>...</PostCode> </Address> <Address> <Street>...</Street> <PostCode>...</PostCode> </Address> ... </Addresses> </Person>
The [XmlArray] attribute lets you rename the outer element (i.e., Addresses).
The [XmlArrayItem] attribute lets you rename the inner elements (i.e., the Address elements).
For instance, the following class:
public class Person { public string Name; [XmlArray ("PreviousAddresses")] [XmlArrayItem ("Location")] public List<Address> Addresses = new List<Address>(); }
serializes to this:
<Person ... > <Name>...</Name> <PreviousAddresses> <Location> <Street>...</Street> <PostCode>...</PostCode> </Location> <Location> <Street>...</Street> <PostCode>...</PostCode> </Location> ... </PreviousAddresses> </Person>
The XmlArray and XmlArrayItem attributes also allow you to specify XML namespaces.
To serialize collections without the outer element, for example:
<Person ... > <Name>...</Name> <Address> <Street>...</Street> <PostCode>...</PostCode> </Address> <Address> <Street>...</Street> <PostCode>...</PostCode> </Address> </Person>
instead add [XmlElement] to the collection field or property:
public class Person { ... [XmlElement ("Address")] public List<Address> Addresses = new List<Address>(); }
The rules for subclassing collection elements follow naturally from the other subclassing rules. To encode subclassed elements with the type attribute, for example:
<Person ... > <Name>...</Name> <Addresses> <Address xsi:type="AUAddress"> ...
add [XmlInclude] attributes to the base (Address) type, as we did earlier. This works whether or not you suppress serialization of the outer element.
If you want subclassed elements to be named according to their type, for example:
<Person ... > <Name>...</Name> <!-- start of optional outer element --> <AUAddress> <Street>...</Street> <PostCode>...</PostCode> </AUAddress> <USAddress> <Street>...</Street> <PostCode>...</PostCode> </USAddress> <!-- end of optional outer element --> </Person>
you must stack multiple [XmlArrayItem] or [XmlElement] attributes onto the collection field or property.
Stack multiple [XmlArrayItem] attributes if you want to include the outer collection element:
[XmlArrayItem ("Address", typeof (Address))] [XmlArrayItem ("AUAddress", typeof (AUAddress))] [XmlArrayItem ("USAddress", typeof (USAddress))] public List<Address> Addresses = new List<Address>();
Stack multiple [XmlElement] attributes if you want to exclude the outer collection element:
[XmlElement ("Address", typeof (Address))] [XmlElement ("AUAddress", typeof (AUAddress))] [XmlElement ("USAddress", typeof (USAddress))] public List<Address> Addresses = new List<Address>();
Although attribute-based XML serialization is flexible, it has limitations. For instance, you cannot add serialization hooks—nor can you serialize nonpublic members. It’s also awkward to use if the XML might present the same element or attribute in a number of different ways.
On that last issue, you can push the boundaries somewhat by passing an XmlAttributeOverrides object into XmlSerializer’s constructor. There comes a point, however, when it’s easier to take an imperative approach. This is the job of IXmlSerializable:
public interface IXmlSerializable { XmlSchema GetSchema(); void ReadXml (XmlReader reader); void WriteXml (XmlWriter writer); }
Implementing this interface gives you total control over the XML that’s read or written.
A collection class that implements IXmlSerializable bypasses XmlSerializer’s rules for serializing collections. This can be useful if you need to serialize a collection with a payload (in other words, additional fields or properties that would otherwise be ignored).
The rules for implementing IXmlSerializable are as follows:
ReadXml should read the outer start element, then the content, and then the outer end element.
WriteXml should write just the content.
Here’s an example:
using System; using System.Xml; using System.Xml.Schema; using System.Xml.Serialization; public class Address : IXmlSerializable { public string Street, PostCode; public XmlSchema GetSchema() { return null; } public void ReadXml(XmlReader reader) { reader.ReadStartElement(); Street = reader.ReadElementContentAsString ("Street", ""); PostCode = reader.ReadElementContentAsString ("PostCode", ""); reader.ReadEndElement(); } public void WriteXml (XmlWriter writer) { writer.WriteElementString ("Street", Street); writer.WriteElementString ("PostCode", PostCode); } }
Serializing and deserializing an instance of Address via XmlSerializer automatically calls the WriteXml and ReadXml methods. Further, if Person were defined like this:
public class Person { public string Name; public Address HomeAddress; }
IXmlSerializable would be called upon selectively to serialize the HomeAddress field.
We describe XmlReader and XmlWriter at length in the first section of Chapter 11. Also in Chapter 11, in “Patterns for Using XmlReader/XmlWriter”, we provide examples of IXmlSerializable-ready classes.
JsonSerializer (in the System.Text.Json namespace) is straightforward to use because of the simplicity of the JSON format. The root of a JSON document is either an array or an object. Under that root are properties, whose values can be an object, array, string, number, or one of the literals true, false, or null. The JSON serializer directly maps class property names to property names in JSON.
Assuming Person is defined like this:
public class Person { public string Name { get; set; } }
we can serialize it to a JSON string by calling JsonSerializer.Serialize:
var p = new Person { Name = "Ian" }; string json = JsonSerializer.Serialize (p, new JsonSerializerOptions { WriteIndented = true });
Here is the result:
{ "Name": "Ian" }
The JsonSerializer.Deserialize method does the reverse, and deserializes:
Person p2 = JsonSerializer.Deserialize<Person> (json);
The JSON serializer ignores fields, and serializes only properties.
The JSON serializer requires that your properties have public get and set accessors, which means that it cannot deserialize immutable classes or structs whose properties are initialized through a constructor. This limitation might be relaxed in subsequent releases.
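To see the fields-versus-properties rule in action, consider this minimal sketch, which assumes the serializer’s default options. The Age field is a hypothetical addition to Person, and FieldsDemo is an invented class name.

```csharp
using System;
using System.Text.Json;

public class Person
{
    public string Name { get; set; }   // property: serialized
    public int Age;                    // public field: ignored by default
}

public static class FieldsDemo
{
    public static string Serialize(Person p) => JsonSerializer.Serialize(p);

    public static void Main()
    {
        Console.WriteLine(Serialize(new Person { Name = "Ian", Age = 40 }));
        // {"Name":"Ian"}
    }
}
```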
Suppose that we define Person to have a home and work Address:
public class Address { public string Street { get; set; } public string PostCode { get; set; } } public class Person { public string Name { get; set; } public Address HomeAddress { get; set; } public Address WorkAddress { get; set; } }
We can serialize this with no extra work:
var home = new Address { Street = "1 Main St.", PostCode="11235" }; var work = new Address { Street = "4 Elm Ln.", PostCode="31415" }; var p = new Person { Name = "Ian", HomeAddress = home, WorkAddress = work }; Console.WriteLine (JsonSerializer.Serialize (p, new JsonSerializerOptions { WriteIndented = true } ));
Upon encountering HomeAddress and WorkAddress, the serializer creates JSON objects:
{ "Name": "Ian", "HomeAddress": { "Street": "1 Main St.", "PostCode": "11235" }, "WorkAddress": { "Street": "4 Elm Ln.", "PostCode": "31415" } }
Note, though, what happens when we set HomeAddress and WorkAddress to the same object instance:
var p = new Person { Name = "Ian", HomeAddress = home, WorkAddress = home };
Here’s the output:
{ "Name": "Ian", "HomeAddress": { "Street": "1 Main St.", "PostCode": "11235" }, "WorkAddress": { "Street": "1 Main St.", "PostCode": "11235" } }
There is no information in the JSON to indicate that HomeAddress and WorkAddress were originally the same object instance. When deserialized, two separate instances of Address will be created and assigned to the respective properties.
This also means that JsonSerializer cannot handle cycles in the object graph. To illustrate, suppose that we add a Partner property to our Person class:
public class Person { ... public Person Partner { get; set; } }
The following throws a JsonException because sara and ian contain a reference to each other:
var sara = new Person { Name = "Sara" }; var ian = new Person { Name = "Ian", Partner = sara }; sara.Partner = ian; string json = JsonSerializer.Serialize (ian); // throws
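If a graph might contain cycles, one defensive option is to catch the exception rather than let it propagate. The following is only a sketch of that idea, using default serializer options; the CycleDemo class and its TrySerialize method are hypothetical.

```csharp
using System;
using System.Text.Json;

public class Person
{
    public string Name { get; set; }
    public Person Partner { get; set; }
}

public static class CycleDemo
{
    // Returns false instead of throwing when the graph cannot be serialized.
    public static bool TrySerialize(Person p, out string json)
    {
        try
        {
            json = JsonSerializer.Serialize(p);
            return true;
        }
        catch (JsonException)
        {
            json = null;
            return false;
        }
    }

    public static void Main()
    {
        var sara = new Person { Name = "Sara" };
        var ian = new Person { Name = "Ian", Partner = sara };
        sara.Partner = ian;   // creates the cycle

        Console.WriteLine(TrySerialize(ian, out _));   // False
    }
}
```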
JsonSerializer automatically serializes collections. Collections can appear in an object’s properties as well as in the root object itself. We can illustrate the latter by using the Person and Address classes that we defined at the beginning of the preceding section:
var sara = new Person { Name = "Sara" }; var ian = new Person { Name = "Ian" }; Console.WriteLine (JsonSerializer.Serialize (new[] { sara, ian }, new JsonSerializerOptions { WriteIndented = true }));
Here’s the result:
[ { "Name": "Sara" }, { "Name": "Ian" } ]
The following deserializes the JSON:
Person[] people = JsonSerializer.Deserialize<Person[]> (json);
It is possible to serialize a collection containing differently typed objects:
var sara = new Person { Name = "Sara" }; var addr = new Address { Street = "1 Main St.", PostCode = "11235" }; Console.WriteLine (JsonSerializer.Serialize (new object[] { sara, addr }, new JsonSerializerOptions { WriteIndented = true }));
This yields the following:
[ { "Name": "Sara" }, { "Street": "1 Main St.", "PostCode": "11235" } ]
Deserializing such collections is clumsy because the type of each element is not written into the JSON. You need to take the low-level approach of deserializing to JsonElement[] and then enumerating each property:
var deserialized = JsonSerializer.Deserialize<JsonElement[]>(json);
foreach (var element in deserialized)
{
  foreach (var prop in element.EnumerateObject())
    Console.WriteLine ($"{prop.Name}: {prop.Value}");
  Console.WriteLine ("---");
}
// Output:
// Name: Sara
// ---
// Street: 1 Main St.
// PostCode: 11235
// ---
We describe how to use JsonElement in “JsonDocument”.
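If you know which property distinguishes each kind of element, you can go one step further and rebuild typed objects from the JsonElement values. The sketch below assumes that only Address objects have a Street property; that rule, the MixedArrayDemo class, and its Load method are all inventions for this example rather than features of the serializer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class Person
{
    public string Name { get; set; }
}

public class Address
{
    public string Street { get; set; }
    public string PostCode { get; set; }
}

public static class MixedArrayDemo
{
    public static List<object> Load(string json)
    {
        var result = new List<object>();
        foreach (JsonElement element in JsonSerializer.Deserialize<JsonElement[]>(json))
        {
            string raw = element.GetRawText();

            // Assumption: an object with a "Street" property is an Address.
            if (element.TryGetProperty("Street", out _))
                result.Add(JsonSerializer.Deserialize<Address>(raw));
            else
                result.Add(JsonSerializer.Deserialize<Person>(raw));
        }
        return result;
    }

    public static void Main()
    {
        string json = "[{\"Name\":\"Sara\"},{\"Street\":\"1 Main St.\",\"PostCode\":\"11235\"}]";
        foreach (object item in Load(json))
            Console.WriteLine(item.GetType().Name);   // Person, then Address
    }
}
```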
You can control the serialization process with attributes defined in the System.Text.Json.Serialization namespace.
If the JSON property name differs from the C# property name, you can create a mapping with [JsonPropertyName]. For example, if the JSON property name is "FullName", and the C# property name is Name, we could create a mapping, as follows:
public class Person { [JsonPropertyName("FullName")] public string Name { get; set; } }
This serializes to the following:
{ "FullName":"..." }
Consider a web API that returns instances of a Person class and a client that uses the API. The two are maintained by different organizations. If the API author adds a new property to the Person class (such as Age), the client is still able to deserialize the JSON with its old Person class, because it will simply skip over the unknown Age property. However, suppose that the client then updates its instance of Person, serializes it, and sends it back to the API. The original Age value is then lost.
To illustrate, we’ll have the web API define Person as:
public class Person    // v2
{
  public int Id { get; set; }
  public string Name { get; set; }
  public int Age { get; set; }    // New property
}
which would generate JSON like this:
{ "Id": 27182, "Name": "Sara", "Age": 35 }
If we deserialize that JSON into an older version of the class (without the Age property):
public class Person    // v1
{
  public int Id { get; set; }
  public string Name { get; set; }
}
the age information has no place to go.
If we later serialize our version and send it back to the API, our JSON will not contain an Age property, and the API will interpret Age to be zero (the default value for an integer).
JsonExtensionDataAttribute solves that problem by providing a mechanism to store all unrecognized properties so that their values can be used when reserializing. When the attribute is placed on a property of type IDictionary<string,TValue> (TValue must be object or JsonElement), the serializer uses that property to persist the unrecognized JSON properties; no information is lost:
public class Person { public int Id { get; set; } public string Name { get; set; } [JsonExtensionData] public IDictionary<string, JsonElement> Storage { get; set; } = new Dictionary<string, JsonElement>(); }
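The following sketch traces the scenario end to end: the client’s class has no Age property, yet the value survives a deserialize, modify, and reserialize cycle. For simplicity it declares the storage property as a concrete Dictionary. The ExtensionDataDemo class and its RoundTrip method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Person   // the client's older shape: no Age property
{
    public int Id { get; set; }
    public string Name { get; set; }

    [JsonExtensionData]
    public Dictionary<string, JsonElement> Storage { get; set; }
        = new Dictionary<string, JsonElement>();
}

public static class ExtensionDataDemo
{
    // Deserializes, changes a known property, and serializes again.
    public static string RoundTrip(string json)
    {
        Person p = JsonSerializer.Deserialize<Person>(json);
        p.Name = p.Name.ToUpperInvariant();
        return JsonSerializer.Serialize(p);
    }

    public static void Main()
    {
        string fromApi = "{\"Id\":27182,\"Name\":\"Sara\",\"Age\":35}";
        Console.WriteLine(RoundTrip(fromApi));   // the output still carries "Age":35
    }
}
```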
Suppose that you need to interoperate with an API provider that encodes dates with the Unix timestamp format (number of seconds since 1/1/1970):
{
  "Id":27182,
  "Name":"Sara",
  "Born":464572800    // Number of seconds since 1/1/1970
}
We would like to deserialize this into a class that uses the .NET DateTime class:
public class Person { public int Id { get; set; } public string Name { get; set; } public DateTime Born { get; set; } }
We can achieve this by writing a custom data converter:
public class UnixTimestampConverter : JsonConverter<DateTime> { public override DateTime Read (ref Utf8JsonReader reader, Type type, JsonSerializerOptions options) { if (reader.TryGetInt32(out int timestamp)) return new DateTime (1970, 1, 1).AddSeconds (timestamp); throw new Exception ("Expected the timestamp as a number."); } public override void Write (Utf8JsonWriter writer, DateTime value, JsonSerializerOptions options) { int timestamp = (int)(value - new DateTime(1970, 1, 1)).TotalSeconds; writer.WriteNumberValue(timestamp); } }
Then we can either apply the [JsonConverter] attribute to the properties that we want to convert:
[JsonConverter(typeof(UnixTimestampConverter))] public DateTime Born { get; set; }
or, if the API is consistent in its representation of data types, make the converter act as a default:
JsonSerializerOptions opts = new JsonSerializerOptions(); opts.Converters.Add (new UnixTimestampConverter()); var sara = JsonSerializer.Deserialize<Person> (json, opts);
The latter instructs the serializer to use UnixTimestampConverter every time it encounters a DateTime.
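The converter shown earlier stores the timestamp in a 32-bit integer, which cannot represent dates beyond early 2038. As a possible variation, and not a replacement for the approach above, the following sketch reads a 64-bit value and relies on DateTimeOffset for the arithmetic. The UnixSecondsConverter name is hypothetical, and treating non-local DateTime values as UTC is an assumption of this example.

```csharp
using System;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

public class UnixSecondsConverter : JsonConverter<DateTime>
{
    public override DateTime Read(ref Utf8JsonReader reader, Type typeToConvert,
                                  JsonSerializerOptions options)
    {
        if (reader.TokenType == JsonTokenType.Number && reader.TryGetInt64(out long seconds))
            return DateTimeOffset.FromUnixTimeSeconds(seconds).UtcDateTime;
        throw new JsonException("Expected the timestamp as a number.");
    }

    public override void Write(Utf8JsonWriter writer, DateTime value,
                               JsonSerializerOptions options)
    {
        // Assumption: values not marked as local time are already in UTC.
        DateTime utc = value.Kind == DateTimeKind.Local
            ? value.ToUniversalTime()
            : DateTime.SpecifyKind(value, DateTimeKind.Utc);
        writer.WriteNumberValue(new DateTimeOffset(utc).ToUnixTimeSeconds());
    }
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime Born { get; set; }
}

public static class TimestampDemo
{
    public static JsonSerializerOptions CreateOptions()
    {
        var options = new JsonSerializerOptions();
        options.Converters.Add(new UnixSecondsConverter());
        return options;
    }

    public static void Main()
    {
        string json = "{\"Id\":27182,\"Name\":\"Sara\",\"Born\":464572800}";
        Person sara = JsonSerializer.Deserialize<Person>(json, CreateOptions());

        Console.WriteLine(sara.Born.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        // 1984-09-21

        // Reserializing writes Born back as the number 464572800.
        Console.WriteLine(JsonSerializer.Serialize(sara, CreateOptions()));
    }
}
```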
The serializer accepts an optional JsonSerializerOptions parameter, allowing additional control over the serialization and deserialization process. The following subsections present the most useful options.
We have set WriteIndented to true throughout this section to instruct the serializer to emit whitespace to generate more human-readable JSON. The default is false, which results in everything being crammed onto one line.
The JSON spec requires properties and array elements to be comma separated but does not allow trailing commas:
{ "Name":"Dylan", "LuckyNumbers": [10, 7, ], "Age":46, }
The trailing commas after 7 and 46 are not allowed by default. To enable them, do this:
var commaTolerant = JsonSerializer.Deserialize<Person> (brokenJson, new JsonSerializerOptions { AllowTrailingCommas = true });
By default, the deserializer throws an exception when encountering comments (because comments are not part of the official JSON standard). Setting ReadCommentHandling to JsonCommentHandling.Skip instructs the deserializer to skip over them instead, so the following can be successfully parsed:
{
  "Name":"Dylan" // Comment here
  /* This is another comment */
}
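In code, enabling this option might look like the following sketch. It assumes a Person class with a single Name property; the CommentDemo class and its Parse method are hypothetical.

```csharp
using System;
using System.Text.Json;

public class Person
{
    public string Name { get; set; }
}

public static class CommentDemo
{
    public static Person Parse(string json)
    {
        var options = new JsonSerializerOptions
        {
            ReadCommentHandling = JsonCommentHandling.Skip
        };
        return JsonSerializer.Deserialize<Person>(json, options);
    }

    public static void Main()
    {
        string json = @"{
  ""Name"":""Dylan"" // Comment here
  /* This is another comment */
}";
        Console.WriteLine(Parse(json).Name);   // Dylan
    }
}
```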
By default, the deserializer is case sensitive when matching JSON property names to C# property names. This means that the following input:
{ "name":"Dylan" }
would fail to populate the Name property in our Person class (the JSON property would be ignored).
Setting PropertyNameCaseInsensitive to true solves this problem by instructing the deserializer to perform case-insensitive matching (at a small performance cost):
var dylan = JsonSerializer.Deserialize<Person> (json, new JsonSerializerOptions { PropertyNameCaseInsensitive = true });
If the input has predictable casing, another solution is to use the JsonPropertyName attribute (described earlier) or the PropertyNamingPolicy option (described next).
To better support the popular camel-case property naming convention, .NET Core 3 introduced PropertyNamingPolicy
. It provides better performance than the just-described PropertyNameCaseInsensitive
option and applies to both serialization and deserialization. Thus, the code:
var dylan = new Person { Name = "Dylan" }; var json = JsonSerializer.Serialize (dylan, new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
yields:
{"name":"Dylan"}
which can be deserialized in the same way:
var dylan2 = JsonSerializer.Deserialize<Person> (json, new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
With the DictionaryKeyPolicy
option, you can force dictionary keys to serialize with camel casing (the policy is applied on serialization only; deserialized keys match the JSON as written):
var dict = new Dictionary<string, string> { { "BookName", "Nutshell" }, { "BookVersion", "8.0" } }; Console.WriteLine (JsonSerializer.Serialize (dict, new JsonSerializerOptions { WriteIndented = true, DictionaryKeyPolicy = JsonNamingPolicy.CamelCase }));
This outputs the following:
{ "bookName": "Nutshell", "bookVersion": "8.0" }
The default text encoder aggressively escapes characters such that the output can appear in an HTML document without additional processing:
string dylan = "<b>Dylan & Friends</b>"; Console.WriteLine (JsonSerializer.Serialize (dylan));
Here’s the output:
"\u003Cb\u003EDylan \u0026 Friends\u003C/b\u003E"
You can prevent this by changing the Encoder
:
Console.WriteLine (JsonSerializer.Serialize (dylan, new JsonSerializerOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping }));
This yields the following output:
"<b>Dylan & Friends</b>"
UnsafeRelaxedJsonEscaping
is a static property that returns an instance of System.Text.Encodings.Web.JavaScriptEncoder
. Should the need arise, you can implement your own subclass for complete control over the encoding process.
By default, null
property values are included in the JSON output, so:
var person = new Person { Name = null };
would serialize to:
{ "Name": null }
With IgnoreNullValues
set to true
, null-value properties are completely ignored:
Console.WriteLine (JsonSerializer.Serialize (person, new JsonSerializerOptions { IgnoreNullValues = true }));
Here’s the output:
{}
The binary serialization engine saves and restores objects with full type and reference fidelity, and you can use it to perform such tasks as saving and restoring objects to disk. The binary serializer is highly automated and can handle complex object graphs with minimum intervention. It’s not available, however, in Windows Store apps.
There are two ways to make a type support binary serialization. The first is attribute-based; the second involves implementing ISerializable
. Adding attributes is simpler; implementing ISerializable
is more flexible. You typically implement ISerializable
to do the following:
Dynamically control what gets serialized.
Make your serializable type friendly to being subclassed by other parties.
You can make a type serializable by applying a single attribute:
[Serializable] public sealed class Person { public string Name; public int Age; }
The [Serializable]
attribute instructs the serializer to include all fields in the type. This includes both private and public fields (but not properties). Every field must itself be serializable; otherwise, an exception is thrown. Primitive .NET types such as string
and int
support serialization (as do many other .NET types).
The Serializable
attribute is not inherited, so subclasses are not automatically serializable, unless also marked with this attribute.
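One way to see this for yourself is to query Type.IsSerializable through reflection. The following is only a sketch: the Student subclass and the InheritanceCheck class are hypothetical, and newer .NET releases flag IsSerializable as obsolete (it still compiles, with a warning):

```csharp
using System;

[Serializable]
public class Person
{
    public string Name;
    public int Age;
}

// Hypothetical subclass that is NOT marked [Serializable]
public class Student : Person
{
    public string Course;
}

public static class InheritanceCheck
{
    public static void Main()
    {
        Console.WriteLine (typeof (Person).IsSerializable);    // True
        Console.WriteLine (typeof (Student).IsSerializable);   // False
    }
}
```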
To serialize an instance of Person
, you instantiate BinaryFormatter
(in System.Runtime.Serialization.Formatters.Binary
) and call Serialize
.
.NET Framework also offers a SoapFormatter
that you can use in the same way to generate SOAP-compatible XML output. It’s less functional than BinaryFormatter
and it supports neither generic types nor the filtering of extraneous data necessary for version-tolerant serialization.
The following serializes a Person
with a BinaryFormatter
:
Person p = new Person() { Name = "George", Age = 25 }; IFormatter formatter = new BinaryFormatter(); using (FileStream s = File.Create ("serialized.bin")) formatter.Serialize (s, p);
All of the data necessary to reconstruct the Person
object is written to the file serialized.bin. The Deserialize
method restores the object:
using (FileStream s = File.OpenRead ("serialized.bin")) { Person p2 = (Person) formatter.Deserialize (s); Console.WriteLine (p2.Name + " " + p2.Age); // George 25 }
The deserializer bypasses all constructors and field initializers when re-creating objects. Behind the scenes, it calls FormatterServices.GetUninitializedObject
to do this job. You can call this method yourself to implement some very grubby design patterns!
The serialized data includes full type and assembly information, so if we tried to cast the result of deserialization to a matching Person
type in a different assembly, an error would result. The deserializer fully restores object references to their original state upon deserialization. This includes collections, which are just treated as serializable objects like any other (all collection types in System.Collections.*
are marked as serializable).
The binary engine can handle large, complex object graphs without special assistance (other than ensuring that all participating members are serializable). One thing to be wary of is that the serializer’s performance degrades in proportion to the number of references in your object graph. This can become an issue in a Remoting server that has to process many concurrent requests.
By default, all fields are serialized. You must explicitly mark any fields that you don’t want serialized, such as those used for temporary calculations or for storing file or window handles, with the [NonSerialized]
attribute:
[Serializable] public sealed class Person { public string Name; [NonSerialized] public int Age; }
This instructs the serializer to ignore the Age
member.
Nonserialized members are always empty or null
when deserialized—even if field initializers or constructors set them otherwise.
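A round trip through a MemoryStream makes this behavior easy to observe. The sketch below assumes a runtime on which BinaryFormatter is still enabled (such as .NET Framework or .NET Core 3.x, which this chapter targets); NonSerializedDemo and RoundTrip are hypothetical names:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public sealed class Person
{
    public string Name;
    [NonSerialized] public int Age;
}

public static class NonSerializedDemo
{
    // Hypothetical helper: serializes to memory and immediately deserializes
    public static Person RoundTrip (Person p)
    {
        IFormatter formatter = new BinaryFormatter();
        using (var s = new MemoryStream())
        {
            formatter.Serialize (s, p);
            s.Position = 0;                        // rewind before reading back
            return (Person) formatter.Deserialize (s);
        }
    }

    public static void Main()
    {
        Person copy = RoundTrip (new Person { Name = "George", Age = 25 });
        Console.WriteLine (copy.Name);   // George
        Console.WriteLine (copy.Age);    // 0 (Age was never written to the stream)
    }
}
```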
A method marked with the [OnDeserializing]
attribute fires just prior to deserialization and acts as a kind of constructor. This can be important because the binary deserializer bypasses all your normal constructors as well as field initializers.
In the following example, we define a field called Valid
, which we exclude from serialization with the [NonSerialized]
attribute:
[Serializable] public sealed class Person { public string Name; [NonSerialized] public bool Valid = true; public Person() => Valid = true; }
A deserialized Person
will never be Valid
—despite the constructor and field initializer both setting Valid
to true
. We can solve this by writing a method that acts as a special deserialization “constructor”:
[OnDeserializing] void OnDeserializing (StreamingContext context) => Valid = true;
The [OnSerializing]
and [OnSerialized]
attributes mark methods for execution before or after serialization.
[OnSerializing]
is useful for populating a field that’s used only for serialization. To illustrate, suppose that you want to make the following class serializable:
class Foo { public XDocument Xml; }
The difficulty is that XDocument
(in the System.Xml.Linq
namespace) is not itself serializable. We can solve this by applying the [NonSerialized]
attribute to the Xml
field and then defining an [OnSerializing]
method that writes the content of the XDocument
to a string field (that we do serialize):
[Serializable] class Foo { [NonSerialized] public XDocument Xml; string _xmlString; // used only for serialization [OnSerializing] void OnSerializing (StreamingContext context) => _xmlString = Xml.ToString(); }
The final step is to reconstruct the XDocument
when deserializing. We can do this by adding an [OnDeserialized]
method:
[OnDeserialized] void OnDeserialized (StreamingContext context) => Xml = XDocument.Parse (_xmlString);
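Putting the two hooks together, the complete class and a round trip might look like the following sketch. FooDemo, RoundTripTitle, and the sample XML are hypothetical, and the code assumes a runtime on which BinaryFormatter is still enabled:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
using System.Xml.Linq;

[Serializable]
class Foo
{
    [NonSerialized] public XDocument Xml;
    string _xmlString;   // used only for serialization

    [OnSerializing]
    void OnSerializing (StreamingContext context) => _xmlString = Xml.ToString();

    [OnDeserialized]
    void OnDeserialized (StreamingContext context) => Xml = XDocument.Parse (_xmlString);
}

public static class FooDemo
{
    // Hypothetical helper: round-trips a Foo and returns the restored <title> text
    public static string RoundTripTitle (string xml)
    {
        var foo = new Foo { Xml = XDocument.Parse (xml) };
        var formatter = new BinaryFormatter();
        using (var s = new MemoryStream())
        {
            formatter.Serialize (s, foo);          // fires OnSerializing first
            s.Position = 0;
            var copy = (Foo) formatter.Deserialize (s);   // fires OnDeserialized last
            return copy.Xml.Root.Element ("title").Value;
        }
    }

    public static void Main()
    {
        Console.WriteLine (RoundTripTitle ("<book><title>Nutshell</title></book>"));   // Nutshell
    }
}
```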
Adding or removing fields doesn’t break compatibility with already serialized data: the deserializer skips over data for which there’s no matching field. When adding a field, you can apply the following attribute to remind yourself that it might be absent from data serialized by an older version of the software:
[Serializable] public sealed class Person { public string Name; [OptionalField (VersionAdded = 2)] public DateTime DateOfBirth; }
This serves as documentation and has no effect on serialization semantics.
Implementing ISerializable
gives a type complete control over its binary serialization and deserialization.
Here’s the ISerializable
interface definition:
public interface ISerializable { void GetObjectData (SerializationInfo info, StreamingContext context); }
GetObjectData
fires upon serialization; its job is to populate the SerializationInfo
object (a name-value dictionary) with data from all fields that you want serialized. Here’s how we would write a GetObjectData
method that serializes two fields, called Name
and DateOfBirth
:
public virtual void GetObjectData (SerializationInfo info, StreamingContext context) { info.AddValue ("Name", Name); info.AddValue ("DateOfBirth", DateOfBirth); }
In this example, we’ve chosen to name each item according to its corresponding field. This is not required; you can use any name, but you must use the same name upon deserialization. The values themselves can be of any serializable type; the serialization will continue recursively as necessary. It’s legal to store null values in the dictionary.
It’s a good idea to make the GetObjectData
method virtual
—unless your class is sealed
. This allows subclasses to extend serialization without having to reimplement the interface.
SerializationInfo
also contains properties that you can use to control the type and assembly into which the instance should deserialize.
In addition to implementing ISerializable
, a type controlling its own serialization needs to provide a deserialization constructor that takes the same two parameters as GetObjectData
. The constructor can be declared with any accessibility and the runtime will still find it. Typically, though, you would declare it protected
so that subclasses can call it.
In the following example, we define Player
and Team
classes, following the principles of immutability (with everything read-only). But because the immutable collections are not serializable, we need to take control over the serialization process by implementing ISerializable
:
[Serializable] public class Player { public readonly string Name; public Player (string name) => Name = name; } [Serializable] public class Team : ISerializable { public readonly string Name; public readonly ImmutableList<Player> Players; // Not serializable! public Team (string name, params Player[] players) { Name = name; Players = players.ToImmutableList(); } // Serialize the object: public virtual void GetObjectData (SerializationInfo si, StreamingContext sc) { si.AddValue ("Name", Name); // Convert Players to an ordinary serializable array: si.AddValue ("PlayerData", Players.ToArray()); } // Deserialize the object: protected Team (SerializationInfo si, StreamingContext sc) { Name = si.GetString ("Name"); // Deserialize Players to an array to match our serialization: Player[] p = (Player[]) si.GetValue ("PlayerData", typeof (Player[])); // Construct a new immutable List using this array: Players = p.ToImmutableList(); } }
(You could also solve this problem by using the [OnSerializing]
and [OnDeserialized]
attributes that we discussed earlier.)
For commonly used types, the SerializationInfo
class has typed “Get” methods, such as GetString
, in order to make writing deserialization constructors easier. If you specify a name for which no data exists, an exception is thrown. This happens most often when there’s a version mismatch between the code doing the serialization and deserialization. You’ve added an extra field, for instance, and then forgotten about the implications of deserializing an old instance. To work around this problem, you can do either of the following:
Add exception handling around code that retrieves a data member added in a later version
Implement your own version numbering system; for example:
public string MyNewField; public virtual void GetObjectData (SerializationInfo si, StreamingContext sc) { si.AddValue ("_version", 2); si.AddValue ("MyNewField", MyNewField); ... } protected Team (SerializationInfo si, StreamingContext sc) { int version = si.GetInt32 ("_version"); if (version >= 2) MyNewField = si.GetString ("MyNewField"); ... }
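The first option (exception handling) could be sketched as follows. GetStringOrDefault and VersionTolerantReads are hypothetical names rather than part of .NET, and the SerializationInfo is built by hand here purely to simulate data written by an older version:

```csharp
using System;
using System.Runtime.Serialization;

public static class VersionTolerantReads
{
    // Hypothetical helper: returns the stored string, or a fallback
    // if the name is absent from the serialized data
    public static string GetStringOrDefault (SerializationInfo si, string name,
                                             string fallback = null)
    {
        try { return si.GetString (name); }
        catch (SerializationException) { return fallback; }
    }

    public static void Main()
    {
        // Simulate "old" data that contains Name but not MyNewField:
        var si = new SerializationInfo (typeof (object), new FormatterConverter());
        si.AddValue ("Name", "Lakers");

        Console.WriteLine (GetStringOrDefault (si, "Name"));                // Lakers
        Console.WriteLine (GetStringOrDefault (si, "MyNewField", "n/a"));   // n/a
    }
}
```

Inside a deserialization constructor, you would call such a helper in place of si.GetString for any member added in a later version. Compared with explicit version numbering, this is less code but hides genuine data corruption behind the fallback value.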
In the preceding examples, we sealed
the classes that relied on attributes for serialization. To see why, consider the following class hierarchy:
[Serializable] public class Person { public string Name; public int Age; } [Serializable] public sealed class Student : Person { public string Course; }
In this example, both Person
and Student
are serializable, and both classes use the default runtime serialization behavior because neither class implements ISerializable
.
Now imagine that the developer of Person
decides for some reason to implement ISerializable
and provide a deserialization constructor to control Person
serialization. The new version of Person
might look like this:
[Serializable] public class Person : ISerializable { public string Name; public int Age; public virtual void GetObjectData (SerializationInfo si, StreamingContext sc) { si.AddValue ("Name", Name); si.AddValue ("Age", Age); } protected Person (SerializationInfo si, StreamingContext sc) { Name = si.GetString ("Name"); Age = si.GetInt32 ("Age"); } public Person() {} }
Although this works for instances of Person
, this change breaks serialization of Student
instances. Serializing a Student
instance would appear to succeed, but the Course
field in the Student
type isn’t saved to the stream because the implementation of ISerializable.GetObjectData
on Person
has no knowledge of the members of the Student
-derived type. Additionally, deserialization of Student
instances throws an exception because the runtime is looking (unsuccessfully) for a deserialization constructor on Student
.
The solution to this problem is to implement ISerializable
from the outset for serializable classes that are public and nonsealed. (With internal
classes, it’s not so important because you can easily modify the subclasses later if required.)
If we started out by writing Person
, as in the preceding example, Student
would then be written as follows:
[Serializable] public class Student : Person { public string Course; public override void GetObjectData (SerializationInfo si, StreamingContext sc) { base.GetObjectData (si, sc); si.AddValue ("Course", Course); } protected Student (SerializationInfo si, StreamingContext sc) : base (si, sc) { Course = si.GetString ("Course"); } public Student() {} }
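To check that the subclass's data now survives, you could round-trip a Student as in the following sketch. The class definitions repeat the ones above so that the example is self-contained; StudentDemo, RoundTrip, and the sample values are hypothetical, and the code assumes a runtime on which BinaryFormatter is still enabled:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Person : ISerializable
{
    public string Name;
    public int Age;

    public virtual void GetObjectData (SerializationInfo si, StreamingContext sc)
    {
        si.AddValue ("Name", Name);
        si.AddValue ("Age", Age);
    }

    protected Person (SerializationInfo si, StreamingContext sc)
    {
        Name = si.GetString ("Name");
        Age = si.GetInt32 ("Age");
    }

    public Person() {}
}

[Serializable]
public class Student : Person
{
    public string Course;

    public override void GetObjectData (SerializationInfo si, StreamingContext sc)
    {
        base.GetObjectData (si, sc);       // let Person write its own members
        si.AddValue ("Course", Course);
    }

    protected Student (SerializationInfo si, StreamingContext sc) : base (si, sc)
    {
        Course = si.GetString ("Course");
    }

    public Student() {}
}

public static class StudentDemo
{
    // Hypothetical helper: serializes to memory and immediately deserializes
    public static Student RoundTrip (Student student)
    {
        var formatter = new BinaryFormatter();
        using (var s = new MemoryStream())
        {
            formatter.Serialize (s, student);
            s.Position = 0;
            return (Student) formatter.Deserialize (s);
        }
    }

    public static void Main()
    {
        Student copy = RoundTrip (new Student { Name = "Ann", Age = 20, Course = "Physics" });
        Console.WriteLine (copy.Name + " " + copy.Age + " " + copy.Course);   // Ann 20 Physics
    }
}
```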