C# – Deserialize JSON as a stream

Here’s an example of deserializing JSON from a file as a stream with System.Text.Json:

using System.Text.Json;

using var fileStream = new FileStream(@"D:\business.json", FileMode.Open, FileAccess.Read);

//async version
var business = await JsonSerializer.DeserializeAsync<Business>(fileStream);

//sync version
var business = JsonSerializer.Deserialize<Business>(fileStream);

Stream deserialization has three main benefits:

  • It’s memory-efficient, which improves overall performance.
  • It fails fast when there’s a problem in the JSON data.
  • The deserialization process can be canceled (async version only).

In this article, I’ll go into detail about these benefits and show a few other stream deserialization scenarios.

Benefits of deserializing as a stream

Performance

There are two ways to deserialize JSON:

  • Read it into a string, and then deserialize it.
  • Deserialize it as a stream.

Deserializing a stream uses far less memory. This is because it doesn’t need to allocate a big string object. I deserialized a 9 MB file and benchmarked the two approaches to compare the performance. Here are the results:

|     Method |     Mean |   StdDev | Memory    |
|----------- |---------:|---------:|----------:|
| Stream     | 114.4 ms | 1.00 ms  |      9 MB |
| String     | 119.0 ms | 7.19 ms  |     54 MB |

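The two approaches took about the same time (114.4 ms vs 119.0 ms), but the stream approach allocated 9 MB instead of 54 MB, about six times less. Allocating less memory means less work for the garbage collector, which helps overall application performance.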
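For reference, the two approaches being compared might look like the sketch below. This is not the benchmark code behind the table; the Coder class, the JsonLoading class, and the method names are hypothetical and only there for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical model type, used only for illustration.
public class Coder
{
    public int Id { get; set; }
    public string Language { get; set; } = "";
    public int YearsExperience { get; set; }
}

public static class JsonLoading
{
    // Approach 1: read the whole file into one string, then deserialize the string.
    public static List<Coder> LoadViaString(string path)
    {
        string json = File.ReadAllText(path);
        return JsonSerializer.Deserialize<List<Coder>>(json) ?? new List<Coder>();
    }

    // Approach 2: hand the stream to the serializer so no big string is allocated.
    public static List<Coder> LoadViaStream(string path)
    {
        using var fileStream = File.OpenRead(path);
        return JsonSerializer.Deserialize<List<Coder>>(fileStream) ?? new List<Coder>();
    }
}
```

Both methods return the same objects. The only difference is whether the full JSON text is held in memory as a string first.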

Fail fast

Deserializing as a stream allows you to detect errors as soon as possible and fail fast.

Here’s an example. Let’s say you have a JSON file with 100,000 objects, and the object with Id 10000 has corrupt data that will cause the whole deserialization process to fail:

...
{
  "Id": 9999,
  "Language": "JavaScript",
  "YearsExperience": 17
},
{
  "Id": 10000,
  "Language": "C#",
  "YearsExperience": "Bad data!"
},
{
  "Id": 10001,
  "Language": "Java",
  "YearsExperience": 14
},
...

During deserialization, it will throw the following exception:

System.Text.Json.JsonException: The JSON value could not be converted to System.Int32. Path: $.Coders[10000].YearsExperience | LineNumber: 50005 | BytePositionInLine: 36.

In this example, deserializing as a stream throws the exception 4x sooner and allocates 50x less memory than reading the file into a string first. The sooner it fails, the less time and memory are wasted on data you can’t use.
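If you want to handle this failure instead of letting it propagate, a minimal sketch follows. JsonException exposes the same Path, LineNumber, and BytePositionInLine values that appear in the message above. The Coder class, FailFastDemo class, and TryFindError method are hypothetical names used for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical model type, used only for illustration.
public class Coder
{
    public int Id { get; set; }
    public string Language { get; set; } = "";
    public int YearsExperience { get; set; }
}

public static class FailFastDemo
{
    // Returns true if deserialization failed, with the error location in 'description'.
    public static bool TryFindError(Stream utf8Json, out string description)
    {
        try
        {
            // Reads from the stream and throws as soon as it reaches a bad value.
            JsonSerializer.Deserialize<List<Coder>>(utf8Json);
            description = "";
            return false;
        }
        catch (JsonException ex)
        {
            description = $"Path: {ex.Path} | LineNumber: {ex.LineNumber} | BytePositionInLine: {ex.BytePositionInLine}";
            return true;
        }
    }
}
```

Logging the path and line number makes it much easier to find the bad record in a large file.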

Can be canceled

DeserializeAsync() accepts a CancellationToken, allowing you to cancel the potentially long-running deserialization process. Here’s an example of limiting the deserialization to 10 ms:

using var fileStream = new FileStream(@"D:\business.json", FileMode.Open, FileAccess.Read);

var timeoutAfter = TimeSpan.FromMilliseconds(10);
using var cancellationTokenSource = new CancellationTokenSource(timeoutAfter);

var business = await JsonSerializer.DeserializeAsync<Business>(fileStream,
    cancellationToken: cancellationTokenSource.Token);

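If deserialization hasn’t finished after 10 ms, it throws a TaskCanceledException. Because TaskCanceledException derives from OperationCanceledException, catching OperationCanceledException handles it too.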

Note: If you have a UI, you can use a CancellationToken to let the user trigger cancellation. That leads to a good user experience.
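Here’s a minimal sketch of handling the cancellation rather than letting the exception bubble up. It assumes the caller supplies the token (from a timeout or a Cancel button). The CancelableJson class, the LoadStatusAsync method, and the Dictionary<string, int> payload type are hypothetical and chosen only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public static class CancelableJson
{
    // Deserializes the stream and reports whether it finished or was canceled.
    public static async Task<string> LoadStatusAsync(Stream utf8Json, CancellationToken token)
    {
        try
        {
            var data = await JsonSerializer.DeserializeAsync<Dictionary<string, int>>(
                utf8Json, cancellationToken: token);
            return $"Loaded {data?.Count ?? 0} entries";
        }
        catch (OperationCanceledException)
        {
            // Also catches TaskCanceledException, which derives from OperationCanceledException.
            return "Canceled";
        }
    }
}
```

Catching the base OperationCanceledException type keeps the handler working regardless of which cancellation exception subtype is thrown.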

Get objects as they are deserialized from a JSON array

When you’re deserializing a JSON array and don’t need to keep all of the objects in memory, use DeserializeAsyncEnumerable() (available in .NET 6 and later).

Here’s an example of how this can be used. Let’s say you have a JSON array with lots of Coder objects:

[
  {
    "Id": 0,
    "Language": "C#",
    "YearsExperience": 3
  },
  {
    "Id": 1,
    "Language": "Java",
    "YearsExperience": 1
  },
  ...
  {
    "Id": 99999,
    "Language": "JavaScript",
    "YearsExperience": 15
  }
]

Here’s an example of using DeserializeAsyncEnumerable() to get one Coder object at a time without keeping all of the Coder objects in memory:

using System.Text.Json;

using var fileStream = new FileStream(@"D:\coders.json", FileMode.Open, FileAccess.Read);

await foreach (var coder in JsonSerializer.DeserializeAsyncEnumerable<Coder>(fileStream))
{
    ReviewCode(coder);
}

Reading from a stream is already memory-efficient. DeserializeAsyncEnumerable() goes further: it hands you objects as they’re deserialized instead of building the whole collection first, so it’s a good choice if you don’t need to keep all of the deserialized objects around.

Note: You can also use a CancellationToken with this method.
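As a sketch of how that might look, the method below passes a CancellationToken to DeserializeAsyncEnumerable() and computes a running total, so no list of Coder objects is ever built. The Coder class, the CoderStats class, and the SumYearsExperienceAsync method are hypothetical names used for illustration.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical model type, used only for illustration.
public class Coder
{
    public int Id { get; set; }
    public string Language { get; set; } = "";
    public int YearsExperience { get; set; }
}

public static class CoderStats
{
    // Sums YearsExperience across a JSON array, one Coder at a time.
    public static async Task<int> SumYearsExperienceAsync(Stream utf8Json, CancellationToken token = default)
    {
        int total = 0;

        await foreach (var coder in JsonSerializer.DeserializeAsyncEnumerable<Coder>(
            utf8Json, cancellationToken: token))
        {
            // A JSON null element in the array is yielded as null, so skip it.
            if (coder != null)
            {
                total += coder.YearsExperience;
            }
        }

        return total;
    }
}
```

Because each Coder goes out of scope after its loop iteration, the garbage collector is free to reclaim it while the rest of the array is still being read.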

Deserializing as a stream with Newtonsoft

If you’re using Newtonsoft instead of System.Text.Json, here’s how you’d deserialize JSON as a stream:

using Newtonsoft.Json;

using var fileReader = File.OpenText(@"D:\business.json");
using var jsonReader = new JsonTextReader(fileReader);

var serializer = new JsonSerializer();

var business = serializer.Deserialize<Business>(jsonReader);

This is equivalent to using the synchronous System.Text.Json.JsonSerializer.Deserialize(stream).
