C# – Deserialize JSON as a stream

Here’s an example of deserializing a JSON file as a stream with System.Text.Json:

    using System.Text.Json;

    using var fileStream = new FileStream(@"D:\business.json", FileMode.Open, FileAccess.Read);

    // Async version
    var business = await JsonSerializer.DeserializeAsync<Business>(fileStream);

    // Sync version (use one or the other, not both)
    var business = JsonSerializer.Deserialize<Business>(fileStream);

Stream deserialization has three main benefits:

  • It’s memory-efficient, which improves overall performance.
  • It fails fast when there’s a problem in the JSON data.
  • It can be canceled (async version only).

In this article, I’ll go into detail about these benefits and show a few other stream deserialization scenarios.

Benefits of deserializing as a stream

Performance

There are two ways to deserialize JSON:

  • Read it into a string, and then deserialize it.
  • Deserialize it as a stream.

Deserializing a stream uses far less memory. This is because it doesn’t need to allocate a big string object. To show the difference, I deserialized a 9 MB file and benchmarked the two approaches. Here are the results:

| Method | Mean     | StdDev  | Memory |
|--------|---------:|--------:|-------:|
| Stream | 114.4 ms | 1.00 ms |   9 MB |
| String | 119.0 ms | 7.19 ms |  54 MB |

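The stream approach used about 6x less memory (9 MB vs. 54 MB). It was also slightly faster on average, with more consistent timings (1.00 ms vs. 7.19 ms standard deviation). Allocating less memory means less work for the garbage collector, which tends to help overall performance.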
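To make the two approaches concrete, here’s a minimal sketch that deserializes the same file both ways. The Business class shape, the sample JSON, and the temp file name are assumptions for illustration; they aren’t the ones used in the benchmark above.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical sample data written to a temp file
var path = Path.Combine(Path.GetTempPath(), "business-sample.json");
File.WriteAllText(path, "{\"Name\":\"Contoso\",\"Employees\":25}");

// Approach 1: read the whole file into a string, then deserialize it.
// The entire JSON text has to be allocated as one string first.
string json = File.ReadAllText(path);
Business? fromString = JsonSerializer.Deserialize<Business>(json);

// Approach 2: deserialize directly from the stream.
// The serializer reads the bytes in chunks, so no big string is allocated.
Business? fromStream;
using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    fromStream = JsonSerializer.Deserialize<Business>(fileStream);
}

Console.WriteLine($"{fromString?.Name} / {fromStream?.Name}"); // prints "Contoso / Contoso"

File.Delete(path);

// Hypothetical shape (the article's Business class isn't shown)
public class Business
{
    public string? Name { get; set; }
    public int Employees { get; set; }
}
```

Both approaches produce the same object. The difference is only in how much memory is allocated along the way, which matters more as the file gets bigger.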

Fail fast

Deserializing as a stream allows you to detect errors as soon as possible and fail fast.

Here’s an example. Let’s say you have a JSON file with 100,000 objects, and the object with Id 10000 has corrupt data that will cause the whole deserialization process to fail:

    ...
    {
      "Id": 9999,
      "Language": "JavaScript",
      "YearsExperience": 17
    },
    {
      "Id": 10000,
      "Language": "C#",
      "YearsExperience": "Bad data!"
    },
    {
      "Id": 10001,
      "Language": "Java",
      "YearsExperience": 14
    },
    ...

During deserialization, it will throw the following exception:

System.Text.Json.JsonException: The JSON value could not be converted to System.Int32. Path: $.Coders[10000].YearsExperience | LineNumber: 50005 | BytePositionInLine: 36.

In this example, deserializing as a stream throws the exception 4x sooner and allocates 50x less memory than reading the file into a string first. Failing as soon as possible is always good.
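Here’s a minimal sketch of catching this kind of failure and reporting where it happened. JsonException exposes Path, LineNumber, and BytePositionInLine properties, which are the same values shown in the exception message above. The tiny in-memory JSON array and the Coder class shape are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical sample: the second object has a string where an int is expected
var json =
    "[" +
    "{\"Id\":0,\"Language\":\"C#\",\"YearsExperience\":3}," +
    "{\"Id\":1,\"Language\":\"Java\",\"YearsExperience\":\"Bad data!\"}" +
    "]";
using var stream = new MemoryStream(Encoding.UTF8.GetBytes(json));

bool failed = false;
try
{
    var coders = JsonSerializer.Deserialize<List<Coder>>(stream);
    Console.WriteLine($"Deserialized {coders?.Count} coders");
}
catch (JsonException ex)
{
    failed = true;
    // These properties locate the bad value in the JSON
    Console.WriteLine($"Bad JSON at {ex.Path}, line {ex.LineNumber}, byte {ex.BytePositionInLine}");
    Console.WriteLine(ex.Message);
}

// Assumed shape, based on the JSON properties shown in the article
public class Coder
{
    public int Id { get; set; }
    public string? Language { get; set; }
    public int YearsExperience { get; set; }
}
```

Logging the path and line number makes it much easier to find the bad record in a large file.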

Can be canceled

DeserializeAsync() accepts a CancellationToken, allowing you to cancel the potentially long-running deserialization process. Here’s an example of limiting the deserialization to 10 ms:

    using System.Text.Json;

    using var fileStream = new FileStream(@"D:\business.json", FileMode.Open, FileAccess.Read);

    // The token source cancels itself automatically after the timeout
    var timeoutAfter = TimeSpan.FromMilliseconds(10);
    using var cancellationTokenSource = new CancellationTokenSource(timeoutAfter);

    var business = await JsonSerializer.DeserializeAsync<Business>(fileStream,
        cancellationToken: cancellationTokenSource.Token);

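If deserialization is still running when the 10 ms timeout elapses, it will throw a TaskCanceledException. This derives from OperationCanceledException, so you can catch either one.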

Note: If you have a UI, you can use a CancellationToken to let the user trigger cancellation. That leads to a good user experience.
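Here’s a minimal sketch of handling the cancellation. To keep the outcome predictable, it cancels the token before starting and reads from an in-memory stream instead of a file. Both of those choices, along with the Business class shape, are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading;

// Hypothetical in-memory JSON standing in for the file
using var stream = new MemoryStream(Encoding.UTF8.GetBytes("{\"Name\":\"Contoso\"}"));

// Cancel up front so this sketch doesn't depend on timing
using var cancellationTokenSource = new CancellationTokenSource();
cancellationTokenSource.Cancel();

bool wasCanceled = false;
try
{
    _ = await JsonSerializer.DeserializeAsync<Business>(stream,
        cancellationToken: cancellationTokenSource.Token);
}
catch (OperationCanceledException)
{
    // TaskCanceledException derives from OperationCanceledException,
    // so this catch block handles both
    wasCanceled = true;
    Console.WriteLine("Deserialization was canceled");
}

// Hypothetical shape (the article's Business class isn't shown)
public class Business
{
    public string? Name { get; set; }
}
```

In a real app, you’d typically show a message or clean up in the catch block instead of letting the exception bubble up.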

Get objects as they are deserialized from a JSON array

When you’re deserializing a JSON array and don’t need to keep all of the objects in memory, use DeserializeAsyncEnumerable() (available in .NET 6 and later).

Here’s an example of how this can be used. Let’s say you have a JSON array with lots of Coder objects:

    [
      {
        "Id": 0,
        "Language": "C#",
        "YearsExperience": 3
      },
      {
        "Id": 1,
        "Language": "Java",
        "YearsExperience": 1
      },
      ...
      {
        "Id": 99999,
        "Language": "JavaScript",
        "YearsExperience": 15
      }
    ]

Here’s an example of using DeserializeAsyncEnumerable() to get one Coder object at a time without keeping all of the Coder objects in memory:

    using System.Text.Json;

    using var fileStream = new FileStream(@"D:\coders.json", FileMode.Open, FileAccess.Read);

    await foreach (var coder in JsonSerializer.DeserializeAsyncEnumerable<Coder>(fileStream))
    {
        ReviewCode(coder);
    }

Reading from a stream is already memory-efficient. Using DeserializeAsyncEnumerable() goes a step further: it hands you objects as they are deserialized, so you never have to hold the whole collection in memory. It’s a good choice if you don’t need to keep all of the deserialized objects around.

Note: You can also use a CancellationToken with this method.
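Here’s a minimal sketch of passing a CancellationToken to DeserializeAsyncEnumerable(). The in-memory JSON array, the 30 second timeout, and the Coder class shape are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading;

// Hypothetical in-memory JSON array standing in for coders.json
var json =
    "[" +
    "{\"Id\":0,\"Language\":\"C#\",\"YearsExperience\":3}," +
    "{\"Id\":1,\"Language\":\"Java\",\"YearsExperience\":1}," +
    "{\"Id\":2,\"Language\":\"JavaScript\",\"YearsExperience\":15}" +
    "]";
using var stream = new MemoryStream(Encoding.UTF8.GetBytes(json));

// Give up if enumerating takes longer than 30 seconds (arbitrary timeout)
using var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(30));

int totalYears = 0;
await foreach (var coder in JsonSerializer.DeserializeAsyncEnumerable<Coder>(stream,
    cancellationToken: cancellationTokenSource.Token))
{
    if (coder is null) continue; // a JSON null in the array yields null

    // Process each object as it arrives; nothing is accumulated in a list
    totalYears += coder.YearsExperience;
}

Console.WriteLine(totalYears); // prints 19

// Assumed shape, based on the JSON properties shown in the article
public class Coder
{
    public int Id { get; set; }
    public string? Language { get; set; }
    public int YearsExperience { get; set; }
}
```

Only the running total is kept here, which is the kind of scenario where this method fits well.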

Deserializing as a stream with Newtonsoft

If you’re using Newtonsoft instead of System.Text.Json, here’s how you’d deserialize JSON as a stream:

    using Newtonsoft.Json;

    using var fileReader = File.OpenText(@"D:\business.json");
    using var jsonReader = new JsonTextReader(fileReader);

    var serializer = new JsonSerializer();

    var business = serializer.Deserialize<Business>(jsonReader);

This is equivalent to using the synchronous System.Text.Json.JsonSerializer.Deserialize(stream).
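If you want to get objects one at a time from a JSON array with Newtonsoft, one common approach is to advance the JsonTextReader yourself and deserialize each object when you reach its start token. Here’s a minimal sketch of that. It requires the Newtonsoft.Json package, and the StringReader, sample JSON, and Coder class shape are assumptions for illustration (with a file, you’d use File.OpenText() as shown above).

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

// Hypothetical in-memory JSON array standing in for a file
var json =
    "[" +
    "{\"Id\":0,\"Language\":\"C#\",\"YearsExperience\":3}," +
    "{\"Id\":1,\"Language\":\"Java\",\"YearsExperience\":1}" +
    "]";
using var textReader = new StringReader(json);
using var jsonReader = new JsonTextReader(textReader);

var serializer = new JsonSerializer();

int count = 0;
while (jsonReader.Read())
{
    // Each StartObject token marks the beginning of one array element
    if (jsonReader.TokenType == JsonToken.StartObject)
    {
        // Reads just this object and leaves the reader at its end token
        var coder = serializer.Deserialize<Coder>(jsonReader);
        Console.WriteLine($"{coder?.Id}: {coder?.Language}");
        count++;
    }
}

// Assumed shape, based on the JSON properties shown in the article
public class Coder
{
    public int Id { get; set; }
    public string? Language { get; set; }
    public int YearsExperience { get; set; }
}
```

This is a synchronous counterpart to the DeserializeAsyncEnumerable() approach: only one Coder object is materialized at a time.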
