Whenever you’re dealing with code that can run into transient errors, it’s a good idea to implement retries. Transient errors, by definition, are temporary and subsequent attempts should succeed. When you retry with a delay, it means you think the transient error will go away by itself after a short period of time. When you retry without a delay, it means you’ll be changing something that should fix the problem so that the retries succeed.
The Polly .NET library helps simplify retries by abstracting away the retry logic, allowing you to focus on your own code. You can do retries with and without delays.
Here’s a simple example of using Polly to do retries with a delay. First you create a retry policy, and then you use it to execute the error-prone code:
//Build the policy
var retryPolicy = Policy.Handle<TransientException>()
    .WaitAndRetry(retryCount: 3, sleepDurationProvider: _ => TimeSpan.FromSeconds(1));
//Execute the error prone code with the policy
var attempt = 0;
retryPolicy.Execute(() =>
{
    Log($"Attempt {++attempt}");
    throw new TransientException();
});
This retry policy means when an exception of type TransientException is caught, it will delay 1 second and then retry. It will retry up to 3 times, so the code runs at most 4 times (the initial attempt plus 3 retries).
Running this outputs the following:
03:22:26.56244 Attempt 1
03:22:27.58430 Attempt 2
03:22:28.58729 Attempt 3
03:22:29.59790 Attempt 4
Unhandled exception. TransientException: Exception of type 'TransientException' was thrown.
Notice the last line. After the final attempt, it stopped retrying and let the exception bubble up.
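To get a feel for what the policy abstracts away, here is a rough hand-rolled sketch of the same wait-and-retry behavior. This is not how Polly is implemented internally. ExecuteWithRetry is a hypothetical helper written for illustration, and InvalidOperationException stands in for TransientException so the snippet compiles on its own:

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of Polly): runs 'action' and retries up to
// 'retryCount' times when it throws TException, sleeping 'delay' between attempts.
void ExecuteWithRetry<TException>(Action action, int retryCount, TimeSpan delay)
    where TException : Exception
{
    for (int retriesUsed = 0; ; retriesUsed++)
    {
        try
        {
            action();
            return; // success, so stop
        }
        catch (TException) when (retriesUsed < retryCount)
        {
            // Retries remain: wait, then loop around for the next attempt.
            Thread.Sleep(delay);
        }
        // When no retries remain, the filter is false and the exception bubbles up.
    }
}

var attempt = 0;
ExecuteWithRetry<InvalidOperationException>(() =>
{
    Console.WriteLine($"Attempt {++attempt}");
    if (attempt < 3) // simulate an error that goes away on the third attempt
        throw new InvalidOperationException("Simulated transient error");
}, retryCount: 3, delay: TimeSpan.FromMilliseconds(100));
```

In this sketch the third attempt succeeds, so nothing is thrown to the caller. Writing this loop by hand for every call site (plus async, logging, and delay calculation) is the boilerplate a retry policy saves you from.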
In this article, I’ll go into more details about how to use Polly to do retries. At the end, I’ll show a full example of retrying HttpClient requests with Polly.
Install Polly
If you haven’t already, install the Polly package by executing this command (this is using View > Other Windows > Package Manager Console):
Install-Package Polly
After that, to use Polly, add the following using statement:
using Polly;
Executing logic between retries with the onRetry parameter
The onRetry parameter allows you to pass in a lambda that will be executed between retries. There are many overloads to choose from. Use the one that makes the most sense in your scenario.
For example, let’s say you want to log retry information:
using Polly;
var MAX_RETRIES = 3;
//Build the policy
var retryPolicy = Policy.Handle<TransientException>()
    .WaitAndRetry(retryCount: MAX_RETRIES, sleepDurationProvider: (attemptCount) => TimeSpan.FromSeconds(attemptCount * 2),
    onRetry: (exception, sleepDuration, attemptNumber, context) =>
    {
        Log($"Transient error. Retrying in {sleepDuration}. {attemptNumber} / {MAX_RETRIES}");
    });
//Execute the error prone code with the policy
retryPolicy.Execute(() =>
{
    throw new TransientException();
});
This outputs the following:
04:11:18.25781 Transient error. Retrying in 00:00:02. 1 / 3
04:11:20.28769 Transient error. Retrying in 00:00:04. 2 / 3
04:11:24.29990 Transient error. Retrying in 00:00:06. 3 / 3
Unhandled exception. RetriesWithPolly.TransientException: Exception of type 'RetriesWithPolly.TransientException' was thrown.
Retry delay calculation
The sleepDurationProvider parameter allows you to pass in a lambda to control how long it’ll delay before doing a retry. Implement the retry delay calculation that makes the most sense in your situation.
This can be simple, like hardcoding a delay time:
_ => TimeSpan.FromSeconds(1)
You can use the attempt count in the calculation, like this:
(attemptCount) => TimeSpan.FromSeconds(attemptCount * 2)
The most complex calculation is the exponential backoff with jitter strategy (Note: This is implemented in the HttpClient example section below). This is useful if you have many concurrent requests because it spreads out retry attempts.
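To compare these strategies side by side, the following sketch evaluates three delay lambdas for the first three retries without involving Polly at all. The variable names are made up for illustration, and the exponential one deliberately leaves out jitter to keep the output deterministic:

```csharp
using System;

// Each lambda has the shape of a sleepDurationProvider: it takes the 1-based
// retry attempt number and returns how long to wait before that retry.
Func<int, TimeSpan> constantDelay = _ => TimeSpan.FromSeconds(1);
Func<int, TimeSpan> linearDelay = attemptCount => TimeSpan.FromSeconds(attemptCount * 2);
// Plain exponential backoff (no jitter): 1s, 2s, 4s, ...
Func<int, TimeSpan> exponentialDelay = attemptCount => TimeSpan.FromSeconds(Math.Pow(2, attemptCount - 1));

for (int attemptCount = 1; attemptCount <= 3; attemptCount++)
{
    Console.WriteLine($"Retry {attemptCount}: " +
        $"constant={constantDelay(attemptCount).TotalSeconds}s, " +
        $"linear={linearDelay(attemptCount).TotalSeconds}s, " +
        $"exponential={exponentialDelay(attemptCount).TotalSeconds}s");
}
// Prints:
// Retry 1: constant=1s, linear=2s, exponential=1s
// Retry 2: constant=1s, linear=4s, exponential=2s
// Retry 3: constant=1s, linear=6s, exponential=4s
```

Any of these lambdas could be passed as the sleepDurationProvider argument, since the parameter is just a function from attempt number to TimeSpan.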
Retry without delay
You should only retry if the attempt has a chance of succeeding. Some transient errors can be fixed by delaying for a short time. Other errors may require you to do something to fix the problem so that the retry attempt will work.
You can use the onRetry parameter to try to fix the problem before the next retry attempt.
For example, let’s say you’re implementing an algorithm to calculate predictions and it’s prone to transient errors. On retry attempts, you want to change the parameters to reduce the chances of transient errors during the next retry attempt:
using Polly;
int attempt = 0;
int speed = 15;
int airIntake = 15;
//Build the policy
var retryPolicy = Policy.Handle<TransientException>()
    .Retry(retryCount: 3,
    onRetry: (exception, attemptNumber) =>
    {
        //Change something to try to fix the problem
        speed -= 5;
        airIntake -= 5;
    });
//Execute the error prone code with the policy
retryPolicy.Execute(() =>
{
    Log($"Attempt #{++attempt} - CalculationPredictions(speed: {speed}, airIntake: {airIntake})");
    CalculatePredictions(speed, airIntake);
    Log("Completed calculations");
});
Note: The Fallback policy might have been a good option here, but the purpose of this is to show how to do retries without delaying.
This outputs the following:
Attempt #1 - CalculationPredictions(speed: 15, airIntake: 15)
Attempt #2 - CalculationPredictions(speed: 10, airIntake: 10)
Attempt #3 - CalculationPredictions(speed: 5, airIntake: 5)
Completed calculations
Full example – Retrying HttpClient requests with Polly
With HTTP requests, it’s not a question of if you’ll run into transient errors, but when. Therefore it makes sense to be prepared and implement retry logic.
There are many possible transient errors when making HTTP requests (such as timeouts). In this section, I’ll only try to handle one type of problem: the Too Many Requests error status code (429). I’ll show the client and service (stubbed to return the error response) code below and the results of running it. In addition, I’ll show the exponential backoff with jitter calculator class. This class is passed into the client so it can be used as the sleepDurationProvider Polly parameter.
WeatherClient – Retries HttpClient requests with Polly
When sending concurrent requests with HttpClient, it’s a good idea to use the same instance repeatedly. The WeatherClient contains this single HttpClient instance.
In addition, it creates and contains the AsyncRetryPolicy (Note: You could pass it in instead).
Finally, it executes the requests with HttpClient within the retry policy block. It calls EnsureSuccessStatusCode(), which throws an HttpRequestException if the response status code isn’t a success code (2xx). When the retry condition is met (the exception has status code 429), it retries the request.
using System.Net; //for HttpStatusCode
using Polly;
using Polly.Retry;
public class WeatherClient
{
    private readonly HttpClient httpClient;
    private readonly AsyncRetryPolicy retryPolicy;
    public WeatherClient(IRetryDelayCalculator retryDelayCalculator)
    {
        httpClient = new HttpClient();
        int MAX_RETRIES = 3;
        //Only handle HttpRequestException with status code 429 (Too Many Requests)
        retryPolicy = Policy.Handle<HttpRequestException>(ex => ex.StatusCode == HttpStatusCode.TooManyRequests)
            .WaitAndRetryAsync(
                retryCount: MAX_RETRIES,
                sleepDurationProvider: retryDelayCalculator.Calculate,
                onRetry: (exception, sleepDuration, attemptNumber, context) =>
                {
                    Log($"Too many requests. Retrying in {sleepDuration}. {attemptNumber} / {MAX_RETRIES}");
                });
    }
    private void Log(string message)
    {
        Console.WriteLine($"{DateTime.Now:hh:mm:ss.ffff} {message}");
    }
    public async Task<string> GetWeather()
    {
        return await retryPolicy.ExecuteAsync(async () =>
        {
            var response = await httpClient.GetAsync("https://localhost:12345/weatherforecast");
            //Throws HttpRequestException if the status code isn't a success code
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        });
    }
}
Note: You may have noticed this is checking HttpRequestException.StatusCode. This property was added in .NET 5 (finally!).
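To see which failures that exception filter actually matches, here is a small sketch that applies the same predicate to hand-built exceptions. The exception messages are made up for illustration. It assumes .NET 5 or later, where HttpRequestException has a constructor that accepts a status code:

```csharp
using System;
using System.Net;
using System.Net.Http;

// The same condition the retry policy uses to decide whether to handle an exception.
Func<HttpRequestException, bool> isTooManyRequests =
    ex => ex.StatusCode == HttpStatusCode.TooManyRequests;

// Roughly what EnsureSuccessStatusCode() throws for a 429 response (built by hand here).
var throttled = new HttpRequestException("429 (Too Many Requests)", null, HttpStatusCode.TooManyRequests);
// A different HTTP error status.
var serverError = new HttpRequestException("500 (Internal Server Error)", null, HttpStatusCode.InternalServerError);
// No HTTP response at all (for example, a connection failure): StatusCode is null.
var networkFailure = new HttpRequestException("Connection refused");

Console.WriteLine(isTooManyRequests(throttled));      // True
Console.WriteLine(isTooManyRequests(serverError));    // False
Console.WriteLine(isTooManyRequests(networkFailure)); // False
```

Because StatusCode is nullable, exceptions without a response simply don't match, so this policy lets them bubble up immediately instead of retrying.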
WeatherService – A service stub that intentionally returns errors
For this example, I created a web API service stub that randomly returns the Too Many Requests (status code 429) error response.
using System.Net; //for HttpStatusCode
using Microsoft.AspNetCore.Mvc;
[ApiController]
[Route("[controller]")]
public class WeatherForecastController : ControllerBase
{
    private static readonly string[] Summaries = new[]
    {
        "Freezing", "Bracing", "Chilly", "Cool", "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
    };
    [HttpGet]
    public IActionResult Get()
    {
        var rng = new Random();
        //Return the error response for roughly 1 out of 3 requests
        if (rng.Next() % 3 == 0)
            return StatusCode((int)HttpStatusCode.TooManyRequests);
        return Ok(Summaries[rng.Next(Summaries.Length)]);
    }
}
Note: In addition to using service stubs, you can use toxiproxy for simulating problems when you want to test your resiliency logic.
Retry delay calculation: Exponential backoff with jitter
If there are going to be many concurrent requests, then it makes sense to use the exponential backoff with jitter strategy. This spreads out retry attempts so that you’re not sending all of the retry attempts at once. It reduces pressure on the server, which decreases the chances of running into transient errors.
The class below implements this calculation: (1 second * 2^(attemptNumber - 1)) + random jitter from 10 ms up to (but not including) 200 ms.
public interface IRetryDelayCalculator
{
    public TimeSpan Calculate(int attemptNumber);
}
public class ExponentialBackoffWithJitterCalculator : IRetryDelayCalculator
{
    private readonly Random random;
    private readonly object randomLock;
    public ExponentialBackoffWithJitterCalculator()
    {
        random = new Random();
        randomLock = new object();
    }
    public TimeSpan Calculate(int attemptNumber)
    {
        int jitter = 0;
        lock (randomLock) //because Random is not threadsafe
            jitter = random.Next(10, 200);
        return TimeSpan.FromSeconds(Math.Pow(2, attemptNumber - 1)) + TimeSpan.FromMilliseconds(jitter);
    }
}
The following table shows the calculated delay ranges using the formula above:
Attempt # | Min delay | Max delay |
1 | 1.01 s | 1.2 s |
2 | 2.01 s | 2.2 s |
3 | 4.01 s | 4.2 s |
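The ranges in the table can be reproduced with a few lines of arithmetic. The sketch below is only a worked check of the formula. DelayRange and the two constants are hypothetical names, with the jitter bounds taken from random.Next(10, 200), which excludes the upper bound:

```csharp
using System;

// Jitter bounds from random.Next(10, 200): 10 is inclusive, 200 is exclusive.
const int MinJitterMs = 10;
const int MaxJitterMs = 199;

// Returns the smallest and largest delay the formula can produce for an attempt.
(TimeSpan Min, TimeSpan Max) DelayRange(int attemptNumber)
{
    TimeSpan baseDelay = TimeSpan.FromSeconds(Math.Pow(2, attemptNumber - 1));
    return (baseDelay + TimeSpan.FromMilliseconds(MinJitterMs),
            baseDelay + TimeSpan.FromMilliseconds(MaxJitterMs));
}

for (int attemptNumber = 1; attemptNumber <= 3; attemptNumber++)
{
    var (min, max) = DelayRange(attemptNumber);
    Console.WriteLine($"Attempt {attemptNumber}: {min.TotalMilliseconds} ms to {max.TotalMilliseconds} ms");
}
// Prints:
// Attempt 1: 1010 ms to 1199 ms
// Attempt 2: 2010 ms to 2199 ms
// Attempt 3: 4010 ms to 4199 ms
```

The maximums are 1 ms short of the rounded values in the table because the upper bound of Random.Next() is exclusive.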
Note: It needs a lock when calling Random.Next() because Random isn’t thread-safe. There’s only one instance of Random, and there could be multiple threads making requests concurrently. Therefore, the call to Random.Next() has to be locked.
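If you're targeting .NET 6 or later, one possible alternative to locking is Random.Shared, a built-in thread-safe Random instance. The sketch below is a variation on the calculation, not the code used for the results in this article. It's written as a standalone lambda (the name calculate is made up) that could be moved into an IRetryDelayCalculator implementation:

```csharp
using System;

// Assumes .NET 6+, where Random.Shared can be called from multiple threads
// concurrently, so no lock is needed around Next().
Func<int, TimeSpan> calculate = attemptNumber =>
{
    int jitterMs = Random.Shared.Next(10, 200); // 10 to 199 ms
    return TimeSpan.FromSeconds(Math.Pow(2, attemptNumber - 1)) + TimeSpan.FromMilliseconds(jitterMs);
};

// The exact value varies per call because of the random jitter.
Console.WriteLine($"Delay before retry 1: {calculate(1)}");
Console.WriteLine($"Delay before retry 3: {calculate(3)}");
```

The tradeoff is that you can't seed Random.Shared, so if you want reproducible delays in unit tests, injecting your own Random (with a lock) is still the more flexible option.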
Results
To show the results, I executed the following code several times to produce different output:
try
{
    var weatherClient = new WeatherClient(new ExponentialBackoffWithJitterCalculator());
    Log($"Weather={await weatherClient.GetWeather()}");
}
catch(Exception ex)
{
    Log($"Request failed. {ex.Message}");
}
Sometimes the server will return errors on every request attempt, and it’ll error out after 3 retry attempts:
01:14:11.4251 Too many requests. Retrying in 00:00:01.1470000. 1 / 3
01:14:12.5897 Too many requests. Retrying in 00:00:02.0570000. 2 / 3
01:14:14.6547 Too many requests. Retrying in 00:00:04.1780000. 3 / 3
01:14:19.1047 Request failed. Response status code does not indicate success: 429 (Too Many Requests).
Other times it’ll retry a few times and then succeed:
01:14:18.8450 Too many requests. Retrying in 00:00:01.0840000. 1 / 3
01:14:19.9461 Too many requests. Retrying in 00:00:02.0120000. 2 / 3
01:14:21.9674 Weather=Hot
Note: I called WeatherClient.GetWeather() in a console app to produce these results.