In an electrical system, a circuit breaker detects electrical problems and opens the circuit, which blocks electricity from flowing. To get electricity flowing again, you have to close the circuit. The same approach can be implemented in software when you’re sending requests to an external service.
This is especially important when you’re sending lots of concurrent requests. Without a circuit breaker, failing requests keep piling up against a service that can’t respond, and you can quickly exhaust resources (such as ports if you’re using HttpClient).
To implement the circuit breaker pattern, you have to detect error conditions that indicate the service is temporarily down and then trip the circuit. You have to keep the circuit open for a short period of time to block request attempts. Then you have to carefully determine when it’s safe to close the circuit to let requests go through again.
This is similar to the retry pattern. The difference is that a circuit breaker tracks failures across all requests that go through it and blocks them as a group, while a retry applies to an individual request.
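To make the three steps above concrete (detect errors, trip the circuit, decide when to close it), here’s a minimal hand-rolled sketch. It is not how Polly is implemented. SimpleCircuitBreaker, BreakerState, and CircuitOpenException are hypothetical names made up for this illustration. It assumes single-threaded use and treats every exception as a failure, and the clock is injectable so the break duration can be simulated:

```csharp
using System;

public enum BreakerState { Closed, Open, HalfOpen }

// Hypothetical exception type for this sketch: thrown when a call is rejected.
public class CircuitOpenException : Exception
{
    public CircuitOpenException() : base("Circuit is open; call rejected.") { }
}

// Simplified, single-threaded circuit breaker (illustration only, not Polly's code).
public class SimpleCircuitBreaker
{
    private readonly int failureThreshold;
    private readonly TimeSpan breakDuration;
    private readonly Func<DateTime> clock; // injectable so tests can move time forward
    private int consecutiveFailures;
    private DateTime openedAt;

    public BreakerState State { get; private set; } = BreakerState.Closed;

    public SimpleCircuitBreaker(int failureThreshold, TimeSpan breakDuration)
        : this(failureThreshold, breakDuration, () => DateTime.UtcNow) { }

    public SimpleCircuitBreaker(int failureThreshold, TimeSpan breakDuration, Func<DateTime> clock)
    {
        this.failureThreshold = failureThreshold;
        this.breakDuration = breakDuration;
        this.clock = clock;
    }

    public void Execute(Action action)
    {
        if (State == BreakerState.Open)
        {
            // Still inside the break: reject without calling the service.
            if (clock() - openedAt < breakDuration)
                throw new CircuitOpenException();

            // Break elapsed: let this one call through as a test.
            State = BreakerState.HalfOpen;
        }

        try
        {
            action();
        }
        catch
        {
            consecutiveFailures++;
            // A failed test call, or too many failures in a row, (re)opens the circuit.
            if (State == BreakerState.HalfOpen || consecutiveFailures >= failureThreshold)
            {
                State = BreakerState.Open;
                openedAt = clock();
            }
            throw;
        }

        // Success closes the circuit and resets the failure count.
        consecutiveFailures = 0;
        State = BreakerState.Closed;
    }
}
```

A real implementation also has to be thread-safe and reject concurrent calls while the test request is in flight, which is a big part of why using a library makes sense.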
Just like with retries, you can use the Polly .NET library to implement the circuit breaker pattern. It abstracts away the details so you can focus on your own code. When you use Polly, you configure a policy object and then use it to execute your code.
Here’s a simple example of configuring a Polly circuit breaker policy and executing code with it:
var circuitBreakerPolicy = Policy.Handle<TransientException>()
    .CircuitBreaker(exceptionsAllowedBeforeBreaking: 3, durationOfBreak: TimeSpan.FromSeconds(10));
while (true)
{
    try
    {
        circuitBreakerPolicy.Execute(() =>
        {
            SendRequest();
            Log("Successfully sent request");
        });
        return;
    }
    catch (BrokenCircuitException)
    {
        // The circuit is open. Wait longer than the 10-second break before trying again.
        Log("The circuit breaker tripped and is temporarily disallowing requests. Will wait before trying again");
        await Task.Delay(TimeSpan.FromSeconds(15));
    }
    catch (TransientException)
    {
        Log("Transient exception while sending request. Will try again.");
    }
}
Code language: C# (cs)
This tells Polly to trip the circuit for 10 seconds when it sees three TransientExceptions in a row.
Running this code outputs the following:
11:52:36.66007 Transient exception while sending request. Will try again.
11:52:36.67443 Transient exception while sending request. Will try again.
11:52:36.67645 Transient exception while sending request. Will try again.
11:52:36.67716 The circuit breaker tripped and is temporarily disallowing requests. Will wait before trying again
11:52:51.70436 Successfully sent request
Code language: plaintext (plaintext)
The TransientException was thrown three times in a row, so it tripped the circuit and kept it open for 10 seconds. The fourth attempt was rejected right away with a BrokenCircuitException. After the 15-second delay, the next attempt was allowed through (because the circuit was no longer open) and succeeded.
In this article, I’ll go into more detail about how the Polly circuit breaker policy works. At the end, I’ll show a full example of using the Polly circuit breaker with HttpClient.
Note: For more advanced error detection that uses sampling, use the AdvancedCircuitBreaker policy.
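As a rough sketch of what that looks like, here’s an AdvancedCircuitBreaker configuration. This assumes Polly v7’s synchronous API. The threshold values are arbitrary examples, AdvancedBreakerExample is a hypothetical wrapper class, and TransientException is a stand-in for your own exception type:

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

// Stand-in for the custom exception type used in the examples (assumption).
public class TransientException : Exception { }

public static class AdvancedBreakerExample
{
    public static CircuitBreakerPolicy Create() =>
        Policy.Handle<TransientException>()
            .AdvancedCircuitBreaker(
                failureThreshold: 0.5,                       // break when 50% or more of calls fail...
                samplingDuration: TimeSpan.FromSeconds(30),  // ...measured over a 30-second window...
                minimumThroughput: 10,                       // ...but only if at least 10 calls went through
                durationOfBreak: TimeSpan.FromSeconds(10));  // keep the circuit open for 10 seconds
}
```

Instead of counting consecutive exceptions, this trips based on the failure rate in the sampling window, which tends to suit high-throughput scenarios better.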
Install Polly
If you haven’t already, install the Polly NuGet package by executing this command in the Package Manager Console (View > Other Windows > Package Manager Console):
Install-Package Polly
Code language: PowerShell (powershell)
Circuit states
There are three main circuit states: Closed, Open, and Half-Open. These can be summarized in the following table:
| State | What it means |
| --- | --- |
| Closed | The circuit is allowing requests through. Just like a closed circuit allows electricity to flow through. |
| Open | The circuit tripped and isn’t allowing requests through right now. Just like an open circuit prevents electricity from flowing through. |
| HalfOpen | The next request that comes through will be used to test the service, while all other requests will be rejected. If the test request succeeds, the circuit will close. Otherwise it will open again for the configured duration. |
Note: There’s another state called “Isolated”. It’s only used when you manually trip the circuit.
Log circuit state changes
You can log circuit state changes by using the onBreak, onReset, and onHalfOpen callbacks, like this:
var circuitBreakerPolicy = Policy.Handle<TransientException>()
    .CircuitBreaker(exceptionsAllowedBeforeBreaking: 3, durationOfBreak: TimeSpan.FromSeconds(10),
        onBreak: (_, duration) => Log($"onBreak: Circuit open for duration {duration}"),
        onReset: () => Log("onReset: Circuit closed and is allowing requests through"),
        onHalfOpen: () => Log("onHalfOpen: Circuit is half-opened and will test the service with the next request"));
Code language: C# (cs)
Note: You can do anything in these callbacks, not just logging. I’m showing a logging example because this is a good way to learn about when these callbacks are fired.
Run the request in a loop, logging the circuit state before each request is attempted:
Log("Sending request");
Log($"CircuitState: {circuitBreakerPolicy.CircuitState}");
circuitBreakerPolicy.Execute(() =>
{
    SendRequest();
    Log("Successfully sent request");
});
Code language: C# (cs)
Note: For brevity, the error handling, additional logging, and delaying logic aren’t shown here.
The circuit is closed for the first three requests. The third request causes it to reach the error threshold and it trips the circuit. When this happens, the onBreak callback is executed:
01:48:00.74850 Sending request
01:48:00.76498 CircuitState: Closed
01:48:00.77115 Transient exception while sending request. Will try again.
01:48:00.77133 Sending request
01:48:00.77150 CircuitState: Closed
01:48:00.77171 Transient exception while sending request. Will try again.
01:48:00.77190 Sending request
01:48:00.77202 CircuitState: Closed
01:48:00.77463 onBreak: Circuit open for duration 00:00:10
01:48:00.77487 Transient exception while sending request. Will try again.
Code language: plaintext (plaintext)
The circuit is now open, and when the fourth request is executed, it throws a BrokenCircuitException:
01:48:00.77498 Sending request
01:48:00.77543 CircuitState: Open
01:48:00.77614 The circuit breaker tripped and is temporarily disallowing requests. Will wait before trying again
Code language: plaintext (plaintext)
The circuit breaker was configured to be open for 10 seconds. The request loop is waiting 15 seconds. After that, the fifth request is sent:
01:48:15.79555 Sending request
01:48:15.79615 onHalfOpen: Circuit is half-opened and will test the service with the next request
01:48:15.79633 CircuitState: HalfOpen
01:48:15.79676 Successfully sent request
01:48:15.79770 onReset: Circuit closed and is allowing requests through
Code language: plaintext (plaintext)
Notice the onHalfOpen callback wasn’t executed until the circuitBreakerPolicy object was interacted with. Logically, the circuit was open for 10 seconds and then half-open, so you might expect onHalfOpen to fire at the 10-second mark. Instead, it fired about 15 seconds after the circuit tripped, when the next request was sent. This shows that the state transition is evaluated lazily, so you shouldn’t rely on these callbacks for detecting state changes in real time.
In the half-open state, it tests the service with the first request and blocks all other requests. Since the request was successful, it closed the circuit, which fired the onReset callback.
An open circuit doesn’t automatically close after the duration
Let’s say you have the following circuit breaker policy:
var circuitBreakerPolicy = Policy.Handle<HttpRequestException>()
    .CircuitBreaker(exceptionsAllowedBeforeBreaking: 3, durationOfBreak: TimeSpan.FromSeconds(10));
Code language: C# (cs)
After it runs into three HttpRequestExceptions in a row, the circuit breaker will trip, opening the circuit for 10 seconds and blocking all requests that come in during that time.
After 10 seconds, it transitions to the half-open state. The first request that comes in during this state is used to test if it’s ok to close the circuit. If it succeeds, the circuit transitions to the closed state. If it fails, the circuit will be opened again for the configured duration. Meanwhile, any other requests that come in while it’s in the half-open state will run into the BrokenCircuitException.
This behavior makes sense. You don’t want to send tons of requests to an endpoint that’s potentially still down. This is especially true if you have no other throttling mechanism in place.
The exception count resets when there is a successful request
Let’s say you have the following circuit breaker policy that trips if it runs into three TransientExceptions in a row:
var circuitBreakerPolicy = Policy.Handle<TransientException>()
    .CircuitBreaker(exceptionsAllowedBeforeBreaking: 3, durationOfBreak: TimeSpan.FromSeconds(10));
Code language: C# (cs)
What happens if a TransientException happens and then a successful request is sent? It resets the error count.
For example, let’s say you send six requests and it’s successful every other time:
12:46:20.92701 Transient exception while sending request. Will try again.
12:46:20.92723 Successfully sent request
12:46:21.93395 Transient exception while sending request. Will try again.
12:46:21.93409 Successfully sent request
12:46:22.94494 Transient exception while sending request. Will try again.
12:46:22.94534 Successfully sent request
Code language: plaintext (plaintext)
If it weren’t resetting the error count, then the third TransientException would’ve tripped the circuit, and the request right after it would’ve been blocked.
It’s a good thing it resets the error count. Imagine if it didn’t do this. It would trip the circuit when the service was in a known good state (and potentially several hours after the first exception happened).
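To see the difference between the two counting strategies, here’s a small self-contained model. It is not Polly’s code. FailureCounting and its methods are hypothetical names, and each request outcome is represented as a bool (true = success, false = handled exception):

```csharp
using System;
using System.Linq;

public static class FailureCounting
{
    // Counts failures in a row. Any success resets the count to zero.
    // Returns true if 'threshold' consecutive failures ever occur.
    public static bool TripsWithReset(bool[] outcomes, int threshold)
    {
        int consecutive = 0;
        foreach (bool success in outcomes)
        {
            consecutive = success ? 0 : consecutive + 1;
            if (consecutive >= threshold) return true;
        }
        return false;
    }

    // Counts all failures and never resets.
    // Returns true if the total number of failures reaches 'threshold'.
    public static bool TripsWithoutReset(bool[] outcomes, int threshold)
        => outcomes.Count(success => !success) >= threshold;
}
```

With the alternating sequence from the log above (fail, succeed, fail, succeed, fail, succeed) and a threshold of 3, TripsWithReset returns false while TripsWithoutReset returns true.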
Manually change the circuit state
You can manually control the circuit state, closing or opening it as desired. There are many reasons why you might want to do this. Perhaps you know the endpoint is repaired and you want to immediately close the circuit to allow requests through again. Or maybe you’ve built in an admin kill switch that’ll trip the circuit on demand.
Close the circuit with policy.Reset()
To manually close the circuit, call policy.Reset().
For example, let’s say you don’t like the half-open state functionality so you want to bypass it. Here’s how you’d do that:
if (circuitBreakerPolicy.CircuitState == CircuitState.HalfOpen)
{
    circuitBreakerPolicy.Reset();
}
Code language: C# (cs)
Note: This also resets the error count.
Open the circuit with policy.Isolate()
To manually open the circuit to block requests, call policy.Isolate(). When you do this, it won’t close automatically. You have to call policy.Reset() to take it out of this isolated state. For example, let’s say you’ve built an admin control panel with pause / resume functionality:
Log("Admin is pausing requests");
circuitBreakerPolicy.Isolate();
Log("Admin is resuming requests");
circuitBreakerPolicy.Reset();
Code language: C# (cs)
Isolate() puts the circuit in the isolated state, which means it’s open and can only be closed again by calling Reset().
You can check if it’s in the isolated state by checking the CircuitState property:
catch (BrokenCircuitException)
{
    if (circuitBreakerPolicy.CircuitState == CircuitState.Isolated)
    {
        Log("Circuit was intentionally tripped by the admin. Will try again after requests are resumed.");
    }
}
Code language: C# (cs)
Note: You may want to handle BrokenCircuitException differently if you’re in isolated mode, since you know the circuit was intentionally opened.
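Another option is to catch Polly’s IsolatedCircuitException, which derives from BrokenCircuitException, in its own catch block. Here’s a sketch of that, assuming Polly v7’s synchronous CircuitBreakerPolicy. IsolationExample, Describe, and the returned strings are hypothetical names for illustration:

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

public static class IsolationExample
{
    // Runs the action through the policy and reports how the call ended.
    public static string Describe(CircuitBreakerPolicy policy, Action action)
    {
        try
        {
            policy.Execute(action);
            return "Succeeded";
        }
        catch (IsolatedCircuitException)
        {
            // The more specific type has to be caught first.
            return "Isolated";
        }
        catch (BrokenCircuitException)
        {
            // The circuit tripped automatically because of handled exceptions.
            return "Broken";
        }
    }
}
```

This keeps the “admin paused requests” path separate from the “service is failing” path without having to check the CircuitState property.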
Full example – Using circuit breaker with HttpClient
In this section, I’ll show a full example of using the Polly circuit breaker by using it with HttpClient to send requests to a service.
To simulate the service being temporarily unavailable, I’ve implemented a service stub that returns HTTP Status Code 404 (NotFound) when you tell it to. The client sends requests to this service and has configured the circuit breaker policy to look for this specific error code.
RandomNumberClient – Sends requests with HttpClient
First, here’s the client. This uses HttpClient to send requests to the service stub.
It configures the circuit breaker policy to look for three 404s in a row and then trip for 1 minute. It wires up all the callback parameters (onBreak, onReset, and onHalfOpen) to log when they happen.
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;
public class RandomNumberClient
{
    private readonly HttpClient HttpClient;
    private readonly string GetRandomNumberUrl;
    private readonly AsyncCircuitBreakerPolicy CircuitBreakerPolicy;
    public RandomNumberClient(string url)
    {
        GetRandomNumberUrl = $"{url}/RandomNumber/";
        HttpClient = new HttpClient();
        // Only 404s count toward tripping the circuit (HttpRequestException.StatusCode requires .NET 5 or later)
        CircuitBreakerPolicy = Policy.Handle<HttpRequestException>(httpEx => httpEx.StatusCode == HttpStatusCode.NotFound)
            .CircuitBreakerAsync(
                exceptionsAllowedBeforeBreaking: 3,
                durationOfBreak: TimeSpan.FromMinutes(1),
                onBreak: (_, duration) => Log($"Circuit tripped. Circuit is open and requests won't be allowed through for duration={duration}"),
                onReset: () => Log("Circuit closed. Requests are now allowed through"),
                onHalfOpen: () => Log("Circuit is now half-opened and will test the service with the next request"));
    }
    public async Task<string> GetRandomNumber()
    {
        try
        {
            return await CircuitBreakerPolicy.ExecuteAsync(async () =>
            {
                var response = await HttpClient.GetAsync(GetRandomNumberUrl);
                // Throws HttpRequestException for non-success status codes
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync();
            });
        }
        catch (HttpRequestException httpEx)
        {
            Log($"Request failed. StatusCode={httpEx.StatusCode} Message={httpEx.Message}");
            return "Failed";
        }
        catch (BrokenCircuitException ex)
        {
            // The circuit is open, so the request was never sent
            Log($"Request failed due to opened circuit: {ex.Message}");
            return "CircuitBroke";
        }
    }
    private void Log(string message)
    {
        Console.WriteLine($"{DateTime.Now:hh:mm:ss.fffff}\t{message}");
    }
}
Code language: C# (cs)
RandomNumberService – Returns errors when you tell it to
Here’s a snippet of the service stub. This is an alternative approach to using a tool like Toxiproxy to simulate service problems.
using System;
using Microsoft.AspNetCore.Mvc;
[ApiController]
[Route("[controller]")]
public class RandomNumberController : ControllerBase
{
    // Static so the mode can be changed while the service is running
    public static Mode Mode { get; set; } = Mode.Return200Ok;
    [HttpGet()]
    public ActionResult<string> Get()
    {
        Console.WriteLine($"Request received: GET /RandomNumber. Mode={Mode}");
        if (Mode == Mode.Return200Ok)
            return Ok(new Random().Next());
        return NotFound();
    }
}
public enum Mode
{
    Return200Ok,
    Return404NotFound
}
Code language: C# (cs)
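The snippet above doesn’t show how the mode gets changed. One way to do it is to read commands such as “set-mode Return404NotFound” from the console and parse them into the Mode enum. The following is only a sketch of that parsing step. ModeCommand and TryParse are hypothetical names, and the Mode enum is repeated here so the block compiles on its own:

```csharp
using System;

public enum Mode { Return200Ok, Return404NotFound }

public static class ModeCommand
{
    // Parses a line such as "set-mode Return404NotFound".
    // Returns false for anything that isn't a valid set-mode command.
    public static bool TryParse(string line, out Mode mode)
    {
        mode = default;
        string[] parts = (line ?? "").Split(' ', StringSplitOptions.RemoveEmptyEntries);
        return parts.Length == 2
            && parts[0] == "set-mode"
            && Enum.TryParse(parts[1], true, out mode)   // true = ignore case
            && Enum.IsDefined(typeof(Mode), mode);       // reject numeric values like "7"
    }
}
```

When parsing succeeds, the stub would assign the result to RandomNumberController.Mode, which is static so that the next request picks it up.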
Results
Start the RandomNumberService.
Stubbed endpoint: GET https://localhost:12345/RandomNumber
Commands:
set-mode Return200Ok
set-mode Return404NotFound
Current mode: Return200Ok
Code language: plaintext (plaintext)
Start the RandomNumberClient console app and send a request.
Press any key to send request
01:03:43.74248 Requesting random number
01:03:44.00662 Response: 1970545597
Code language: plaintext (plaintext)
Change the service mode to return errors.
set-mode Return404NotFound
Current mode: Return404NotFound
Code language: plaintext (plaintext)
Send multiple requests until the circuit trips.
01:07:10.88731 Request failed. StatusCode=NotFound Message=Response status code does not indicate success: 404 (Not Found).
01:07:10.88760 Response: Failed
01:07:17.24384 Requesting random number
01:07:17.25567 Request failed. StatusCode=NotFound Message=Response status code does not indicate success: 404 (Not Found).
01:07:17.25588 Response: Failed
01:07:18.10956 Requesting random number
01:07:18.11535 Circuit tripped. Circuit is open and requests won't be allowed through for duration=00:01:00
01:07:18.11568 Request failed. StatusCode=NotFound Message=Response status code does not indicate success: 404 (Not Found).
01:07:18.11587 Response: Failed
Code language: plaintext (plaintext)
Send another request while the circuit is still open.
01:08:14.91007 Requesting random number
01:08:14.91141 Request failed due to opened circuit: The circuit is now open and is not allowing calls.
01:08:14.91155 Response: CircuitBroke
Code language: plaintext (plaintext)
The request is blocked because the circuit is open. It immediately throws a BrokenCircuitException.
After 1 minute, send another request. This time the circuit will be in the half-open state. It’ll use the request to test the service to determine if the circuit should be closed or opened again:
01:10:12.55587 Requesting random number
01:10:12.55633 Circuit is now half-opened and will test the service with the next request
01:10:12.56626 Circuit tripped. Circuit is open and requests won't be allowed through for duration=00:01:00
01:10:12.56657 Request failed. StatusCode=NotFound Message=Response status code does not indicate success: 404 (Not Found).
01:10:12.56671 Response: Failed
Code language: plaintext (plaintext)
This request failed because the service is still in error mode. Because the request failed in half-opened mode, the circuit will be opened again and we’ll have to wait another minute.
Change the service mode to stop returning errors:
set-mode Return200Ok
Current mode: Return200Ok
Code language: plaintext (plaintext)
After 1 minute, send another request.
01:15:47.46013 Requesting random number
01:15:47.46052 Circuit is now half-opened and will test the service with the next request
01:15:47.47420 Circuit closed. Requests are now allowed through
01:15:47.47439 Response: 723841862
Code language: plaintext (plaintext)
It was in a half-opened state, so it used the request to test the service. The request was successful, so it fully closed the circuit, allowing future requests through.
Send a few more requests to see that they are allowed through.
01:18:12.82052 Requesting random number
01:18:12.83118 Response: 961049677
01:18:13.34879 Requesting random number
01:18:13.35227 Response: 280453765
Code language: plaintext (plaintext)