When you have code that calls an endpoint, you need to make sure it’s resilient and can handle error scenarios, such as timeouts.
One way to prove your code is resilient is by using toxiproxy to simulate bad behavior. Toxiproxy sits between your client code and the endpoint. It receives requests from your client, applies toxic behavior to simulate error scenarios, and then forwards the request to the real endpoint.
In this article I’ll explain how to install and use toxiproxy to simulate two error scenarios:
- The request taking too long and causing a client-side timeout.
- The request failing due to the endpoint being unavailable.
I’ll start with client code that has no error handling and show how it fails in the error scenarios, and then show how to handle the errors.
Note: In this article I’ll be referring to “C:/toxiproxy” as the install location, but you can put toxiproxy anywhere you want.
1 – Download toxiproxy client and server
- Go here: https://github.com/Shopify/toxiproxy/releases.
- Download the appropriate client and server for whatever OS you’re using.
- Put them in C:/toxiproxy
- Rename them to server.exe and client.exe.
In my case I’m using Windows 64-bit, and at the time of writing this, the latest version of toxiproxy was 2.1.4. So I grabbed the following two executables:
- toxiproxy-cli-windows-amd64.exe
- toxiproxy-server-windows-amd64.exe
2 – Configure toxiproxy to proxy requests to the real endpoint
- Create C:\toxiproxy\config.json
- Configure toxiproxy to work with your upstream endpoint. Let’s say you’re calling GET on a weather API running on 127.0.0.1:12345. In config.json, you would add the following:
[
  {
    "name": "weather",
    "listen": "127.0.0.1:12001",
    "upstream": "127.0.0.1:12345"
  }
]
Explanation of these settings:
Setting | Value | Explanation |
name | weather | How you’ll refer to this endpoint from the toxiproxy client. Use a short and simple name. |
listen | 127.0.0.1:12001 | This is the endpoint that toxiproxy listens for requests on. Note: Make sure the port is not blocked by the firewall. |
upstream | 127.0.0.1:12345 | This is the real endpoint. When toxiproxy receives requests on its listening endpoint, it forwards the requests to this upstream endpoint. |
3 – Run the toxiproxy server
From the command line, run server.exe and specify config.json.
./server -config config.json
Note: I’m using a bash terminal.
You should see the following output:
msg="Started proxy" name="weather" proxy="127.0.0.1:12001" upstream="127.0.0.1:12345"
msg="Populated proxies from file" config="config.json" proxies=1
msg="API HTTP server starting" host="localhost" port="8474" version="2.1.4"
Troubleshooting common toxiproxy server errors
Error | Solution |
An attempt was made to access a socket in a way forbidden by its access permissions. | Something else is already using the listener port specified in config.json. Find an available port and update the listener port in config.json, then restart server.exe. |
listen tcp 127.0.0.1:8474 Only one usage of each socket address is normally permitted | Toxiproxy has a listener on port 8474 (to receive commands from the toxiproxy client). This error usually means another instance of the toxiproxy server is already running and using port 8474, so shut down the other instance. Note: If a different program is using 8474, you don’t need to recompile anything. Start the server on another API port with its -port flag (for example, ./server -port 8475 -config config.json) and point the client at it with its -host flag (for example, ./client -host http://localhost:8475 toggle weather). Run each executable with -help to confirm the flags available in your version. |
If you’re seeing other strange behavior, such as traffic not getting through, make sure the firewall is not blocking you.
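Proxies don’t have to come from config.json. The toxiproxy server also exposes an HTTP API (the one the log above shows starting on port 8474), and posting a proxy definition to its /proxies route creates the same proxy at runtime. Here is a minimal sketch of that, assuming .NET Core 3.1 or later (for System.Text.Json) and the default API address. ToxiproxySetup and its method names are made up for this example.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class ToxiproxySetup
{
    // Assumption: the toxiproxy API is listening on its default address.
    public const string ApiBase = "http://localhost:8474";

    // Builds a JSON body with the same fields used in config.json.
    public static string BuildProxyJson(string name, string listen, string upstream)
    {
        return JsonSerializer.Serialize(new { name, listen, upstream, enabled = true });
    }

    // Sends the proxy definition to the running server.
    // Returns false if the server rejects it (for example, because the name is already taken).
    public static async Task<bool> CreateProxyAsync(
        HttpClient http, string name, string listen, string upstream)
    {
        var body = new StringContent(
            BuildProxyJson(name, listen, upstream), Encoding.UTF8, "application/json");

        using var response = await http.PostAsync($"{ApiBase}/proxies", body);
        return response.IsSuccessStatusCode;
    }
}
```

With the server running, calling ToxiproxySetup.CreateProxyAsync(httpClient, "weather", "127.0.0.1:12001", "127.0.0.1:12345") should be equivalent to the config.json entry from step 2. This is mostly useful when a test needs to set up its own proxies.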
4 – Update the weather client to use the toxiproxy listener endpoint, then start the weather client
I have this very simple client code that polls the weather API every 5 seconds. I’ll be referring to this as the weather client (to distinguish it from the toxiproxy client). It has no error handling. Currently, it’s pointing to the real upstream endpoint at 127.0.0.1:12345.
I changed it to point to the toxiproxy listener endpoint at 127.0.0.1:12001.
HttpClient httpClient = new HttpClient()
{
    Timeout = TimeSpan.FromSeconds(5)
};

while (true)
{
    Log("Getting weather");

    /*
     * Pointing to the real upstream endpoint
     var response = await httpClient.GetAsync("http://127.0.0.1:12345/weather");
    */

    //Pointing to toxiproxy listener endpoint
    var response = await httpClient.GetAsync("http://127.0.0.1:12001/weather");
    var content = await response.Content.ReadAsStringAsync();

    Log($"StatusCode={response.StatusCode} Weather={content}");

    await Task.Delay(TimeSpan.FromSeconds(5));
}
After changing the weather client to point to toxiproxy’s listener endpoint, start running the weather client.
At this point the weather client is going through toxiproxy and behaving normally. It’s polling the weather API every 5 seconds and showing this output:
08:10:24.435 Getting weather
08:10:24.438 StatusCode=OK Weather={"temperatureF":58,"description":"Sunny"}
08:10:29.446 Getting weather
08:10:29.450 StatusCode=OK Weather={"temperatureF":57,"description":"Sunny"}
5 – Use the toxiproxy client to simulate the endpoint being unavailable
The following command turns off the toxiproxy weather listening endpoint:
./client toggle weather
Proxy weather is now disabled
When the weather client tries to connect, it gets the following exception:
System.Net.Http.HttpRequestException: ‘No connection could be made because the target machine actively refused it.’
Inner Exception
SocketException: No connection could be made because the target machine actively refused it.
This crashes the weather client, because it has no error handling at all. Let’s fix that in the next step.
6 – Update the weather client to handle the unavailable endpoint scenario
To handle the unavailable endpoint error, we need to catch HttpRequestException and check its inner exception. The inner exception should be a SocketException whose ErrorCode equals SocketError.ConnectionRefused (10061).
Next, we need to think of an error handling strategy. I’m going to use a simple error handling strategy:
- When the endpoint is unavailable, try the failover URL.
- When the failover URL is unavailable, shut down the weather client.
For production code, I’d suggest using Polly for retry logic (and for circuit breaker logic). But for this example, I’ll implement the logic manually. Make sure to use whatever error handling strategy makes sense in your situation.
HttpClient httpClient = new HttpClient()
{
    Timeout = TimeSpan.FromSeconds(5)
};

bool failedOver = false;

//this is the toxiproxy url
string url = "http://127.0.0.1:12001/weather";
string failOverUrl = "http://127.0.0.1:12345/weather";

while (true)
{
    try
    {
        Log("Getting weather");

        var response = await httpClient.GetAsync(url);
        var content = await response.Content.ReadAsStringAsync();

        Log($"StatusCode={response.StatusCode} Weather={content}");
    }
    //Only catch the connection refused case; any other exception still propagates
    catch (HttpRequestException ex)
    when (ex.InnerException is SocketException se && se.ErrorCode == (int)SocketError.ConnectionRefused)
    {
        if (!failedOver)
        {
            Log("Endpoint is unavailable. Switching to failover url");
            url = failOverUrl;
            failedOver = true;
        }
        else
        {
            Log("Failover Url is unavailable. Shutting down!");
            return;
        }
    }

    await Task.Delay(TimeSpan.FromSeconds(5));
}
Note: This is using ‘exception filtering’ to conditionally catch exceptions having a specific error code.
Now run the weather client again and look at the output:
09:10:00.726 Getting weather
09:10:02.816 Endpoint is unavailable. Switching to failover url
09:10:07.816 Getting weather
09:10:07.842 StatusCode=OK Weather={"temperatureF":50,"description":"Sunny"}
It’s detecting the service unavailable scenario and using the failover URL to successfully get the weather.
This shows how convenient it is to use toxiproxy to simulate an endpoint unavailable scenario.
Note: This only shows one possible Error Code (10061 – Connection Refused). Make sure to think about other error codes that could happen and handle whatever ones make sense in your situation. Here is a reference to the different socket error codes you can run into: SocketError Enum.
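To make the note above more concrete, here is one way to group several socket errors under a single “endpoint unavailable” check. This is only a sketch: ConnectionErrors is a made-up helper, and the list of error codes is my assumption about what counts as unavailable, so adjust it to your situation. It compares the SocketErrorCode property instead of the numeric ErrorCode, because on non-Windows platforms the numeric value may be the native error number rather than 10061.

```csharp
using System;
using System.Net.Http;
using System.Net.Sockets;

public static class ConnectionErrors
{
    // Assumption: these are the socket errors we want to treat as "endpoint unavailable".
    private static readonly SocketError[] UnavailableErrors =
    {
        SocketError.ConnectionRefused,
        SocketError.HostUnreachable,
        SocketError.NetworkUnreachable,
        SocketError.HostNotFound,
        SocketError.TimedOut
    };

    // Returns true when an HttpRequestException wraps one of the socket errors listed above.
    public static bool IsEndpointUnavailable(Exception ex)
    {
        if (!(ex is HttpRequestException))
        {
            return false;
        }

        // Walk the inner exception chain, since the SocketException is not always the direct inner exception.
        for (Exception inner = ex.InnerException; inner != null; inner = inner.InnerException)
        {
            if (inner is SocketException se)
            {
                return Array.IndexOf(UnavailableErrors, se.SocketErrorCode) >= 0;
            }
        }

        return false;
    }
}
```

In the weather client, the exception filter would then become catch (HttpRequestException ex) when (ConnectionErrors.IsEndpointUnavailable(ex)).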
7 – Re-enable the toxiproxy endpoint and restart the client
Before going to the next error scenario, re-enable the weather endpoint by executing the following command:
./client toggle weather
You should see the following output:
Proxy weather is now enabled
Now restart the weather client. It should be working normally again.
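The toggle command from steps 5 and 7 works well at the terminal, but in an automated test you may want to flip the proxy from code. The toxiproxy HTTP API accepts a POST to /proxies/{name} with an enabled field for this. Below is a minimal sketch, assuming .NET Core 3.1 or later and the default API address of http://localhost:8474. ToxiproxyToggle is a made-up name. Unlike the toggle command, this sets the state explicitly instead of flipping it.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class ToxiproxyToggle
{
    // Assumption: the toxiproxy API is listening on its default address.
    public const string ApiBase = "http://localhost:8474";

    // Body for the proxy update request.
    public static string BuildEnabledJson(bool enabled)
    {
        return JsonSerializer.Serialize(new { enabled });
    }

    // Enables or disables the named proxy. Returns false if the server rejects the request.
    public static async Task<bool> SetEnabledAsync(HttpClient http, string proxyName, bool enabled)
    {
        var body = new StringContent(BuildEnabledJson(enabled), Encoding.UTF8, "application/json");

        using var response = await http.PostAsync($"{ApiBase}/proxies/{proxyName}", body);
        return response.IsSuccessStatusCode;
    }
}
```

A test could call ToxiproxyToggle.SetEnabledAsync(httpClient, "weather", false) before exercising the failover logic, and pass true afterwards to restore the proxy.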
8 – Use the toxiproxy client to cause timeouts
In the weather client I have specified a 5 second timeout in the HttpClient constructor:
HttpClient httpClient = new HttpClient()
{
    Timeout = TimeSpan.FromSeconds(5)
};
This means the weather client will time out if the request takes longer than 5 seconds.
To simulate a request taking a long time, we can use the toxiproxy client to add latency with the following command:
./client toxic add weather -t latency -a latency=6000
This will output:
Added downstream latency toxic 'latency_downstream' on proxy 'weather'
Now make sure the weather client is running. When it makes a request, toxiproxy will delay the response by 6 seconds, so the request will time out on the client side and throw the following exception:
System.Threading.Tasks.TaskCanceledException: ‘The operation was canceled.’
Let’s update the weather client to handle this exception and deal with the timeout scenario.
9 – Update the weather client to handle the timeout scenario
To handle timeouts coming from HttpClient, we need to catch TaskCanceledException and handle it appropriately. One common approach is to retry the request with a longer timeout. Of course, you’ll need to use the error handling strategy that makes sense for your situation.
I’m going to do a simple retry strategy:
- Start with a 5 second timeout.
- If a timeout happens, increase the timeout to 10 seconds for future requests.
To change the timeout, you can’t just change the HttpClient.Timeout property. That results in the following exception:
InvalidOperationException: This instance has already started one or more requests and can only be modified before sending the first request
It’s best practice to reuse the HttpClient object for all requests (instead of creating a new one for each request). This means we’ll need to use a CancellationTokenSource with a specified timeout and pass in a CancellationToken when making the request, like this:
int timeout = 5000;
int extraTimeout = 10_000;

//No Timeout set here, so HttpClient's default of 100 seconds still applies as an upper bound
HttpClient httpClient = new HttpClient();
bool failedOver = false;

//this is the toxiproxy url
string url = "http://127.0.0.1:12001/weather";
string failOverUrl = "http://127.0.0.1:12345/weather";

while (true)
{
    try
    {
        Log("Getting weather");

        //Cancels the request after the current timeout; disposed once the request finishes
        using (var timeoutSource = new CancellationTokenSource(timeout))
        {
            var response = await httpClient.GetAsync(url, timeoutSource.Token);
            var content = await response.Content.ReadAsStringAsync();

            Log($"StatusCode={response.StatusCode} Weather={content}");
        }
    }
    catch (HttpRequestException ex)
    when (ex.InnerException is SocketException se && se.ErrorCode == (int)SocketError.ConnectionRefused)
    {
        if (!failedOver)
        {
            Log("Endpoint is unavailable. Switching to failover url");
            url = failOverUrl;
            failedOver = true;
        }
        else
        {
            Log("Failover Url is unavailable. Shutting down!");
            return;
        }
    }
    catch (TaskCanceledException)
    {
        Log($"Timed out. Will try again with a {extraTimeout} millisecond timeout");
        timeout = extraTimeout;
    }

    await Task.Delay(TimeSpan.FromSeconds(5));
}
Now run the weather client.
10:10:36.710 Getting weather
10:10:41.749 Timed out. Will try again with a 10000 millisecond timeout
10:10:46.750 Getting weather
10:10:52.765 StatusCode=OK Weather={"temperatureF":59,"description":"Sunny"}
As you can see, it got a timeout as expected. Then it increased the timeout to 10 seconds and the second request was successful. If you look at the timestamps, you’ll notice it took ~6 seconds to get the response.
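One limitation of catching TaskCanceledException directly is that it can’t tell a timeout apart from a cancellation that your own code requested (for example, during shutdown). A common way around this is to link a timeout token with the caller’s token and check which one fired. The sketch below shows the idea, assuming C# 8 or later. TimeoutRequests and DelayHandler are made-up names, and DelayHandler is only a stand-in for a slow endpoint so you can try the helper without toxiproxy running.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutRequests
{
    // GET with a per-request timeout. A timeout surfaces as TimeoutException,
    // while a cancellation requested through callerToken still surfaces as OperationCanceledException.
    public static async Task<HttpResponseMessage> GetWithTimeoutAsync(
        HttpClient http, string url, TimeSpan timeout, CancellationToken callerToken = default)
    {
        using var timeoutSource = new CancellationTokenSource(timeout);
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(
            timeoutSource.Token, callerToken);

        try
        {
            return await http.GetAsync(url, linked.Token);
        }
        catch (OperationCanceledException)
            when (timeoutSource.IsCancellationRequested && !callerToken.IsCancellationRequested)
        {
            throw new TimeoutException($"GET {url} exceeded {timeout.TotalMilliseconds} ms");
        }
    }
}

// Stand-in for a slow endpoint: waits for the given delay, then returns 200 OK.
public sealed class DelayHandler : HttpMessageHandler
{
    private readonly TimeSpan _delay;

    public DelayHandler(TimeSpan delay) => _delay = delay;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        await Task.Delay(_delay, cancellationToken);
        return new HttpResponseMessage(HttpStatusCode.OK) { Content = new StringContent("{}") };
    }
}
```

In the polling loop, the request would become await TimeoutRequests.GetWithTimeoutAsync(httpClient, url, TimeSpan.FromMilliseconds(timeout)), and the timeout handler would catch TimeoutException instead of TaskCanceledException.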
10 – Use the toxiproxy client to remove the timeout behavior
First, inspect the weather proxy to see what the toxic is called.
./client inspect weather
This gives the following output:
latency_downstream type=latency stream=downstream toxicity=1.00 attributes=[ jitter=0 latency=6000 ]
This shows the toxic is referred to as “latency_downstream,” so to remove it, execute the following command:
./client toxic remove weather -n latency_downstream
You’ll see the following response:
Removed toxic 'latency_downstream' on proxy 'weather'
After removing this, you’ll notice that the weather client is back to normal and getting responses very quickly (a few milliseconds).
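Adding and removing the latency toxic can also be done from code through the toxiproxy HTTP API, which is useful when you want a test to set up and clean up its own toxics. The sketch below posts a toxic to /proxies/{name}/toxics and deletes it from /proxies/{name}/toxics/{toxicName}. It assumes .NET Core 3.1 or later and the default API address. ToxiproxyToxics is a made-up name.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class ToxiproxyToxics
{
    // Assumption: the toxiproxy API is listening on its default address.
    public const string ApiBase = "http://localhost:8474";

    // Body for a downstream latency toxic, matching "-t latency -a latency=6000" from step 8.
    public static string BuildLatencyToxicJson(int latencyMs)
    {
        return JsonSerializer.Serialize(new
        {
            type = "latency",
            stream = "downstream",
            attributes = new { latency = latencyMs }
        });
    }

    // Adds a latency toxic to the named proxy.
    public static async Task<bool> AddLatencyAsync(HttpClient http, string proxyName, int latencyMs)
    {
        var body = new StringContent(
            BuildLatencyToxicJson(latencyMs), Encoding.UTF8, "application/json");

        using var response = await http.PostAsync($"{ApiBase}/proxies/{proxyName}/toxics", body);
        return response.IsSuccessStatusCode;
    }

    // Removes a toxic by name, such as "latency_downstream" from the inspect output.
    public static async Task<bool> RemoveToxicAsync(HttpClient http, string proxyName, string toxicName)
    {
        using var response = await http.DeleteAsync(
            $"{ApiBase}/proxies/{proxyName}/toxics/{toxicName}");
        return response.IsSuccessStatusCode;
    }
}
```

For the scenario in this article, that would be ToxiproxyToxics.AddLatencyAsync(httpClient, "weather", 6000) before the test and ToxiproxyToxics.RemoveToxicAsync(httpClient, "weather", "latency_downstream") afterwards.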