C# – Remove a list of characters from a string

When you want to remove a list of characters from a string, loop through the list and use string.Replace():

var removalList = new List<char> { 'e', 'o' };
var input = "Hello World";
var cleanedInput = input;

foreach (char c in removalList)
{
    cleanedInput = cleanedInput.Replace(c.ToString(), string.Empty);
}

Console.WriteLine($"Before: {input}");
Console.WriteLine($"After: {cleanedInput}");

Note that string.Replace() returns a new string (because strings are immutable), so you have to assign the result back to the variable on each iteration.

Running this outputs the following:

Before: Hello World
After: Hll Wrld

This is the fastest approach in .NET 6+ (see the performance comparison results below).
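To illustrate the note above about string.Replace() returning a new string, here is a minimal sketch (the variable names are only illustrative). It shows that calling Replace() without using its return value leaves the original string untouched:

```csharp
using System;

var input = "Hello World";

// Replace() does not modify 'input'. The returned string is discarded here.
input.Replace("o", string.Empty);
Console.WriteLine(input); // Hello World

// Assigning the return value captures the new string.
var cleaned = input.Replace("o", string.Empty);
Console.WriteLine(cleaned); // Hell Wrld
```

This is why the loop reassigns cleanedInput on every iteration instead of just calling Replace() on it.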

LINQ approach: Where() + ToArray() + new string()

Another option for removing a list of characters is to use a LINQ one-liner:

  • Use Where() on the string to filter out the characters you don’t want. This gives you an IEnumerable<char>.
  • Use ToArray() to convert this to a char array.
  • Use new string() to convert the char array to a string.

Here’s the code:

using System.Linq;

var removalList = new List<char> { 'e', 'o' };
var input = "Hello World";

var cleanedInput = new string(input.Where(c => !removalList.Contains(c)).ToArray());

Console.WriteLine($"Before: {input}");
Console.WriteLine($"After: {cleanedInput}");

This outputs the following:

Before: Hello World
After: Hll Wrld

This is about 2x slower than the fastest approach (0.06 ms vs 0.03 ms on average in the comparison below), but it’s a one-liner, which is appealing in some cases.
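If the one-liner is the main appeal, a slightly shorter variant is possible. The sketch below swaps ToArray() + new string() for string.Concat(), which accepts the filtered sequence directly. This variant was not part of the performance comparison in this article, so treat its speed as unknown until you measure it yourself:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var removalList = new List<char> { 'e', 'o' };
var input = "Hello World";

// string.Concat() joins the filtered characters into one string,
// so the caller doesn't need ToArray() or new string().
var cleanedInput = string.Concat(input.Where(c => !removalList.Contains(c)));

Console.WriteLine($"Before: {input}");
Console.WriteLine($"After: {cleanedInput}"); // After: Hll Wrld
```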

StringBuilder + loop (fastest before .NET 6)

Before .NET 6, the fastest option was to loop through the string and append the characters you want to keep (i.e. those not in the removal list) to a StringBuilder. So if you’re on a version before .NET 6, use this approach.

Here’s an example of how to do that:

using System.Text;

var removalList = new List<char> { 'e', 'o' };

string input = "Hello World";

var sb = new StringBuilder();

foreach (var c in input)
{
    if (!removalList.Contains(c))
        sb.Append(c);
}

string cleanedInput = sb.ToString();

Console.WriteLine($"Before: {input}");
Console.WriteLine($"After: {cleanedInput}");


This outputs the following:

Before: Hello World
After: Hll Wrld

Performance comparison results

I showed three options for removing a list of characters from a string. I didn’t show how to do this with regex because it’s by far the slowest approach. To compare the performance, I ran 100k iterations removing a list of 15 characters from a string containing 2.5k characters. The following table summarizes the performance comparison:

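| Approach                                | Avg (ms) | Min (ms) | Max (ms) |
|-----------------------------------------|----------|----------|----------|
| string.Replace() in a loop              | 0.03     | 0.02     | 1.32     |
| StringBuilder in a loop                 | 0.04     | 0.03     | 4.35     |
| LINQ Where() + new string() + ToArray() | 0.06     | 0.04     | 4.19     |
| Regex                                   | 0.09     | 0.06     | 16.58    |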
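For reference, here is a rough sketch of what a regex version could look like. The exact regex code used in the comparison isn't shown in this article, so this is an assumption about one reasonable way to write it, not the benchmarked implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var removalList = new List<char> { 'e', 'o' };
var input = "Hello World";

// Build an alternation pattern such as "e|o" that matches any one of the characters.
// Regex.Escape() guards against characters that have special meaning in a pattern, such as '.' or '*'.
var pattern = string.Join("|", removalList.Select(c => Regex.Escape(c.ToString())));

// Replace every match with an empty string.
var cleanedInput = Regex.Replace(input, pattern, string.Empty);

Console.WriteLine($"Before: {input}");
Console.WriteLine($"After: {cleanedInput}"); // After: Hll Wrld
```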

List<char> is faster than using HashSet<char>

One surprising result is that List<char> is faster than HashSet<char> in every approach I compared. However, in every case, I used a list of only 15 characters. With so few characters, the benefits of HashSet<char> don’t outweigh its overhead costs. As the number of characters increases, I would expect HashSet<char> to eventually outperform List<char>.
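If you do have a large removal list, switching collections is a small change. Here is a minimal sketch of the StringBuilder approach using a HashSet<char> instead (the name removalSet is only illustrative). Whether it actually beats List<char> depends on the number of characters, so measure with your own data:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// HashSet<char> lookups are constant-time on average,
// while List<char>.Contains() scans the list linearly.
var removalSet = new HashSet<char> { 'e', 'o' };
var input = "Hello World";

var sb = new StringBuilder();

foreach (var c in input)
{
    // Keep only the characters that are not in the removal set.
    if (!removalSet.Contains(c))
        sb.Append(c);
}

string cleanedInput = sb.ToString();

Console.WriteLine($"Before: {input}");
Console.WriteLine($"After: {cleanedInput}"); // After: Hll Wrld
```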
