Algorithm Explained: Sum two big integers the hard way

Problem statement: Sum two big integers that are passed in as strings. Return the sum as a string. In other words, implement the following method:

string Sum(string a, string b)
Code language: C# (cs)

Constraint: Don’t use the built-in BigInteger class (note: this is the name in C# and may have a different name in other languages). Do it the hard way instead. If you want to get physically strong, you have to lift heavy weights. If you want to strengthen your algorithm problem solving skills, you have to challenge yourself and do things the hard way.

Note: For simplicity, the strings passed in will always be valid integers >= 0.

Approach

Why is it difficult to add two big integers?

This problem may seem simple at first. How can it be hard to add two numbers?

First, the longest number you can store is 18446744073709551615 (2^64 – 1) – using a 64-bit unsigned integer (ulong).

Second, when you’re adding big integers, if the sum is larger than the largest number you can store, then it’ll overflow. Here’s an example:

ulong c = 18446744073709551615;
ulong d = 1;
var sum = c + d;
Console.WriteLine(sum);
Code language: C# (cs)

This outputs 0 because of integer overflow.

Third, we’ve given ourselves the constraint of not using BigInteger. If I had to solve this problem in a real world project, I would most likely use BigInteger (unless I needed to optimize for performance), like this:

BigInteger a = BigInteger.Parse("18446744073709551615");
BigInteger b = BigInteger.Parse("18446744073709551615");
var sum = (a + b).ToString();
Console.WriteLine(sum);
Code language: C# (cs)

This outputs 36893488147419103230 as expected.

Using BigInteger makes this problem trivial and eliminates the challenge.

How do you add numbers manually?

Write 234 + 678 on paper and solve it. Chances are you did the same approach I was taught in school: line up the numbers and add the right-most digits together. If the sum is greater than 9, carry a 1 to the left and subtract 10 from the sum. Repeat until you’ve added all digits.

For example, here’s step by step how you’d add 234 + 678 using this approach:

1. Line up the numbers

234
678

2. Add the right-most digits

234
678 

4 + 8 = 12. Since this is greater than 9, carry a 1 to the next digits to the left and subtract 10 from 12, leaving 2.

  1
234
678
    2

3. Now move to the left and add the digits. Because there's a carry, you have to add it too.

  1
234
678
    2

In this case, it's adding 3 + 7 + 1 (carry digit) = 11.  Again, this is greater than 9, so carry a 1 to the left and subtract 10 from 1, leaving you with the following:

1  
234
678
  12

4. Now add the remaining digits, including the carry.

1  
234
678
  12

So add 2 + 6 + 1 (carry digit) = 9, leaving you with the final answer:
234
678
912

This manual approach is an algorithm and can be represented by the following pseudocode:

sum = ""
carry = 0

loop from right to left using loop variable i
    digitSum = a[i] + b[i] + carry

    if digitSum is greater than 9
       carry = 1
       digitSum = digitSum - 10
    else
       carry = 0

    prepend sum with digitSum
Code language: plaintext (plaintext)

In the next section, we’ll add more test cases and refine the pseudocode.

Test cases

When developing an algorithm, it helps to start with at least one test case. From there, you can add more test cases to refine the algorithm and make sure it works in general.

The first test case was shown above: When given “1234” and “5678”, expect the sum to be “6912”.

Here are more test cases:

InputExpected sum
“0” and “0”“0”

This tests the lower bound of the input range.
“18446744073709551615” and “18446744073709551615”“36893488147419103230”

This is an upper bound test and proves the code can handle integers larger than an 64-bit unsigned integer (ulong) can hold.

Without this test, all the other tests could pass with the code internally doing:
return Convert.ToUInt64(a) + Convert.ToUInt64(b)
“10” and “1”“11”

This tests that the code can handle input of different lengths.

If the pseudocode (as presented above) were implemented and passed this input, it would throw IndexOutOfRangeException.
“9” and “1”“10”

This tests what happens when a carry digit is left after you’ve looped over all the other digits.

If the pseudocode (as presented above) were implemented and passed this, it would return “0”, because the carry digit from adding 9+1 would be lost.

The pseudocode needs to be updated to handle some of these test cases we added.

sum = ""
carry = 0

pad a and b with 0's so they are the same length

loop from right to left using loop variable i
    digitSum = a[i] + b[i] + carry

    if digitSum is greater than 9
       carry = 1
       digitSum = digitSum - 10
    else
       carry = 0

    prepend sum with digitSum

if carry is 1
   prepend sum with carry
Code language: plaintext (plaintext)

Code

First, all of the test cases can be written in a single parameterized unit test:

[DataRow("0", "0", "0")]
[DataRow("1234", "5678", "6912")]
[DataRow("18446744073709551615", "18446744073709551615", "36893488147419103230")]
[DataRow("10", "1", "11")]
[DataRow("9", "1", "10")]
[TestMethod()]
public void SumTest(string a, string b, string expectedSum)
{
	//act
	var actualSum = MathUtil.Sum(a, b);

	//assert
	Assert.AreEqual(expectedSum, actualSum);
}
Code language: C# (cs)

Here’s the code that implements the algorithm. This is almost a 1-to-1 translation from the pseudocode to C# code:

public static string Sum(string a, string b)
{
	var sum = new StringBuilder();

	int carry = 0;

	if (a.Length != b.Length)
	{
		var maxLength = Math.Max(a.Length, b.Length);
		a = a.PadLeft(maxLength, '0');
		b = b.PadLeft(maxLength, '0');
	}

	for (int i = a.Length - 1; i >= 0; i--)
	{
		var digitSum = (a[i] - '0') + (b[i] - '0') + carry;

		if (digitSum > 9)
		{
			carry = 1;
			digitSum -= 10;
		}
		else
		{
			carry = 0;
		}

		sum.Insert(0, digitSum);
	}

	if (carry == 1)
		sum.Insert(0, carry);

	return sum.ToString();
}
Code language: C# (cs)

This loops through the strings, getting the integer values from the characters, and then adds them together. It sticks the sum of each pair to a StringBuilder (for performance reasons).

Performance vs BigInteger approach

Using BigInteger is the easy way to solve this problem. It reduces down a to single line:

public static string Sum(string a, string b)
{
	return (BigInteger.Parse(a) + BigInteger.Parse(b)).ToString();
}
Code language: C# (cs)

I used the following code to compare the performance using strings with 100,001 digits:

var sw = new Stopwatch();
sw.Start();
var a = "1" + new string('0', 100000);
var b = "1" + new string('0', 100000);
var expectedSum = "2" + new string('0', 100000);

//act
var actualSum = MathUtil.Sum(a, b);

//assert
Assert.AreEqual(expectedSum, actualSum);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Code language: C# (cs)

The algorithm in this article took 1800 milliseconds, whereas the BigInteger approach took 4500 milliseconds, which means our algorithm is 2.5x faster than the BigInteger approach. It’s easier to use BigInteger, but it’s slower.