Problem statement: Sum two big integers that are passed in as strings. Return the sum as a string. In other words, implement the following method:
string Sum(string a, string b)
Code language: C# (cs)
Constraint: Don’t use the built-in BigInteger class (note: this is the name in C# and may have a different name in other languages). Do it the hard way instead. If you want to get physically strong, you have to lift heavy weights. If you want to strengthen your algorithm problem solving skills, you have to challenge yourself and do things the hard way.
Note: For simplicity, the strings passed in will always be valid integers >= 0.
Table of Contents
Approach
Why is it difficult to add two big integers?
This problem may seem simple at first. How can it be hard to add two numbers?
First, the longest number you can store is 18446744073709551615 (2^64 – 1) – using a 64-bit unsigned integer (ulong).
Second, when you’re adding big integers, if the sum is larger than the largest number you can store, then it’ll overflow. Here’s an example:
ulong c = 18446744073709551615;
ulong d = 1;
var sum = c + d;
Console.WriteLine(sum);
Code language: C# (cs)
This outputs 0 because of integer overflow.
Third, we’ve given ourselves the constraint of not using BigInteger. If I had to solve this problem in a real world project, I would most likely use BigInteger (unless I needed to optimize for performance), like this:
BigInteger a = BigInteger.Parse("18446744073709551615");
BigInteger b = BigInteger.Parse("18446744073709551615");
var sum = (a + b).ToString();
Console.WriteLine(sum);
Code language: C# (cs)
This outputs 36893488147419103230 as expected.
Using BigInteger makes this problem trivial and eliminates the challenge.
How do you add numbers manually?
Write 1234 + 5678 on paper and solve it. Chances are you did the same approach I was taught in school.
First, line up the two numbers:
1234
+ 5678
--------
Add the right-most digits together: 4 + 8 = 12.
Since 12 is greater than 9, and we can only have single digits in the result, we have carry a 1 to the left digits and subtract 10 from 12 = 2 and leave this under the right-most digits.
Now move to the digits to the left.
1
1234
+ 5678
--------
2
We have 3 + 7 + 1 (remember we carried a 1 from adding 8 + 4) = 11.
11 is greater than 9, so we have to carry a 1 to the left again, and subtract 10 from 11, leaving a 1 for the second digit.
Now move to the digits to the left.
1
1234
+ 5678
--------
12
We have 2 + 6 + 1 (the carried digit) = 9. Since this is a single digit, we don’t need to carry, and can simply put 9 for the third digit.
Move to the final digits on the left.
1234
+ 5678
--------
912
Add 5 + 1 = 6.
1234
+ 5678
--------
6912
This manual approach is an algorithm and can be represented by the following pseudocode:
sum = ""
carry = 0
loop from right to left using loop variable i
digitSum = a[i] + b[i] + carry
if digitSum is greater than 9
carry = 1
digitSum = digitSum - 10
else
carry = 0
prepend sum with digitSum
Code language: plaintext (plaintext)
In the next section, we’ll add more test cases and refine the pseudocode.
Test cases
When developing an algorithm, it helps to start with at least one test case. From there, you can add more test cases to refine the algorithm and make sure it works in general.
The first test case was shown above: When given “1234” and “5678”, expect the sum to be “6912”.
Here are more test cases:
Input | Expected sum |
---|---|
“0” and “0” | “0” This tests the lower bound of the input range. |
“18446744073709551615” and “18446744073709551615” | “36893488147419103230” This is an upper bound test and proves the code can handle integers larger than an 64-bit unsigned integer (ulong) can hold. Without this test, all the other tests could pass with the code internally doing: return Convert.ToUInt64(a) + Convert.ToUInt64(b) |
“10” and “1” | “11” This tests that the code can handle input of different lengths. If the pseudocode (as presented above) were implemented and passed this input, it would throw IndexOutOfRangeException. |
“9” and “1” | “10” This tests what happens when a carry digit is left after you’ve looped over all the other digits. If the pseudocode (as presented above) were implemented and passed this, it would return “0”, because the carry digit from adding 9+1 would be lost. |
The pseudocode needs to be updated to handle some of these test cases we added.
sum = ""
carry = 0
pad a and b with 0's so they are the same length
loop from right to left using loop variable i
digitSum = a[i] + b[i] + carry
if digitSum is greater than 9
carry = 1
digitSum = digitSum - 10
else
carry = 0
prepend sum with digitSum
if carry is 1
prepend sum with carry
Code language: plaintext (plaintext)
Code
First, all of the test cases can be written in a single parameterized unit test:
[DataRow("0", "0", "0")]
[DataRow("1234", "5678", "6912")]
[DataRow("18446744073709551615", "18446744073709551615", "36893488147419103230")]
[DataRow("10", "1", "11")]
[DataRow("9", "1", "10")]
[TestMethod()]
public void SumTest(string a, string b, string expectedSum)
{
//act
var actualSum = MathUtil.Sum(a, b);
//assert
Assert.AreEqual(expectedSum, actualSum);
}
Code language: C# (cs)
Here’s the code that implements the algorithm. This is almost a 1-to-1 translation from the pseudocode to C# code:
public static string Sum(string a, string b)
{
var sum = new StringBuilder();
int carry = 0;
if (a.Length != b.Length)
{
var maxLength = Math.Max(a.Length, b.Length);
a = a.PadLeft(maxLength, '0');
b = b.PadLeft(maxLength, '0');
}
for (int i = a.Length - 1; i >= 0; i--)
{
var digitSum = (a[i] - '0') + (b[i] - '0') + carry;
if (digitSum > 9)
{
carry = 1;
digitSum -= 10;
}
else
{
carry = 0;
}
sum.Insert(0, digitSum);
}
if (carry == 1)
sum.Insert(0, carry);
return sum.ToString();
}
Code language: C# (cs)
A few notes:
- Because the input is passed in as strings, each digit is a character. To get the integer digit, you do this:
(a[i] - '0')
Code language: JavaScript (javascript)
- This is using StringBuilder to avoid appending strings together inside the loop (for performance reasons).
Performance vs BigInteger approach
Using BigInteger is the easy way to solve this problem. It reduces down a to single line:
public static string Sum(string a, string b)
{
return (BigInteger.Parse(a) + BigInteger.Parse(b)).ToString();
}
Code language: C# (cs)
I used the following code to compare the performance using strings with 100,001 digits:
var sw = new Stopwatch();
sw.Start();
var a = "1" + new string('0', 100000);
var b = "1" + new string('0', 100000);
var expectedSum = "2" + new string('0', 100000);
//act
var actualSum = MathUtil.Sum(a, b);
//assert
Assert.AreEqual(expectedSum, actualSum);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Code language: C# (cs)
The algorithm in this article took 1800 milliseconds, whereas the BigInteger approach took 4500 milliseconds, which means our algorithm is 2.5x faster than the BigInteger approach. It’s easier to use BigInteger, but it’s slower.