C# – How to use SqlBulkCopy to do a Bulk Insert

If you find yourself doing SQL Inserts in a loop, you may want to consider doing a Bulk Insert instead. You can use SqlBulkCopy to do a Bulk Insert from C#.

Make sure to prove you have a performance problem before switching your code to use Bulk Insert. Like any optimization, it adds complexity.

In this article, I’ll show how to use SqlBulkCopy. At the end, I’ll show a performance comparison between doing multiple inserts vs one Bulk Insert.

Using SqlBulkCopy

The simplest way to use SqlBulkCopy is to pass it a populated DataTable, like this:

public static void BulkInsert(DataTable table)
{
	using (var bulkInsert = new SqlBulkCopy(GetConnectionString()))
	{
		bulkInsert.DestinationTableName = table.TableName;
		bulkInsert.WriteToServer(table);
	}
}

Populating a DataTable

To populate a DataTable, you need to specify the column names and types, and then add rows, like this:

DataTable table = new DataTable();
table.TableName = "People";

table.Columns.Add(nameof(Person.Name), typeof(string));
table.Columns.Add(nameof(Person.BirthDate), typeof(DateTime));

foreach (var person in people)
{
	var row = table.NewRow();
	row[nameof(Person.Name)] = person.Name;
	row[nameof(Person.BirthDate)] = person.BirthDate ?? (object)DBNull.Value;
	table.Rows.Add(row);
}

This can get tedious if you have lots of columns. You may want to consider generalizing this and using reflection to automatically populate a DataTable from a collection of objects. Keep it simple and do what works for you.
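As one way to generalize this, here's a minimal sketch of a reflection-based helper. The `DataTableBuilder` and `ToDataTable` names are my own invention (not part of any library), and the `Person` class shown matches the properties used elsewhere in this article. It unwraps `Nullable<T>` property types (a `DataTable` column can't be a nullable type) and converts nulls to `DBNull.Value`:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

public class Person
{
	public Guid Id { get; set; }
	public string Name { get; set; }
	public DateTime? BirthDate { get; set; }
	public string FavoriteMovie { get; set; }
	public int? FavoriteNumber { get; set; }
}

public static class DataTableBuilder
{
	// Builds a DataTable whose columns match T's public properties.
	public static DataTable ToDataTable<T>(IEnumerable<T> items, string tableName)
	{
		var table = new DataTable(tableName);
		PropertyInfo[] props = typeof(T).GetProperties();

		foreach (var prop in props)
		{
			// Unwrap Nullable<T> so the column gets the underlying type
			// (e.g. int? becomes an int column that allows DBNull).
			var colType = Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType;
			table.Columns.Add(prop.Name, colType);
		}

		foreach (var item in items)
		{
			var row = table.NewRow();
			foreach (var prop in props)
			{
				// Null property values become DBNull.Value in the row.
				row[prop.Name] = prop.GetValue(item) ?? DBNull.Value;
			}
			table.Rows.Add(row);
		}

		return table;
	}
}
```

Reflection adds a small per-row cost, so measure it if you're chasing performance, but for a bulk insert the database round trips dominate.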

Two things to watch out for

  • Nulls: If a value is null, you need to put DBNull.Value in the DataTable instead of a regular null.
  • Min DateTime: You can’t pass in DateTime.MinValue. If you do, you’ll get the following exception:

System.Data.SqlTypes.SqlTypeException: SqlDateTime overflow. Must be between 1/1/1753 12:00:00 AM and 12/31/9999 11:59:59 PM.

Instead of DateTime.MinValue, pass in System.Data.SqlTypes.SqlDateTime.MinValue.Value.
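Both rules can be handled in one small helper. This is a sketch (the `SqlValueHelper`/`ToDbValue` names are made up for this example), and clamping out-of-range dates up to SqlDateTime.MinValue is just one policy — depending on your data you might prefer to store NULL or throw instead:

```csharp
using System;
using System.Data.SqlTypes;

public static class SqlValueHelper
{
	// Converts a C# value into something a DataTable/SqlBulkCopy accepts:
	// - null becomes DBNull.Value
	// - DateTime values below SQL Server's datetime minimum (1753-01-01)
	//   are clamped to SqlDateTime.MinValue.Value
	public static object ToDbValue(object value)
	{
		if (value == null)
			return DBNull.Value;

		if (value is DateTime dt && dt < SqlDateTime.MinValue.Value)
			return SqlDateTime.MinValue.Value;

		return value;
	}
}
```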

Performance comparison: Multiple Inserts vs one Bulk Insert

To compare the performance, I’m going to insert 100,000 records into the People table. This table is defined in the section below. It has a primary key and two indexes.

I’ll use a Stopwatch in the code and SQL Profiler to compare the CPU time and the number of IO reads/writes.

In between each test run, I’ll clear out the table with DELETE FROM.

People table definition

Here is the People table definition:

CREATE TABLE [dbo].[People](
	[Id] [nvarchar](50) NOT NULL,
	[Name] [nvarchar](50) NOT NULL,
	[BirthDate] [datetime] NULL,
	[FavoriteMovie] [nvarchar](50) NULL,
	[FavoriteNumber] [int] NULL,
 CONSTRAINT [PK_People] PRIMARY KEY CLUSTERED
(
	[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

It has the following two indexes:

CREATE NONCLUSTERED INDEX [IX_BirthDate] ON [dbo].[People]
(
	[BirthDate] ASC
)
INCLUDE([Name],[Id])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_FavMovie] ON [dbo].[People]
(
	[FavoriteMovie] ASC
)
INCLUDE([Id],[Name])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Generating the test data

I’m using the following method to generate the list of 100,000 people:

private static List<Person> GenerateRecords()
{
	var people = new List<Person>();

	for (int i = 0; i < 100_000; i++)
	{
		DateTime? birthDate = DateTime.Now;
		string favoriteMovie = $"Movie{i}";
		int? favoriteNumber = i;

		if (i % 10 == 0)
		{
			birthDate = null;
			favoriteMovie = null;
			favoriteNumber = null;
		}

		people.Add(new Person()
		{
			BirthDate = birthDate,
			FavoriteMovie = favoriteMovie,
			FavoriteNumber = favoriteNumber,
			Id = Guid.NewGuid(),
			Name = $"Person{i}"
		});
	}

	return people;
}

Performance Test – Multiple Inserts

This is inserting 100,000 records into the People table. It’s using Dapper to insert the list of people.

static void Main(string[] args)
{
	List<Person> people = GenerateRecords();

	Stopwatch sw = new Stopwatch();
	sw.Start();

	MultipleInserts(people);

	sw.Stop();
	Console.WriteLine($"Elapsed={sw.Elapsed}. TotalMS={sw.Elapsed.TotalMilliseconds}");
}

public static void MultipleInserts(List<Person> people)
{
	using (var con = new SqlConnection(GetConnectionString()))
	{
		con.Execute(@"INSERT INTO [dbo].[People]
					([Id]
					,[Name]
					,[BirthDate]
					,[FavoriteMovie]
					,[FavoriteNumber])
				VALUES
					(@Id
					,@Name
					,@BirthDate
					,@FavoriteMovie
					,@FavoriteNumber)",
			people);
	}
}

This is executing 100,000 individual INSERT statements. Passing the people list to Dapper is the same as looping over the people list yourself and calling Execute() for each person object. You can confirm this by running SQL Profiler and observing that it’s doing 100,000 individual inserts.

Results

I measured the performance with a Stopwatch and with SQL Profiler.

Stopwatch results:

Elapsed=00:00:33.6690405. TotalMS=33669.0405

SQL Profiler results:

| CPU time (ms) | Page reads (IO) | Page writes (IO) | Duration (ms) |
| --- | --- | --- | --- |
| 11,600 | 1,362,901 | 10,371 | 33,463 |

Performance Test – Bulk Insert

This is using SqlBulkCopy to execute a single Bulk Insert. It’s inserting all 100,000 records at once.

static void Main(string[] args)
{
	List<Person> people = GenerateRecords();

	Stopwatch sw = new Stopwatch();
	sw.Start();

	BulkInsert(people);

	sw.Stop();
	Console.WriteLine($"Elapsed={sw.Elapsed}. TotalMS={sw.Elapsed.TotalMilliseconds}");
}

public static void BulkInsert(List<Person> people)
{
	DataTable table = new DataTable();
	table.TableName = "People";

	table.Columns.Add(nameof(Person.Id), typeof(string));
	table.Columns.Add(nameof(Person.Name), typeof(string));
	table.Columns.Add(nameof(Person.BirthDate), typeof(DateTime));
	table.Columns.Add(nameof(Person.FavoriteMovie), typeof(string));
	table.Columns.Add(nameof(Person.FavoriteNumber), typeof(int));

	foreach (var person in people)
	{
		var row = table.NewRow();
		row[nameof(Person.Id)] = person.Id.ToString();
		row[nameof(Person.Name)] = person.Name;
		row[nameof(Person.BirthDate)] = GetDBValue(person.BirthDate);
		row[nameof(Person.FavoriteMovie)] = GetDBValue(person.FavoriteMovie);
		row[nameof(Person.FavoriteNumber)] = GetDBValue(person.FavoriteNumber);
		table.Rows.Add(row);
	}

	using (var bulkInsert = new SqlBulkCopy(GetConnectionString()))
	{
		bulkInsert.DestinationTableName = table.TableName;
		bulkInsert.WriteToServer(table);
	}
}

public static object GetDBValue(object o)
{
	return o ?? (object)DBNull.Value;
}

Results

I measured the performance with a Stopwatch and with SQL Profiler.

Stopwatch results:

Elapsed=00:00:02.0443607. TotalMS=2044.3607

SQL Profiler results:

| CPU time (ms) | Page reads (IO) | Page writes (IO) | Duration (ms) |
| --- | --- | --- | --- |
| 1,156 | 669,996 | 6,805 | 1,603 |

Conclusion

Bulk Insert performed much, much better than multiple inserts, as the following summary table shows:

| | Stopwatch (ms) | CPU time (ms) | Page reads (IO) | Page writes (IO) | Duration (ms) |
| --- | --- | --- | --- | --- | --- |
| Multiple Inserts | 33,669 | 11,600 | 1,362,901 | 10,371 | 33,463 |
| Bulk Insert | 2,044 | 1,156 | 669,996 | 6,805 | 1,603 |

Bulk Insert was 15-20 times faster, used about 10x less CPU time, and did about half as many IO operations.

The more records you’re trying to insert, the better Bulk Insert performs compared to multiple inserts. I ran another performance test using just 1,000 records, and Bulk Insert was about 2x faster.

And this only compares the insert operation in isolation. When you consider the system as a whole, those extra CPU cycles and IO operations degrade everything else running against the database. That’s where Bulk Insert really shines.
