EF Core – Database schema changes

Anytime you change the definition of the database – from renaming a column to creating a table – it’s referred to as a database schema change. With EF Core, you deal with database schema changes by using migrations.

When you first create the database with EF Core, you create a migration that contains the initial definition of the database. As you make schema changes, you add new migrations and apply them on top of the existing migrations.

In this article, I’ll show examples of going through the database schema change process in a few different scenarios, including error scenarios that require customizing the migration.

Note: I’ll be using the dotnet ef tool for handling migrations. I’ll use the command line – dotnet ef database update – for applying the migration. I’m working in a dev environment, so this is for simplicity and brevity.

Database schema change process

The following is the list of steps involved in the database schema change process:

  • Make the schema change in the code.
  • Create a new migration.
  • Verify the correctness of the generated migration source code.
  • If there are problems with the migration:
    • Split schema changes into smaller migrations -OR- customize the migration to correct the problems.
  • Apply the migration programmatically (or with the dotnet ef tool).
  • Verify correctness in the database.

Now I’ll show examples of doing this process in various scenarios.

Example – Adding a new table

Let’s say you want to add a new table called Shows.

First, add a new model class called Show:

using System.ComponentModel.DataAnnotations;

public class Show
{
	[Key]
	public int Id { get; set; }

	[Required]
	[MaxLength(500)]
	public string Name { get; set; }

	[Required]
	[MaxLength(500)]
	public string Description { get; set; }

	[Required]
	public int NumberOfEpisodes { get; set; }

	[Required]
	public int NumberOfSeasons { get; set; }
	
	[Required]
	public int FirstYear { get; set; }
	
	public int? LastYear { get; set; }
}
Code language: C# (cs)

Then add a DbSet<Show> property to your DbContext class:

public class StreamingServiceContext : DbContext
{
	private readonly string ConnectionString;
	public StreamingServiceContext(string connectionString)
	{
		ConnectionString = connectionString;
	}
	protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
	{
		optionsBuilder.UseSqlServer(ConnectionString);
	}
	public DbSet<Movie> Movies { get; set; }
	public DbSet<Show> Shows { get; set; }
}
Code language: C# (cs)

Create the migration:

dotnet ef migrations add Database_v1
Code language: PowerShell (powershell)

Take a look at the generated migration source code in <timestamp>_Database_v1.cs and verify correctness:

public partial class Database_v1 : Migration
{
	protected override void Up(MigrationBuilder migrationBuilder)
	{
		migrationBuilder.CreateTable(
			name: "Shows",
			columns: table => new
			{
				Id = table.Column<int>(type: "int", nullable: false)
					.Annotation("SqlServer:Identity", "1, 1"),
				Name = table.Column<string>(type: "nvarchar(500)", maxLength: 500, nullable: false),
				Description = table.Column<string>(type: "nvarchar(500)", maxLength: 500, nullable: false),
				NumberOfEpisodes = table.Column<int>(type: "int", nullable: false),
				NumberOfSeasons = table.Column<int>(type: "int", nullable: false),
				FirstYear = table.Column<int>(type: "int", nullable: false),
				LastYear = table.Column<int>(type: "int", nullable: true)
			},
			constraints: table =>
			{
				table.PrimaryKey("PK_Shows", x => x.Id);
			});
	}
	//Down() not shown
}
Code language: C# (cs)

It looks correct, so apply the migration:

dotnot ef database update
Code language: PowerShell (powershell)

Now you’ll see the Shows table in the database. It’s a good idea to double-check the table definition in the database after the migration is applied.

Example – Incorrect migration leading to data loss and how to fix it

Always check the generated migration source code. This cannot be repeated enough. The migration generator is not perfect, as I will show below. Therefore, it’s always necessary to double-check the migration code.

Let’s say you have a Movies table and want to apply the following schema changes:

  • Column name change.
  • Add a new column.
  • Drop a column.

Let’s see what happens if you try to apply all of these changes at the same time.

First, apply the schema changes to the Movie model:

public class Movie
{
	[Key]
	public int Id { get; set; }

	[Required]
	[MaxLength(500)]
	public string Name { get; set; }

	[Required]
	public int YearOfRelease { get; set; } //changed name

	[Required]
	[MaxLength(500)]
	public string Description { get; set; }

	//dropped column
	//[Required]
	//public int RuntimeMinutes { get; set; }

	[MaxLength(100)]
	public string Director { get; set; }

	[Required]
	public decimal BoxOfficeRevenue { get; set; } //added new column
}
Code language: C# (cs)

Then generate the migration:

dotnet ef migrations add Database_v3
Code language: PowerShell (powershell)

You may notice the following warning (big red flag):

An operation was scaffolded that may result in the loss of data. Please review the migration for accuracy.Code language: plaintext (plaintext)

Take a look at the generated migration source code in <timestamp>_Database_v3.cs, and pay close attention to the highlighted parts:

public partial class Database_v3 : Migration
{
	protected override void Up(MigrationBuilder migrationBuilder)
	{
		migrationBuilder.DropColumn(
			name: "ReleaseYear",
			table: "Movies");

		migrationBuilder.RenameColumn(
			name: "RuntimeMinutes",
			table: "Movies",
			newName: "YearOfRelease");

		migrationBuilder.AddColumn<decimal>(
			name: "BoxOfficeRevenue",
			table: "Movies",
			type: "decimal(18,2)",
			nullable: false,
			defaultValue: 0m);
	}
	
	//not showing Down()
}
Code language: C# (cs)

It dropped the wrong column – ReleaseYear instead of RuntimeMinutes – and it renamed the wrong column – RuntimeMinutes instead of ReleaseYear.

Since the migration is wrong, remove it:

dotnet ef migrations remove
Code language: PowerShell (powershell)

If your table had existing data in it, and you actually applied this migration, you’d have irreparable data loss. You’d end up with a column called YearOfRelease that has the RuntimeMinutes data in it.

This irreparable data loss may not be a big deal in a dev environment, but remember that you’ll eventually be applying the migration in a production environment. This is why you need to always double-check the generated migration code before applying it.

To prevent data loss, minimize the number of schema changes per migration

As shown above, generated migrations can be totally wrong and lead to data loss.

A simple solution is to create multiple, small migrations. Instead of trying to combine many schema changes into one migration, only include schema changes that can go together. You can figure out which changes can go together by trial and error.

The following example shows this approach.

Small migration 1

Let’s try to combine these two schema changes:

  • Column name change.
  • Add a new column.

First, make the changes in the Movie model:

public class Movie
{
	[Key]
	public int Id { get; set; }

	[Required]
	[MaxLength(500)]
	public string Name { get; set; }

	[Required]
	public int YearOfRelease { get; set; } //changed name

	[Required]
	[MaxLength(500)]
	public string Description { get; set; }

	[Required]
	public int RuntimeMinutes { get; set; }

	[MaxLength(100)]
	public string Director { get; set; }

	[Required]
	public decimal BoxOfficeRevenue { get; set; } //added new column
}
Code language: C# (cs)

Then generate the migration:

dotnet ef migrations add Database_v3
Code language: PowerShell (powershell)

Take a look at the generated migration source code in <timestamp>_Database_v3.cs:

public partial class Database_v3 : Migration
{
	protected override void Up(MigrationBuilder migrationBuilder)
	{
		migrationBuilder.RenameColumn(
			name: "ReleaseYear",
			table: "Movies",
			newName: "YearOfRelease");

		migrationBuilder.AddColumn<decimal>(
			name: "BoxOfficeRevenue",
			table: "Movies",
			type: "decimal(18,2)",
			nullable: false,
			defaultValue: 0m);
	}
//Down() not shown
}
Code language: C# (cs)

This time the migration source code is correct. It’s renaming the ReleaseYear column to YearOfRelease, and adding the new BoxOfficeRevenue column.

Small migration 2

The remaining schema change we need to do is Drop a column.

Apply this change in the Movie model:

public class Movie
{
	[Key]
	public int Id { get; set; }

	[Required]
	[MaxLength(500)]
	public string Name { get; set; }

	[Required]
	public int YearOfRelease { get; set; }

	[Required]
	[MaxLength(500)]
	public string Description { get; set; }

	//dropped column
	//[Required]
	//public int RuntimeMinutes { get; set; }

	[MaxLength(100)]
	public string Director { get; set; }

	[Required]
	public decimal BoxOfficeRevenue { get; set; }
}
Code language: C# (cs)

Then generate a new migration:

dotnet ef migrations add Database_v4
Code language: PowerShell (powershell)

Take a look at the generated migration source code <timestamp>_Database_v4.cs:

public partial class Database_v4 : Migration
{
	protected override void Up(MigrationBuilder migrationBuilder)
	{
		migrationBuilder.DropColumn(
			name: "RuntimeMinutes",
			table: "Movies");
	}

	//Down() not shown
}
Code language: C# (cs)

This is correct. It’s dropping the RuntimeMinutes column.

Apply the two pending migrations

The two small migrations were created and verified for correctness. They are both pending migrations.

Take a look at the list of migrations:

dotnet ef migrations list
Code language: PowerShell (powershell)
20210314133726_Database_v0
20210315113855_Database_v1
20210316112804_Database_v2
20210316123742_Database_v3 (Pending)
20210316124316_Database_v4 (Pending)
Code language: plaintext (plaintext)

Now apply the two pending migrations:

dotnet ef database update
Code language: PowerShell (powershell)

Notice that this applied both migrations

Applying migration '20210316123742_Database_v3'.
Applying migration '20210316124316_Database_v4'.Code language: plaintext (plaintext)

Because the schema changes were done separately, the generated migrations were correct, and there was no data loss.

What happens when you change a table that has data?

There are many different schema changes you can do. Most of them can be applied to tables with existing data without problems. Don’t risk it though – always test your migrations in a dev environment against tables that have data in them.

With that said, there are some schema changes that will fail to apply to a table with data in them. When you run into this situation, you may be able to customize the migration to solve the problem. I’ll show two examples of this below.

Changing a nullable column to not nullable

When you try to change a nullable column to not allow nulls, and the table already has NULL values in that column, then the generated migration won’t handle it properly. You’ll run into this error:

Cannot insert the value NULL into column ‘Director’, table ‘StreamingService.dbo.Movies’; column does not allow nulls. UPDATE fails.

To solve this problem, you can customize the migration by updating nulls to a default value before it does the column change. I’ll show a full example of this scenario below.

Example of how to make a nullable not allow nulls

The Director column is currently nullable. To make it not nullable, add the [Required] attribute:

public class Movie
{
	[Key]
	public int Id { get; set; }

	[Required]
	[MaxLength(500)]
	public string Name { get; set; }

	[Required]
	public int YearOfRelease { get; set; }

	[Required]
	[MaxLength(500)]
	public string Description { get; set; }

	[Required] //required = doesn't allow nulls
	[MaxLength(100)]
	public string Director { get; set; }

	[Required]
	public decimal BoxOfficeRevenue { get; set; }
}
Code language: C# (cs)

Create the migration for this change:

dotnet ef migrations add Database_v5
Code language: PowerShell (powershell)

Take a look at the generated migration source code in <timestamp>_Database_v5.cs and customize it by executing an UPDATE statement with migrationBuilder.Sql():

public partial class Database_v5 : Migration
{
	protected override void Up(MigrationBuilder migrationBuilder)
	{
		migrationBuilder.Sql(@"UPDATE Movies SET Director = '' WHERE Director IS NULL");

		migrationBuilder.AlterColumn<string>(
			name: "Director",
			table: "Movies",
			type: "nvarchar(100)",
			maxLength: 100,
			nullable: false,
			defaultValue: "",
			oldClrType: typeof(string),
			oldType: "nvarchar(100)",
			oldMaxLength: 100,
			oldNullable: true);
	}

	//Down() not shown
}
Code language: C# (cs)

Apply the migration:

dotnet ef database update
Code language: PowerShell (powershell)

This was able to apply the migration without problems, because it replaced all nulls with empty strings, and then changed the column to not allow nulls.

Reducing the length of a string column

Let’s say you have existing strings that are 50 characters long, and you want to change the maximum length of this string column to 40 characters. When you try to apply this schema change, you’ll run into the following error:

String or binary data would be truncated.

First, make sure you’re OK with truncating the existing data in the table.

You can solve this problem by customizing the migration to truncate the string column before it does the column change.

Disclaimer: Doing this will result in data loss because you’ll be intentionally truncating the string column. Don’t do this if you don’t want to lose data.

Example of reducing the length of a string column

First, change the [MaxLength] attribute for the Description column:

public class Movie
{
	[Key]
	public int Id { get; set; }

	[Required]
	[MaxLength(500)]
	public string Name { get; set; }

	[Required]
	public int YearOfRelease { get; set; }

	[Required]
	[MaxLength(30)] //reduced from 500 to 30
	public string Description { get; set; }

	[Required]
	[MaxLength(100)]
	public string Director { get; set; }

	[Required]
	public decimal BoxOfficeRevenue { get; set; }
}
Code language: C# (cs)

Then create a new migration:

dotnet ef migrations add Database_v6

Now take a look at the generated migration source code in <timestamp>_Database_v6.cs and customize it by executing an UPDATE statement with migrationBuilder.Sql():

public partial class Database_v6 : Migration
{
	protected override void Up(MigrationBuilder migrationBuilder)
	{
		migrationBuilder.Sql(@"UPDATE Movies SET Description = LEFT(Description, 30) WHERE LEN(Description) > 30");

		migrationBuilder.AlterColumn<string>(
			name: "Description",
			table: "Movies",
			type: "nvarchar(30)",
			maxLength: 30,
			nullable: false,
			oldClrType: typeof(string),
			oldType: "nvarchar(500)",
			oldMaxLength: 500);
	}

	//Down() not shown
}
Code language: C# (cs)

Apply the migration:

dotnet ef database update
Code language: PowerShell (powershell)

This successfully applied the migration by first truncating the string column to the reduced length, then it changed the length of the column.

2 thoughts on “EF Core – Database schema changes”

  1. Hi. How do you migrate your database model from TPH (as in EF Core 3.1) to TPT (as in EF Core 6) where data from columns must be transferred to the child entity table columns?

    Reply
    • Hi Pieterjan,

      Concrete example data and definition
      Let’s start by defining a concrete example.
      1. public class abstract People { Id, Name }
      2. public class Programmer : People { Language }
      3. public class Driver : People { Car }

      Let’s add some sample data. Let’s say we have two people:
      1. Programmer { Id = 1, Name = “Bob”, Language = “C#” }
      2. Driver { Id = 2 Name = “Alice”, Car = “Honda” }

      Using the Table-per-Hierarchy (TPH) inheritance approach, we have a single table that has base class columns + subclass columns + a discriminator field (implicit):
      Id, Name, Discriminator, Language, Car
      1, Bob, Programmer, C#, null
      2, Alice, Driver, null, Honda

      You want to convert this to use the Table-per-Type (TPT) inheritance approach. In this approach, we’ll have three tables: People, Programmers, and Drivers.
      table People
      Id, Name
      1, Bob
      2, Alice

      table Programmers
      FK Id, Language
      1, C#

      table Drivers
      FK Id, Car
      2, Honda

      Now that we have a concrete example, and understand what the database looks right now (TPH) and what we want it to look like (TPT), we can do the migration step by step using the approach outlined in this article.

      Step 1 – Make the schema change in the code

      We need to convert to the TPT inheritance style. One way to do that is in the context.OnModelCreating():
      protected override void OnModelCreating(ModelBuilder modelBuilder)
      {
      modelBuilder.Entity<People>().ToTable("People");
      modelBuilder.Entity<Programmer>().ToTable("Programmers");
      modelBuilder.Entity<Driver>().ToTable("Drivers");
      }

      Step 2 – Create a new migration
      Execute the command to create a new migration: dotnet ef migrations add Convert_To_TPT

      This will generate a migration file like: /Migrations/20211206215242_Convert_To_TPT.cs

      Step 3 – Look at the generated migration file and modify it
      The generated migration file will definitely be wrong. The migrate Up() method will be doing the following:
      1. Drop the Programmer + Driver columns from the People table
      2. Create the Driver table
      3. Create the Programmer table

      We can use this as a starting point and modify it. Here’s what we need to do:
      1. Create the Driver table (existing – move up)
      2. Create the Programmer table (existing – move up)
      3. Copy data from the People table to the Driver table (new – manual SQL)

      migrationBuilder.Sql(@"INSERT INTO Drivers (Id, Car) SELECT Id, Car FROM People WHERE Discriminator='Driver'");

      4. Copy data from the People table to the Programmer table (new – manual SQL)

      migrationBuilder.Sql(@"INSERT INTO Programmers (Id, Language) SELECT Id, Language FROM People WHERE Discriminator='Programmer'");

      5. Drop the subclass columns from the People table (existing – move down)

      Step 4 – Apply the migration
      Warning: Be sure to try this is a test/dev environment first so you can verify correctness without the risk of losing production data

      Execute the command to apply the migration: dotnet ef database update

      The migration runs in a SQL transaction, so if there are any problems, it’ll roll back. If there’s no problems with applying the migration, go to the next step.

      Step 5 – Verify correctness in the database

      Look at the database to verify correctness. I see the following:
      table People
      Id, Name
      1, Bob
      2, Alice

      table Programmers
      FK Id, Language
      1, C#

      table Drivers
      FK Id, Car
      2, Honda

      This matches our desired outcome (mentioned at the top), so the migration is correct.

      This is a simplified example. If you run into problems while implementing the actual migration, feel free to let me know if you want help.

      NOTE: If you need to support undoing migrations, implement Down() as well. It will basically be TPT-to-TPH, which is undoing what migrate Up() is doing here. I’ll leave that as an exercise for the reader.

      Reply

Leave a Reply to Pieterjan Cancel reply