EF Core – Aggregate SELECT queries

In this article, I’ll show how to use EF Core to aggregate data for the whole table, per group, and how to only include groups that meet a condition. I’ll show three different SQL aggregate functions – count, sum, and average.

In each scenario, I’ll show the LINQ query, the SQL query it generated, and the results of executing the query. I always use SQL Profiler to examine the generated SQL query and verify it for correctness, and I’d recommend you to do the same.

Movies table sample data

In all of the examples, I’ll be using the following (very small) set of movie data.

MovieYearOfReleaseBoxOfficeRevenue
Office Space1999$12.2 million
Mad Max: Fury Road2015$375.4 million
John Wick2014$86 million
The Matrix1999$463.5 million
The Big Short2015$133.4 million
The Revenant2015$533 million
Godzilla2014$529 million
Interstellar2014$701.8 million

SELECT COUNT()

Select the number of movies.

Total count

Select the total number of movies in the Movies table:

using (var context = new StreamingServiceContext(connectionString))
{
	var count = await context.Movies.CountAsync();

	Console.WriteLine($"There are {count} movies");
} 
Code language: C# (cs)

This generates the following SQL query:

SELECT COUNT(*)
FROM [Movies] AS [m]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

There are 8 moviesCode language: plaintext (plaintext)

Count per group

Select the number of movies made per year:

using (var context = new StreamingServiceContext(connectionString))
{
	var countPerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new { Year = movieGroup.Key, Count = movieGroup.Count() })
		.ToListAsync();

	foreach(var movieGroup in countPerGroup)
	{
		Console.WriteLine($"Year {movieGroup.Year} has {movieGroup.Count} movie(s)");
	}
}   
Code language: C# (cs)

This generates the following SQL query with a GROUP BY:

SELECT [m].[YearOfRelease] AS [Year], COUNT(*) AS [Count]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

Year 1999 has 2 movie(s)
Year 2014 has 3 movie(s)
Year 2015 has 3 movie(s)Code language: plaintext (plaintext)

Count per group having condition

Select the number of movies per year, but only include years that had more than two movies:

using (var context = new StreamingServiceContext(connectionString))
{
	var countPerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new { Year = movieGroup.Key, Count = movieGroup.Count() })
		.Where(movieGroup => movieGroup.Count > 2)
		.ToListAsync();

	foreach (var movieGroup in countPerGroup)
	{
		Console.WriteLine($"Year {movieGroup.Year} has {movieGroup.Count} movie(s). ");
	}
}
Code language: C# (cs)

This generates the following SQL query with a GROUP BY HAVING:

SELECT [m].[YearOfRelease] AS [Year], COUNT(*) AS [Count]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
HAVING COUNT(*) > 2
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

Year 2014 has 3 movie(s).
Year 2015 has 3 movie(s).Code language: plaintext (plaintext)

Notice that it filtered out years where less than three movies were made.

SELECT SUM()

Select the sum of box office revenue for movies.

Total sum

Select the total box office revenue for all movies in the Movies table:

using (var context = new StreamingServiceContext(connectionString))
{
	var sumOfRevenue = await context.Movies.SumAsync(t=>t.BoxOfficeRevenue);

	Console.WriteLine($"The movies made {sumOfRevenue:C} total");
}    
Code language: C# (cs)

This generates the following SQL query:

SELECT COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0)
FROM [Movies] AS [m]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

The movies made $2,354,189,870.00 totalCode language: plaintext (plaintext)

Sum per group

Select the total box office revenue per year:

using (var context = new StreamingServiceContext(connectionString))
{
	var sumPerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new 
		{ 
			Year = movieGroup.Key, 
			SumOfRevenue = movieGroup.Sum(t => t.BoxOfficeRevenue) 
		})
		.ToListAsync();

	foreach (var movieGroup in sumPerGroup)
	{
		Console.WriteLine($"Movies in year {movieGroup.Year} made {movieGroup.SumOfRevenue:C}");
	}
}
Code language: C# (cs)

This generates the following SQL query with a GROUP BY:

SELECT [m].[YearOfRelease] AS [Year], COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) AS [SumOfRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

Movies in year 1999 made $475,700,000.00
Movies in year 2014 made $1,316,800,000.00
Movies in year 2015 made $1,041,800,000.00Code language: plaintext (plaintext)

Sum per group having condition

Select total box office revenue per year, excluding years with less than $1 billion in revenue:

using (var context = new StreamingServiceContext(connectionString))
{
	var sumPerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new 
		{ 
			Year = movieGroup.Key, 
			SumOfRevenue = movieGroup.Sum(t => t.BoxOfficeRevenue) 
		})
		.Where(movieGroup => movieGroup.SumOfRevenue >= 1_000_000_000)
		.ToListAsync();

	foreach (var movieGroup in sumPerGroup)
	{
		Console.WriteLine($"Movies in year {movieGroup.Year} made {movieGroup.SumOfRevenue:C}");
	}
}
Code language: C# (cs)

This generates the following SQL query with a GROUP BY HAVING:

SELECT [m].[YearOfRelease] AS [Year], COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) AS [SumOfRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
HAVING COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) >= 1000000000.0
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

Movies in year 2014 made $1,316,800,000.00
Movies in year 2015 made $1,041,800,000.00Code language: plaintext (plaintext)

Notice it filtered out the year with less than $1 billion in revenue.

SELECT AVG()

Select the average box office revenue.

Note: Even though this is just an example to show how to execute AVG(), it should be noted that average box office revenue is a bad statistic for movies. Movies operate in a winner-take-all environment, so the box office revenue does not have a Gaussian distribution, hence the average is meaningless.

Total average

Select the average box office revenue for all movies in the Movies table:

using (var context = new StreamingServiceContext(connectionString))
{
	var average = await context.Movies.AverageAsync(t => t.BoxOfficeRevenue);

	Console.WriteLine($"The movies made an average of {average:C}");
}
Code language: C# (cs)

This generates the following SQL query:

SELECT AVG([m].[BoxOfficeRevenue])
FROM [Movies] AS [m]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

The movies made an average of $354,287,500.00Code language: plaintext (plaintext)

Average per group

Select the average box office revenue made per year:

using (var context = new StreamingServiceContext(connectionString))
{
	var averagePerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new 
		{ 
			Year = movieGroup.Key, 
			AverageRevenue = movieGroup.Average(t => t.BoxOfficeRevenue) 
		})
		.ToListAsync();

	foreach (var movieGroup in averagePerGroup)
	{
		Console.WriteLine($"Movies in year {movieGroup.Year} made an average of {movieGroup.AverageRevenue:C}");
	}
}
Code language: C# (cs)

This generates the following SQL query with a GROUP BY:

SELECT [m].[YearOfRelease] AS [Year], AVG([m].[BoxOfficeRevenue]) AS [AverageRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

Movies in year 1999 made an average of $237,850,000.00
Movies in year 2014 made an average of $438,933,333.33
Movies in year 2015 made an average of $347,266,666.67Code language: plaintext (plaintext)

Average per group having condition

Select the average box office revenue per year, but exclude years where the average was less than $250 million:

using (var context = new StreamingServiceContext(connectionString))
{
	var averagePerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new 
		{ 
			Year = movieGroup.Key, 
			AverageRevenue = movieGroup.Average(t => t.BoxOfficeRevenue) 
		})
		.Where(movieGroup => movieGroup.AverageRevenue >= 250_000_000)
		.ToListAsync();

	foreach (var movieGroup in averagePerGroup)
	{
		Console.WriteLine($"Movies in year {movieGroup.Year} made an average of {movieGroup.AverageRevenue:C}");
	}
}
Code language: C# (cs)

This generates the following SQL query with a GROUP BY HAVING:

SELECT [m].[YearOfRelease] AS [Year], AVG([m].[BoxOfficeRevenue]) AS [AverageRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
HAVING AVG([m].[BoxOfficeRevenue]) >= 250000000.0
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

Movies in year 2014 made an average of $438,933,333.33
Movies in year 2015 made an average of $347,266,666.67Code language: plaintext (plaintext)

Notice that it filtered out the year with less than an average of $250 million.

SELECT AVG(), COUNT(), and SUM()

What if you want to include multiple aggregates? You can add as many as you want.

The following selects the average and total box office revenue per year and how many movies were made in that year:

using (var context = new StreamingServiceContext(connectionString))
{
	var statsPerGroup = await context.Movies
		.GroupBy(t => t.YearOfRelease)
		.Select(movieGroup => new 
		{
			Year = movieGroup.Key,
			AverageRevenue = movieGroup.Average(t => t.BoxOfficeRevenue),
			SumOfRevenue = movieGroup.Sum(t => t.BoxOfficeRevenue),
			Count = movieGroup.Count()
		})
		.ToListAsync();

	foreach (var movieGroup in statsPerGroup)
	{
		Console.WriteLine($"There were {movieGroup.Count} movies in in year {movieGroup.Year}. They made an average of {movieGroup.AverageRevenue:C} and a total of {movieGroup.SumOfRevenue:C}");
	}
}
Code language: C# (cs)

This generates the following SQL query with all three aggregate functions included:

SELECT [m].[YearOfRelease] AS [Year], AVG([m].[BoxOfficeRevenue]) AS [AverageRevenue], COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) AS [SumOfRevenue], COUNT(*) AS [Count]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)

Executing this outputs the following to the console:

There were 2 movies in in year 1999. They made an average of $237,850,000.00 and a total of $475,700,000.00
There were 3 movies in in year 2014. They made an average of $438,933,333.33 and a total of $1,316,800,000.00
There were 3 movies in in year 2015. They made an average of $347,266,666.67 and a total of $1,041,800,000.00Code language: plaintext (plaintext)

Comments are closed.