In this article, I’ll show how to use EF Core to aggregate data for the whole table, per group, and how to only include groups that meet a condition. I’ll show three different SQL aggregate functions – count, sum, and average.
In each scenario, I’ll show the LINQ query, the SQL query it generated, and the results of executing the query. I always use SQL Profiler to examine the generated SQL query and verify it for correctness, and I’d recommend you to do the same.
Table of Contents
Movies table sample data
In all of the examples, I’ll be using the following (very small) set of movie data.
Movie | YearOfRelease | BoxOfficeRevenue |
Office Space | 1999 | $12.2 million |
Mad Max: Fury Road | 2015 | $375.4 million |
John Wick | 2014 | $86 million |
The Matrix | 1999 | $463.5 million |
The Big Short | 2015 | $133.4 million |
The Revenant | 2015 | $533 million |
Godzilla | 2014 | $529 million |
Interstellar | 2014 | $701.8 million |
SELECT COUNT()
Select the number of movies.
Total count
Select the total number of movies in the Movies table:
using (var context = new StreamingServiceContext(connectionString))
{
var count = await context.Movies.CountAsync();
Console.WriteLine($"There are {count} movies");
}
Code language: C# (cs)
This generates the following SQL query:
SELECT COUNT(*)
FROM [Movies] AS [m]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
There are 8 movies
Code language: plaintext (plaintext)
Count per group
Select the number of movies made per year:
using (var context = new StreamingServiceContext(connectionString))
{
var countPerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new { Year = movieGroup.Key, Count = movieGroup.Count() })
.ToListAsync();
foreach(var movieGroup in countPerGroup)
{
Console.WriteLine($"Year {movieGroup.Year} has {movieGroup.Count} movie(s)");
}
}
Code language: C# (cs)
This generates the following SQL query with a GROUP BY:
SELECT [m].[YearOfRelease] AS [Year], COUNT(*) AS [Count]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
Year 1999 has 2 movie(s)
Year 2014 has 3 movie(s)
Year 2015 has 3 movie(s)
Code language: plaintext (plaintext)
Count per group having condition
Select the number of movies per year, but only include years that had more than two movies:
using (var context = new StreamingServiceContext(connectionString))
{
var countPerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new { Year = movieGroup.Key, Count = movieGroup.Count() })
.Where(movieGroup => movieGroup.Count > 2)
.ToListAsync();
foreach (var movieGroup in countPerGroup)
{
Console.WriteLine($"Year {movieGroup.Year} has {movieGroup.Count} movie(s). ");
}
}
Code language: C# (cs)
This generates the following SQL query with a GROUP BY HAVING:
SELECT [m].[YearOfRelease] AS [Year], COUNT(*) AS [Count]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
HAVING COUNT(*) > 2
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
Year 2014 has 3 movie(s).
Year 2015 has 3 movie(s).
Code language: plaintext (plaintext)
Notice that it filtered out years where less than three movies were made.
SELECT SUM()
Select the sum of box office revenue for movies.
Total sum
Select the total box office revenue for all movies in the Movies table:
using (var context = new StreamingServiceContext(connectionString))
{
var sumOfRevenue = await context.Movies.SumAsync(t=>t.BoxOfficeRevenue);
Console.WriteLine($"The movies made {sumOfRevenue:C} total");
}
Code language: C# (cs)
This generates the following SQL query:
SELECT COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0)
FROM [Movies] AS [m]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
The movies made $2,354,189,870.00 total
Code language: plaintext (plaintext)
Sum per group
Select the total box office revenue per year:
using (var context = new StreamingServiceContext(connectionString))
{
var sumPerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new
{
Year = movieGroup.Key,
SumOfRevenue = movieGroup.Sum(t => t.BoxOfficeRevenue)
})
.ToListAsync();
foreach (var movieGroup in sumPerGroup)
{
Console.WriteLine($"Movies in year {movieGroup.Year} made {movieGroup.SumOfRevenue:C}");
}
}
Code language: C# (cs)
This generates the following SQL query with a GROUP BY:
SELECT [m].[YearOfRelease] AS [Year], COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) AS [SumOfRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
Movies in year 1999 made $475,700,000.00
Movies in year 2014 made $1,316,800,000.00
Movies in year 2015 made $1,041,800,000.00
Code language: plaintext (plaintext)
Sum per group having condition
Select total box office revenue per year, excluding years with less than $1 billion in revenue:
using (var context = new StreamingServiceContext(connectionString))
{
var sumPerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new
{
Year = movieGroup.Key,
SumOfRevenue = movieGroup.Sum(t => t.BoxOfficeRevenue)
})
.Where(movieGroup => movieGroup.SumOfRevenue >= 1_000_000_000)
.ToListAsync();
foreach (var movieGroup in sumPerGroup)
{
Console.WriteLine($"Movies in year {movieGroup.Year} made {movieGroup.SumOfRevenue:C}");
}
}
Code language: C# (cs)
This generates the following SQL query with a GROUP BY HAVING:
SELECT [m].[YearOfRelease] AS [Year], COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) AS [SumOfRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
HAVING COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) >= 1000000000.0
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
Movies in year 2014 made $1,316,800,000.00
Movies in year 2015 made $1,041,800,000.00
Code language: plaintext (plaintext)
Notice it filtered out the year with less than $1 billion in revenue.
SELECT AVG()
Select the average box office revenue.
Note: Even though this is just an example to show how to execute AVG(), it should be noted that average box office revenue is a bad statistic for movies. Movies operate in a winner-take-all environment, so the box office revenue does not have a Gaussian distribution, hence the average is meaningless.
Total average
Select the average box office revenue for all movies in the Movies table:
using (var context = new StreamingServiceContext(connectionString))
{
var average = await context.Movies.AverageAsync(t => t.BoxOfficeRevenue);
Console.WriteLine($"The movies made an average of {average:C}");
}
Code language: C# (cs)
This generates the following SQL query:
SELECT AVG([m].[BoxOfficeRevenue])
FROM [Movies] AS [m]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
The movies made an average of $354,287,500.00
Code language: plaintext (plaintext)
Average per group
Select the average box office revenue made per year:
using (var context = new StreamingServiceContext(connectionString))
{
var averagePerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new
{
Year = movieGroup.Key,
AverageRevenue = movieGroup.Average(t => t.BoxOfficeRevenue)
})
.ToListAsync();
foreach (var movieGroup in averagePerGroup)
{
Console.WriteLine($"Movies in year {movieGroup.Year} made an average of {movieGroup.AverageRevenue:C}");
}
}
Code language: C# (cs)
This generates the following SQL query with a GROUP BY:
SELECT [m].[YearOfRelease] AS [Year], AVG([m].[BoxOfficeRevenue]) AS [AverageRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
Movies in year 1999 made an average of $237,850,000.00
Movies in year 2014 made an average of $438,933,333.33
Movies in year 2015 made an average of $347,266,666.67
Code language: plaintext (plaintext)
Average per group having condition
Select the average box office revenue per year, but exclude years where the average was less than $250 million:
using (var context = new StreamingServiceContext(connectionString))
{
var averagePerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new
{
Year = movieGroup.Key,
AverageRevenue = movieGroup.Average(t => t.BoxOfficeRevenue)
})
.Where(movieGroup => movieGroup.AverageRevenue >= 250_000_000)
.ToListAsync();
foreach (var movieGroup in averagePerGroup)
{
Console.WriteLine($"Movies in year {movieGroup.Year} made an average of {movieGroup.AverageRevenue:C}");
}
}
Code language: C# (cs)
This generates the following SQL query with a GROUP BY HAVING:
SELECT [m].[YearOfRelease] AS [Year], AVG([m].[BoxOfficeRevenue]) AS [AverageRevenue]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
HAVING AVG([m].[BoxOfficeRevenue]) >= 250000000.0
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
Movies in year 2014 made an average of $438,933,333.33
Movies in year 2015 made an average of $347,266,666.67
Code language: plaintext (plaintext)
Notice that it filtered out the year with less than an average of $250 million.
SELECT AVG(), COUNT(), and SUM()
What if you want to include multiple aggregates? You can add as many as you want.
The following selects the average and total box office revenue per year and how many movies were made in that year:
using (var context = new StreamingServiceContext(connectionString))
{
var statsPerGroup = await context.Movies
.GroupBy(t => t.YearOfRelease)
.Select(movieGroup => new
{
Year = movieGroup.Key,
AverageRevenue = movieGroup.Average(t => t.BoxOfficeRevenue),
SumOfRevenue = movieGroup.Sum(t => t.BoxOfficeRevenue),
Count = movieGroup.Count()
})
.ToListAsync();
foreach (var movieGroup in statsPerGroup)
{
Console.WriteLine($"There were {movieGroup.Count} movies in in year {movieGroup.Year}. They made an average of {movieGroup.AverageRevenue:C} and a total of {movieGroup.SumOfRevenue:C}");
}
}
Code language: C# (cs)
This generates the following SQL query with all three aggregate functions included:
SELECT [m].[YearOfRelease] AS [Year], AVG([m].[BoxOfficeRevenue]) AS [AverageRevenue], COALESCE(SUM([m].[BoxOfficeRevenue]), 0.0) AS [SumOfRevenue], COUNT(*) AS [Count]
FROM [Movies] AS [m]
GROUP BY [m].[YearOfRelease]
Code language: SQL (Structured Query Language) (sql)
Executing this outputs the following to the console:
There were 2 movies in in year 1999. They made an average of $237,850,000.00 and a total of $475,700,000.00
There were 3 movies in in year 2014. They made an average of $438,933,333.33 and a total of $1,316,800,000.00
There were 3 movies in in year 2015. They made an average of $347,266,666.67 and a total of $1,041,800,000.00
Code language: plaintext (plaintext)
My question applies for all examples, but to focus on just one:
The first line of the last example says “There were two movies in year 1999”
What if we want to display those two movies like this:
Office Space 1999 $12.2 million
The Matrix 1999 $463.5 million
There were 2 movies in the year 1999
Hi John,
To get the individual records, you’d use a regular SELECT + WHERE query (instead of an aggregate SELECT query as shown in this article). You may be interested in checking out this other article about regular SELECT queries: EF Core – Basic SELECT queries.
To get the results you’re asking about, you’d do the following:
int year = 1999;
using (var context = new StreamingServiceContext(connectionString))
{
var movies = await context.Movies
.Where(m => m.YearOfRelease == year)
.AsNoTracking()
.ToListAsync();
foreach (var movie in movies)
{
Console.WriteLine($"{movie.Name} {movie.YearOfRelease} {movie.BoxOfficeRevenue/1_000_000:C} million");
}
Console.WriteLine($"There were {movies.Count} movies in the year {year}");
}
This outputs:
Office Space 1999 $12.20 million
The Matrix 1999 $463.50 million
There were 2 movies in the year 1999