Codx

EF Core simple performance optimization (Part #1)

Cover Image for EF Core simple performance optimization (Part #1)
Péter Magyar
Péter Magyar

This is a multi part tips and this is the first part of it.


When we are working on an application most of the time we are facing the issue, that some of codebase are not very performant. It can be simply fixed if we just restructure our queries and separate the write and read functions. To achieve that here are a few tips:

AsNoTracking()

Please not that Keyless entity types are never tracked. Wherever this article mentions entity types, it refers to entity types which have a key defined.

It really fits for ready-only scenarios, in general it is quicker and the reason for that is no need to set up a change tracking for it.

var blogs = context.Blogs
.AsNoTracking()
.ToList();

You can configure your context as the AsNoTracking behaviour will be the default, you need to add the following code to you builder:


protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=EFQuerying.Tracking;Trusted_Connection=True")
        .UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking);
}

It does not mean that from now on you only have the AsNoTracking(), where it is needed you can add the AsTracking() to be able to track those specifiy queries.

Loading types

With EF Core you have the ability to use the following loading types:

  • Eager loading - data loaded in the initial state
  • Lazy loading - data loaded from the database when the navigation porperty is accessed
  • Explicit loading - data loaded explicitly from the database at a later time.

Split Queries

In my opinion this is a great solution for big single queries. With that you'll be able to chunk that query into smaller ones. Why this is important? Probably you heard about the Cartasina Explosion. Let's give you an example. If you have a table and joins with another table they'll on the same level, so the database engine will join each row with each ro (cross product). Meaning that if you have 10 rows in table a and 10 rows in table b then you will have a result set which contains a 100 rows. This can be a major perfomance issue if you try to use more Includes (Joins) in your query.

Another issue can be happen with single queries is data duplication. What this is really means?

Let's have another example to showcase the problem.

You have a blog and post tables. So you will have a duplication regarding the blog properties in each posts that the blog has. It occures a big perfomance issue when you have columns like binary data, huge text in the blog table. (In general in you don't have this kind of data stored in you blog table, then is won't occure any perf. issue)

So the take away with this is to query just the needed infomration to more to keep the performance level optimal.

And we arrived the split queries work around. EF allows you to split that single query instead of using Joins, split queries generate additional SQL query for each included collection navigation:

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
        .AsSplitQuery()
        .ToList();
}

It will create the following native sql:

SELECT [b].[BlogId], [b].[OwnerId], [b].[Rating], [b].[Url]
FROM [Blogs] AS [b]
ORDER BY [b].[BlogId]

SELECT [p].[PostId], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Rating], [p].[Title], [b].[BlogId]
FROM [Blogs] AS [b]
INNER JOIN [Posts] AS [p] ON [b].[BlogId] = [p].[BlogId]
ORDER BY [b].[BlogId]

Please keep in mind if you queries containts Skip and Take (Offset pagination) you need to give special attention to make your query order fully unique! If not you can get back completely wrong data.

Pagination

Another neat function that we can use Skip and Take to create pagination within in EF Core.

You can use:

Offset Pagination - Probably the most common way to implement the pagination to use the skip and take.

This technique very intuitive and common there are a few setbacks.

  • If we skip the first 20 row we need still have to process the first 20 rows, even if we don't need them. Probably you might know that this is increasing the computation.
  • If any updates occur concurrently, the pagination may end up skipping certain entries or showing them twice.
var position = 20;
var nextPage = context.Posts
    .OrderBy(b => b.PostId)
    .Skip(position)
    .Take(10)
    .ToList();

Keyset Pagination -- or seek based pagination

This is a recommended alternative to off-set pagination. It is simply uses the where clause to skip the rows, instead of an offset, so with that you can "jump" over the not necessarry rows. It is efficient if we are assuming that we defined the PostId and above that it is not sensitive to any concurrent changes happening with lower Ids (values)

It is good for:

  • Backwards and Forwards pagination

There is one setback is when you want to jump to a specific page you need to use the off-set pagination.

Note that, always take into account to use off-set pagination and it is really necessary to implement. When the answer is yes, please make sure you are building a robust mechanism to solve the given business issue.

Multiple pagination keys

If you are using the keyset pagination, frequentlty necessarry to order by mor than one prop. For i.e:

var lastDate = new DateTime(2020, 1, 1);
var lastId = 55;
var nextPage = context.Posts
    .OrderBy(b => b.Date)
    .ThenBy(b => b.PostId)
    .Where(b => b.Date > lastDate || (b.Date == lastDate && b.PostId > lastId))
    .Take(10)
    .ToList();

Most SQL databases support a simpler and more efficient version of the above, using row values: WHERE (Date, Id) > (@lastDate, @lastId). EF Core does not currently support expressing this in LINQ queries, this is tracked by #26822.

This one has been opened since Nov 24, 2021, so don't be so sure that this will be solved pretty soon.

Mappings / Raw SQL Queries

Todays big question is do we need to use some sort of mapping function to reduce the query size etc. I can say that it depends. but most of the time I prefer to do by manually the mappings.

There are a few recommended like:

  • Mapster
  • AutoMapper

But after EF Core 8, I think we had the best of both worlds, a new feature called Raw SQL queries for unmapped types were introduced.

The type used must have a property for every value in the result set, but do not need to match any table in the database. For example, the following type represents only a subset of information for each post, and includes the blog name, which comes from the table.


That's it for the first part, there are a tons os wort to mention tips and optimizations in EF Core. Stay tuned for the next part!