Monday, June 8, 2020

Is the repository pattern useful with Entity Framework Core?


  1. Original: Analysing whether the Repository pattern is useful with Entity Framework (May 2014).
  2. First solution: Four months on – my solution to replacing the Repository pattern (Sept 2014).
  3. THIS ARTICLE: Is the repository pattern useful with Entity Framework Core?
  4. Architecture of Business Layer working with Entity Framework (Core and v6).
  5. Creating Domain-Driven Design entity classes with Entity Framework Core.
  6. GenericServices: A library to provide CRUD front-end services from an EF Core database.
  7. Wrapping your business logic with anti-corruption layers – .NET Core.

TL;DR – summary

No, the repository/unit-of-work pattern (shortened to Rep/UoW) isn’t useful with EF Core. EF Core already implements a Rep/UoW pattern, so layering another Rep/UoW pattern on top of EF Core isn’t helpful.

A better solution is to use EF Core directly, which allows you to use all of EF Core’s features to produce high-performing database accesses.

The aims of this article

This article looks at

  • What people are saying about the Rep/UoW pattern with EF.
  • The pros and cons of using a Rep/UoW pattern with EF.
  • Three ways to replace the Rep/UoW pattern with EF Core code.
  • How to make your EF Core database access code easy to find and refactor.
  • A discussion on unit testing EF Core code.

I’m going to assume you are familiar with C# code and either the Entity Framework 6 (EF6.x) or Entity Framework Core library. I do talk specifically about EF Core, but most of the article is also relevant to EF6.x.

Setting the scene

In 2013 I started work on a large web application specifically for healthcare modelling. I used ASP.NET MVC4 and EF 5, which had just come out and supported SQL Spatial types, which handle geographic data. At that time the prevalent database access pattern was the Rep/UoW pattern – see this article written by Microsoft in 2013 on database access using EF and the Rep/UoW pattern.

I built my application using Rep/UoW, but found it a real pain point during development. I was constantly having to ‘tweak’ the repository code to fix little problems, and each ‘tweak’ could break something else! It was this that made me research how to better implement my database access code.

Coming more up to date, I was contracted by a start-up company at the end of 2017 to help with a performance issue with their EF6.x application. The main part of the performance issue turned out to be due to lazy loading, which was needed because the application used the Rep/UoW pattern.

It turns out that a programmer who helped start the project had used the Rep/UoW pattern. On talking to the founder of the company, who is very tech savvy, he said he found the Rep/UoW part of the application quite opaque and hard to work with.

What people are saying against the repository pattern

While researching as part of my review of the current Spatial Modeller™ design I found some blog posts that make a compelling case for ditching the repository. The most cogent and well thought-out post of this kind is Rob Conery’s ‘Repositories On Top UnitOfWork Are Not a Good Idea’. His main point is that the Rep/UoW just duplicates what Entity Framework’s (EF) DbContext gives you anyway, so why hide a perfectly good framework behind a façade that adds no value – what Rob calls ‘this over-abstraction silliness’.

Another blog post is ‘Why Entity Framework renders the Repository pattern obsolete’. In it, Isaac Abraham adds that the repository doesn’t make testing any easier, which is one thing it was supposed to do. This is even truer with EF Core, as you will see later.

So, are they right?

My views on the pros and cons of repository/unit-of-work pattern

Let me try to review the pros and cons of the Rep/UoW pattern in as even-handed a way as I can. Here are my views.

The good parts of the Rep/UoW pattern (best first)

  1. Isolate your database code: The big plus of a repository pattern is that you know where all your database access code is. Also, you normally split your repository into sections, like the Catalogue Repository, the Order Processing Repository, etc., which makes it easy to find the code for a specific query that has a bug or needs performance tuning. That is definitely a big plus.
  2. Aggregation: Domain-Driven Design (DDD) is a way to design systems, and it suggests that you have a root entity, with other associated entities grouped to it. The example I use in my book “Entity Framework Core in Action” is a Book entity with a collection of Review entities. The reviews only make sense when linked to a book, so DDD says you should only alter the Reviews via the Book entity. The Rep/UoW pattern does this by providing methods to add/remove reviews in the Book Repository.
  3. Hiding complex T-SQL commands: Sometimes you need to bypass EF Core’s cleverness and use T-SQL. This type of access should be hidden from higher layers, yet easy to find to help with maintenance/refactoring. I should point out that the Command/Query Objects in Rob Conery’s post can also handle this.
  4. Easy to mock/test: It is easy to mock an individual repository, which makes unit testing code that accesses the database easier. This was true some years ago, but nowadays there are other ways around this problem, which I will describe later.

You will note that I haven’t listed “replacement of EF Core with another database access library”. This is one of the ideas behind the Rep/UoW, but in my view it’s a misconception, because a) it’s very hard to replace a database access library, and b) are you really going to swap such a key library in your application? You wouldn’t put up a facade around ASP.NET or React.js, so why do that to your database access library?

The bad parts of the Rep/UoW pattern (worst first)

The first three items are all about performance. I’m not saying you can’t write an efficient Rep/UoW, but it’s hard work, and I see many implementations that have built-in performance issues (including Microsoft’s old Rep/UoW implementation). Here is my list of the bad issues I find with the Rep/UoW pattern:

    1. Performance – handling relationships: A repository normally returns an IEnumerable/IQueryable result of one type, for instance, in the Microsoft example, a Student entity class. Say you want to show information from a relationship the Student has, such as their address? The easiest way in a repository is to use lazy loading to read in the student’s address entity, and I see people doing this a lot. The problem is lazy loading causes a separate round-trip to the database for every relationship it loads, which is slower than combining all your database accesses into one database round-trip. (The alternative is to have multiple query methods with different returns, but that makes your repository very large and cumbersome – see point 4.)
    2. Data not in the required format: Because the repository assembly is normally created near the database assembly, the data returned might not be in the exact format the service or user needs. You might be able to adapt the repository output, but that is a second stage you have to write. I think it is much better to form your query closer to the front-end and include any adaption of the data you need (see more on this in the section “Service Layer” in one of my articles).
    3. Performance – update: Many Rep/UoW implementations try to hide EF Core and, in doing so, don’t make use of all its features. For instance, a Rep/UoW would update an entity using EF Core’s Update method, which saves every property in the entity, whereas EF Core’s built-in change tracking will only update the properties that have changed.
    4. Too generic: “The more reusable the code is, the less usable it is.” – Neal Ford, from the book Building Evolutionary Architectures. The allure of the Rep/UoW comes from the view that you can write one generic repository and then use that to build all your sub-repositories, for instance the Catalogue Repository, Order Processing Repository, etc. That should minimise the code you need to write, but my experience is that a generic repository works at the beginning, but as things get more complex you end up having to add more and more code to each individual repository.
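To make point 1 concrete, here is a short sketch of the two loading styles (the Student/Address classes follow the Microsoft example mentioned above; the repository and context variables are assumptions):

```csharp
// Lazy loading via a repository: one query for the students,
// then one hidden round-trip per student to load its Address.
var students = studentRepository.GetAll().ToList();
foreach (var student in students)
    Console.WriteLine(student.Address.City); // extra database access each time

// Direct EF Core with eager loading: a single round-trip
// that brings the Address relationship back with the students.
var studentsWithAddresses = context.Students
    .AsNoTracking()
    .Include(s => s.Address)
    .ToList();
```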

To sum up the bad parts – a Rep/UoW hides EF Core, which means you can’t use EF Core’s features to produce simple, but efficient database access code.

How to use EF Core, but still benefit from the good parts of the Rep/UoW pattern

In the previous “good parts” section I listed isolation, aggregation, hiding, and unit testing, which a Rep/UoW did well. In this section I’m going to talk about a number of different software patterns and practices which, when combined with a good architectural design, provide the same isolation, aggregation, etc. features when you are using EF Core directly.

I will explain each one and then pull them together in a layered software architecture.

1. Query objects: a way to isolate and hide database read code.

Database accesses can be broken down into four types: Create, Read, Update and Delete – known as CRUD. For me the read part, known as a query in EF Core, is often the hardest to build and performance tune. Many applications rely on good, fast queries, such as a list of products to buy, a list of things to do, and so on. The answer that people have come up with is query objects.

I first came across them in 2013 in Rob Conery’s article (mentioned earlier), where he refers to Command/Query Objects. Also, Jimmy Bogard produced a post in 2012 called ‘Favor query objects over repositories’. Using .NET’s IQueryable type and extension methods, we can improve the query object pattern over Rob and Jimmy’s examples.

The listing below gives a simple example of a query object that can select the order in which a list of integers is sorted.

public static class MyLinqExtension
{
    public static IQueryable<int> MyOrder
        (this IQueryable<int> queryable, bool ascending)
    {
        return ascending
            ? queryable.OrderBy(num => num)
            : queryable.OrderByDescending(num => num);
    }
}

And here is an example of how the MyOrder query object is called.

var numsQ = new[] { 1, 5, 4, 2, 3 }.AsQueryable();
 
var result = numsQ
    .MyOrder(true)  
    .Where(x => x > 3) 
    .ToArray();

The MyOrder query object works because the IQueryable type holds a list of commands, which are executed when I apply the ToArray method. In my simple example I’m not using a database, but if we replaced the numsQ variable with a DbSet<T> property from the application’s DbContext, then the commands in the IQueryable<T> type would be converted to database commands.

Because the IQueryable<T> type isn’t executed until the end, you can chain multiple query objects together. Let me give you a more complex example of a database query from my book “Entity Framework Core in Action”. The code below uses four query objects chained together to select, sort, filter and page the data on some books. You can see this in action on the live site http://efcoreinaction.com/.

public IQueryable<BookListDto> SortFilterPage
    (SortFilterPageOptions options)
{
    var booksQuery = _context.Books         
        .AsNoTracking()                     
        .MapBookToDto()                     
        .OrderBooksBy(options.OrderByOptions)
        .FilterBooksBy(options.FilterBy,    
                       options.FilterValue);
 
    options.SetupRestOfDto(booksQuery);     
 
    return booksQuery.Page(options.PageNum-1,
                           options.PageSize);
}

Query objects provide even better isolation than the Rep/UoW pattern because you can split up complex queries into a series of query objects that you can chain together. This makes them easier to write/understand, refactor and test. Also, if you have a query that needs raw SQL you can use EF Core’s FromSql method, which also returns IQueryable<T>.

2. Approaches to handling Create, Update and Delete database accesses

The query objects handle the read part of CRUD, but what about the Create, Update and Delete parts, where you write to the database? I’m going to show you two approaches to running a CUD action: direct use of EF Core commands, and using DDD methods in the entity class. Let’s look at a very simple example of an update: adding a review in my book app (see http://efcoreinaction.com/).

Note: If you want to try adding a review you can do that. There is a GitHub repo that goes with my book at https://github.com/JonPSmith/EfCoreInAction. To run the ASP.NET Core application, a) clone the repo, b) select branch Chapter05 (every chapter has a branch) and c) run the application locally. You will see an Admin button appear next to each book, with a few CUD commands.

Option 1 – direct use of EF Core commands

The most obvious approach is to use EF Core methods to do the update of the database. Here is a method that would add a new review to a book, with the review information provided by the user. Note: the ReviewDto is a class that holds the information returned by the user after they have filled in the review information.

public Book AddReviewToBook(ReviewDto dto)
{
    var book = _context.Books
        .Include(r => r.Reviews)
        .Single(k => k.BookId == dto.BookId);
    var newReview = new Review(dto.numStars, dto.comment, dto.voterName);
    book.Reviews.Add(newReview);
    _context.SaveChanges();
    return book;
}

The steps are:

  • Lines 3 to 5: load the specific book, defined by the BookId in the review input, with its list of reviews
  • Lines 6 to 7: create a new review and add it to the book’s list of reviews
  • Line 8: the SaveChanges method is called, which updates the database.

NOTE: The AddReviewToBook method is in a class called AddReviewService, which lives in my ServiceLayer. This class is registered as a service and has a constructor that takes the application’s DbContext, which is injected by dependency injection (DI). The injected value is stored in the private field _context, which the AddReviewToBook method can use to access the database.

This will add the new review to the database. It works, but there is another way to build this using a more DDD approach.

Option 2 – DDD-styled entity classes

EF Core offers a new place to put your update code: inside the entity class. EF Core has a feature called backing fields that makes building DDD entities possible. Backing fields allow you to control access to any relationship. This wasn’t really possible in EF6.x.

DDD talks about aggregation (mentioned earlier), and says that all aggregates should only be altered via a method in the root entity, which I refer to as access methods. In DDD terms the reviews are an aggregate of the book entity, so we should add a review via an access method called AddReview in the Book entity class. This changes the code above to call a method in the Book entity, as shown here:

public Book AddReviewToBook(ReviewDto dto)
{
    var book = _context.Find<Book>(dto.BookId);
    book.AddReview(dto.numStars, dto.comment,
         dto.voterName, _context);
    _context.SaveChanges();
    return book;
}

The AddReview access method in the Book entity class would look like this:

public class Book
{
    private HashSet<Review> _reviews;
    public IEnumerable<Review> Reviews => _reviews?.ToList();
    //...other properties left out
 
    //...constructors left out
 
    public void AddReview(int numStars, string comment,
        string voterName, DbContext context = null)
    {
        if (_reviews != null)   
        {
            _reviews.Add(new Review(numStars, comment, voterName));  
        }
        else if (context == null)
        {
            throw new ArgumentNullException(nameof(context),
                "You must provide a context if the Reviews collection isn't valid.");
        }
        else if (context.Entry(this).IsKeySet) 
        {
            context.Add(new Review(numStars, comment, voterName, BookId));
        }
        else                                    
        {                                       
            throw new InvalidOperationException("Could not add a new review."); 
        }
    }
    //... other access methods left out
}

This method is more sophisticated because it can handle two different cases: one where the Reviews have been loaded and one where it hasn’t. But it is faster than the original case, as it uses a “create relationship via foreign keys” approach if the Reviews are not already loaded.

Because the access method code is inside the entity class it can be more complex if need be, because it’s going to be the ONLY version of that code you need to write (DRY). In option 1 you could have the same code repeated in different places, wherever you need to update the Book’s review collection.

NOTE: I have written an article called “Creating Domain-Driven Design entity classes with Entity Framework Core” all about DDD-styled entity classes. That has a much more detailed look at this topic. I have also updated my article on how to write business logic with EF Core to use the same DDD-styled entity classes.

Why doesn’t the method in the entity class call SaveChanges? In option 1 a single method contained all the parts: a) load entity, b) update entity, c) call SaveChanges to update the database. I could do that because I knew it was being called by a web action, and that was all I wanted to do.
With DDD entity methods you can’t call SaveChanges in the entity method because you can’t be sure the operation has finished. For instance, if you were loading a book from a backup you might want to create the book, add the authors, add any reviews, and then call SaveChanges so that everything is saved together.
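The backup example might look like this as code (a sketch only: the BackupDto class and the AddAuthor access method are assumptions, not code from the book):

```csharp
public Book RestoreBookFromBackup(BackupDto dto)
{
    var book = new Book(dto.Title, dto.Price);           // create the root entity
    foreach (var author in dto.Authors)
        book.AddAuthor(author.Name);                     // access method
    foreach (var review in dto.Reviews)
        book.AddReview(review.NumStars, review.Comment,
                       review.VoterName, _context);      // access method
    _context.Add(book);
    _context.SaveChanges(); // the book, its authors and its reviews are saved together
    return book;
}
```

Because SaveChanges is called once, at the end, the whole restore is committed as a single unit.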

Option 3: the GenericServices library

There is a third way. I noticed there was a standard pattern when using CRUD commands in the ASP.NET applications I was building, and back in 2014 I built a library called GenericServices, which works with EF6.x. In 2018 I built a more comprehensive version called EfCore.GenericServices for EF Core (see this article on EfCore.GenericServices).

These libraries don’t really implement a repository pattern, but act as an adapter pattern between the entity classes and the actual data that the front-end needs. I have used the original, EF6.x, GenericServices and it has saved me months of writing boring front-end code. The new EfCore.GenericServices is even better, as it can work with both standard styled entity classes and DDD-styled entity classes.

Which option is best?

Option 1 (direct EF Core code) has the least code to write, but there is a possibility of duplication, because different parts of the application may want to apply CUD commands to an entity. For instance, you might have an update via the ServiceLayer when the user changes things, but an external API might not go through the ServiceLayer, so you would have to repeat the CUD code.

Option 2 (DDD-styled entity classes) places the crucial update part inside the entity class, so the code is going to be available to anyone who can get an entity instance. In fact, because the DDD-styled entity class “locks down” access to properties and collections, everybody HAS to use the Book entity’s AddReview access method if they want to update the Reviews collection. For many reasons this is the approach I want to use in future applications (see my article for a discussion on the pros and cons). The (slight) downside is that it needs a separate load/save part, which means more code.

Option 3 (the EF6.x or EF Core GenericServices library) is my preferred approach, especially now I have built the EfCore.GenericServices version that handles DDD-styled entity classes. As you will see in the article about EfCore.GenericServices, this library drastically reduces the code you need to write in your web/mobile/desktop application. Of course, you still need to access the database in your business logic, but that is another story.

Organising your CRUD code

One good thing about the Rep/UoW pattern is that it keeps all your data access code in one place. When swapping to using EF Core directly, you could put your data access code anywhere, but that makes it hard for you or other team members to find it. Therefore, I recommend having a clear plan for where you put your code, and sticking to it.

The following figure shows a Layered or Hexagonal architecture, with only three assemblies shown (I have left out the business logic, and in a hexagonal architecture you will have more assemblies). The three assemblies shown are:

  • ASP.NET Core: This is the presentation layer, either providing HTML pages and/or a web API. It has no database access code but relies on the various methods in the ServiceLayer and BusinessLayer.
  • ServiceLayer: This contains the database access code, both the query objects and the Create, Update and Delete methods. The service layer uses an adapter pattern and command pattern to link the data layer and the ASP.NET Core (presentation) layer. (see this section from one of my articles about the service layer).
  • DataLayer: This contains the application’s DbContext and the entity classes. The DDD-styled entity classes then contain access methods to allow the root entity and its aggregates to be changed.

NOTE: The library GenericServices (EF6.x) and EfCore.GenericServices (EF Core) mentioned earlier are, in effect, a library that provides ServiceLayer features, i.e. that act as an adapter pattern and command pattern between the DataLayer and your web/mobile/desktop application.

The point I want to make from this figure is that, by using different assemblies, a simple naming standard (see the word Book in bold in the figure) and folders, you can build an application where your database code is isolated and easy to find. As your application grows this can be critical.

Unit testing methods that use EF Core

The final part to look at is unit testing applications that use EF Core. One of the pluses of a repository pattern is that you can replace it with a mock when testing. So, using EF Core directly removes the option of mocking (technically you could mock EF Core, but it’s very hard to do well).

Thankfully things have moved on with EF Core and you can simulate the database with an in-memory database. In-memory databases are quicker to create and have a default start point (i.e. empty), so it’s much easier to write tests against them. See my article, Using in-memory databases for unit testing EF Core applications, for a detailed look at how you can do that, plus a NuGet package called EfCore.TestSupport that provides methods to make writing EF Core unit tests quicker.
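As a sketch of what such a test can look like (using xUnit and the Microsoft.EntityFrameworkCore.InMemory provider; the EfCoreContext, Book constructor and ReviewDto details are assumptions based on the earlier examples):

```csharp
[Fact]
public void AddReviewToBook_AddsOneReview()
{
    // each test uses its own named, empty in-memory database
    var options = new DbContextOptionsBuilder<EfCoreContext>()
        .UseInMemoryDatabase(databaseName: nameof(AddReviewToBook_AddsOneReview))
        .Options;

    using (var context = new EfCoreContext(options))
    {
        context.Add(new Book("Test Book"));  // seed the database
        context.SaveChanges();

        var service = new AddReviewService(context);
        var dto = new ReviewDto { BookId = 1, numStars = 5,
                                  comment = "Great", voterName = "Jill" };

        var book = service.AddReviewToBook(dto);

        Assert.Single(book.Reviews);
    }
}
```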

Conclusions

My last project that used the Rep/UoW pattern was back in 2013, and I have never used a Rep/UoW pattern since. I have tried a few approaches: a custom library called GenericServices with EF6.x, and now the more standard query objects and DDD entity methods with EF Core. They are easy to write and normally perform well, but if they are slow it’s easy to find and performance tune individual database accesses.

In the book I wrote for Manning Publications I have a chapter where I performance tune an ASP.NET Core application that “sells” books. That process used query objects and DDD entity methods and shows that they can produce great-performing database accesses (see my article Entity Framework Core performance tuning – a worked example for a summary).

My own work follows the query object pattern for reads and DDD-styled entity classes with their access methods for CUD and business logic. I do need to use these in a proper application to really know if they work, but it’s promising. Watch this space for more on DDD-styled entity classes, architectures that benefit from them, and maybe a new library :).

Happy coding!

Saturday, June 6, 2020

Advanced Use Cases for the Repository Pattern in .NET



In our previous article, Implementation Strategies for the Repository Pattern with Entity Framework, Dapper, and Chain, we looked at the basic patterns needed to implement a repository. In many cases these patterns were such a thin layer around the underlying data access technology they were essentially unnecessary. However, once you have a repository in place, many new opportunities become available.
When designing a repository, you should be thinking in terms of “what must happen”. For example, let us say you have a rule that whenever a record is updated, its “LastModifiedBy” column must be set to the current user. Rather than trying to remember to update the LastModifiedBy in application code before every save, you can bake that functionality right into the repository.

Note: where appropriate, this article will include code samples for Entity Framework, Dapper, and/or Tortuga Chain. However, you will find most repository features can be implemented in an ORM-agnostic fashion.

By treating your data access layer as a standalone library that manages all of the “must happen” details, you can dramatically reduce implementation errors. At the same time, you can simplify the code that is built on top of the repository, as it no longer needs to be concerned about bookkeeping tasks.

Audit Columns

Most applications eventually find the need to track who made changes to the database and when. For simple databases this takes the form of audit columns. The names vary, but they usually fall into these four roles:
  • Created by User Key
  • Created Date/Time
  • Last Modified by User Key
  • Last Modified Date/Time
Depending on the security requirements of the application, additional audit columns may be considered such as:
  • Deleted by User Key
  • Deleted Date/Time
  • [Created | Last Modified | Deleted] by Application Key
  • [Created | Last Modified | Deleted] by IP Address
The date columns are easy enough to handle behind the scenes, but for the user keys you need to do something a bit more interesting. What you need is a “context aware repository”.
Normally repositories are context free, meaning they have no information other than what’s absolutely necessary to connect to the database. When correctly designed, the repository can be entirely stateless, allowing you to share one instance across the whole application.
Context aware repositories are a bit more complex. They cannot be constructed until you know the context, which at the very least includes the currently active user’s id or key. For some applications, this is enough. For others, you may need to pass in an entire user object and/or an object representing the running application.
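A minimal sketch of a context aware repository (all class and column names here are assumptions):

```csharp
public class OrderRepository
{
    private readonly AppDbContext m_Context;
    private readonly int m_CurrentUserKey; // the context: who is acting

    // unlike a context free repository, this must be created per request
    public OrderRepository(AppDbContext context, int currentUserKey)
    {
        m_Context = context;
        m_CurrentUserKey = currentUserKey;
    }

    public void Update(Order order)
    {
        // the "must happen" audit rule is baked into the repository,
        // so callers cannot forget it
        order.UpdatedByKey = m_CurrentUserKey;
        order.UpdatedDate = DateTime.Now;
        m_Context.SaveChanges();
    }
}
```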

Chain

Chain has built-in support through a concept known as audit rules. Audit rules allow you to specify overrides based on a column name. Out of the box, it includes date-based rules and rules that copy a property from a user object into a column. Here is an example:
dataSource = dataSource.WithRules(
    new UserDataRule("CreatedByKey", "UserKey", OperationType.Insert),
    new UserDataRule("UpdatedByKey", "UserKey", OperationType.InsertOrUpdate),
    new DateTimeRule("CreatedDate", DateTimeKind.Local, OperationType.Insert),
    new DateTimeRule("UpdatedDate", DateTimeKind.Local, OperationType.InsertOrUpdate)
    );
As mentioned above, you are going to need a context aware repository for this to work. In the constructor below you can see how the context is passed to an immutable data source, creating a new data source with the necessary information.
public EmployeeRepository(DataSource dataSource, User user)
{
    m_DataSource = dataSource.WithUser(user);
}
With this set up, you can leverage your DI framework of choice to automatically create and populate the repository on a per-request basis.

Entity Framework

To globally apply audit columns in Entity Framework, you need to leverage the ObjectStateManager and create a specialized interface. The interface, or base class if you prefer, will look something like this:
public interface IAuditableEntity 
{
    DateTime CreatedDate {get; set;}
    DateTime UpdatedDate {get; set;}
    int CreatedByKey {get; set;}
    int UpdatedByKey {get; set;}
}
The interface (or base class) is then applied to every entity for which the database has matching audit columns.
Next you need to override the SaveChanges method of your DbContext class as follows.
public override int SaveChanges()
{
    // Get added entries
    IEnumerable<ObjectStateEntry> addedEntryCollection = ((IObjectContextAdapter)this)
        .ObjectContext
        .ObjectStateManager
        .GetObjectStateEntries(EntityState.Added)
        .Where(m => m != null && m.Entity != null);

    // Get modified entries
    IEnumerable<ObjectStateEntry> modifiedEntryCollection = ((IObjectContextAdapter)this)
        .ObjectContext
        .ObjectStateManager
        .GetObjectStateEntries(EntityState.Modified)
        .Where(m => m != null && m.Entity != null);

    // Set audit fields of added entries
    foreach (ObjectStateEntry entry in addedEntryCollection)
    {                
        var addedEntity = entry.Entity as IAuditableEntity;
        if (addedEntity != null)
        {
            addedEntity.CreatedDate = DateTime.Now;
            addedEntity.CreatedByKey = m_User.UserKey;
            addedEntity.UpdatedDate = DateTime.Now;
            addedEntity.UpdatedByKey = m_User.UserKey;
        }

    }

    // Set audit fields of modified entries
    foreach (ObjectStateEntry entry in modifiedEntryCollection)
    {
        var modifiedEntity = entry.Entity as IAuditableEntity;
        if (modifiedEntity != null)
        {
            modifiedEntity.UpdatedDate = DateTime.Now;
            modifiedEntity.UpdatedByKey = m_User.UserKey;
        }
    }
    return base.SaveChanges();
}
If you are going to be working a lot with EF, it really pays to become intimately familiar with the ObjectStateManager and its capabilities. This is where most of the useful metadata about transactions in progress can be found.
Finally, you need to modify the constructor of your data context (and possibly repository) to accept a user object.
While this looks like a lot of code, it only needs to be done once per EF data context. And as with the previous example, the actual creation of the data context and repository can be performed by your DI framework.

History Table

Tracking changes to records is often required due to local laws and regulations. Other times it is desirable simply to make diagnostics easier.
Our general recommendation is to simply allow the database to do this. Some databases have this capability built-in, which is often referred to as a temporal table. Others can emulate it using triggers. In either case, the application is unaware of the additional logging, which makes the technique far less error prone.
If for some reason you cannot use a temporal table or trigger, then the repository needs to explicitly write to the history table.
Regardless of where you put the code that maintains the history table, there are two basic conventions that you can follow. Consistency is really important here, as it would be quite confusing to have one convention for some tables and the other convention for the rest.
Copy before Write: In this convention you copy the old record from live table to the history table prior to performing the update or delete operation. This means the history table never contains the current record. As a result, you’ll need to join the live and history tables together to see a complete history of changes.
Write before Copy: Alternately you can update the live table first, then copy that row to the history table. This has the advantage of having a complete picture in the history table, eliminating the aforementioned join. The downside is it takes more space due to the duplicated data.
With either convention, you’ll want to use soft deletes in order to track who is actually deleting the row. If hard deletes are desired, then they should only be performed following a soft delete.
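If the repository must maintain the history table itself, the copy-before-write convention can be sketched as below. This is an illustrative sketch, not a prescribed implementation: the `Employee`/`EmployeeHistory` tables, the `m_Connection` field, and the column list are all assumptions, and the `Execute` calls assume a Dapper-style extension method on `IDbConnection`.

```csharp
public void Update(Employee employee)
{
    using (var transaction = m_Connection.BeginTransaction())
    {
        // Copy the current row to the history table before touching it,
        // so the history table never contains the current record.
        m_Connection.Execute(
            @"INSERT INTO EmployeeHistory (EmployeeKey, FirstName, LastName, ModifiedByKey, ModifiedDate)
              SELECT EmployeeKey, FirstName, LastName, ModifiedByKey, ModifiedDate
              FROM Employee WHERE EmployeeKey = @EmployeeKey",
            new { employee.EmployeeKey }, transaction);

        // Then perform the actual update against the live table.
        m_Connection.Execute(
            @"UPDATE Employee
              SET FirstName = @FirstName, LastName = @LastName,
                  ModifiedByKey = @ModifiedByKey, ModifiedDate = @ModifiedDate
              WHERE EmployeeKey = @EmployeeKey",
            employee, transaction);

        transaction.Commit();
    }
}
```

Wrapping both statements in one transaction ensures the history row and the live row can never get out of sync.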

Soft Deletes

One of the advantages of using a repository is you can switch from hard deletes to soft deletes without the rest of the application realizing it. A soft delete removes the record as far as the application is concerned, but allows it to remain in the database for auditing purposes. Optionally, an application can even undelete records.
To avoid data loss, applications should not be granted DELETE privileges on tables that support soft deletes. If the application accidentally tries to perform a hard delete, the permission check will raise an error instead of silently deleting the row.

Chain

Chain offers implicit soft delete support via the audit rules infrastructure. When configuring a soft delete rule, it is customary to also configure the matching audit columns:
dataSource = dataSource.WithRules(
    new SoftDeleteRule("DeletedFlag", true, OperationTypes.SelectOrDelete),
    new UserDataRule("DeletedByKey", "EmployeeKey", OperationTypes.Delete),
    new DateTimeRule("DeletedDate", DateTimeKind.Local, OperationTypes.Delete)
    );
Whenever a table is found with a soft delete column (DeletedFlag in this example), two things happen automatically:
  • All queries implicitly add “AND DeletedFlag = 0” to the WHERE clause.
  • All calls to DataSource.Delete become update statements to set the deleted flag.

Entity Framework

In Entity Framework, you can include an additional where clause on every query that reads from a table supporting soft deletes. You’ll also need to manually convert any delete operation into an update, which can be tricky when working with object graphs.
Another option takes quite a bit of work, but may be less error prone. It starts by explicitly listing every table that supports soft deletes in the DbContext.OnModelCreating override.
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
   modelBuilder.Entity<Employee>().Map(m => m.Requires("IsDeleted").HasValue(false));
}
You then need to override the SaveChanges method to ensure deletes become updates. Colin on Stack Overflow offers this pattern.
public override int SaveChanges()
{
    foreach (var entry in ChangeTracker.Entries()
             .Where(p => p.State == EntityState.Deleted
                    && p.Entity is ModelBase))
    {
        SoftDelete(entry);
    }
    return base.SaveChanges();
}

private void SoftDelete(DbEntityEntry entry)
{
    var e = (ModelBase)entry.Entity;
    string tableName = GetTableName(e.GetType());
    Database.ExecuteSqlCommand(
             String.Format("UPDATE {0} SET IsDeleted = 1 WHERE ID = @id", tableName)
             , new SqlParameter("id", e.ID));

    //Marking it Detached prevents the hard delete
    entry.State = EntityState.Detached;
}
You’ll want to read the rest of Colin’s answer, as there are a lot of edge cases to be addressed.

Access Logging

While audit columns, history tables, and soft deletes cover all of the write scenarios, there are times when you also need to log reads. An example of this is the US health care industry. Any doctor or nurse needs the ability to access any patient’s medical records in the event of an emergency. But in the normal course of business, they are only legally allowed to do so when they are actively treating that patient.
Since the records cannot be fully locked down, the next best thing is to track who is reading each record. This can be easily accomplished at the repository level by logging each query involving sensitive data. This is most easily done manually at the top of the relevant repository methods.
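Such a read-access log entry might look like the sketch below. The `Patient` class, `AccessLog` table, and `m_User`/`m_Connection` fields are illustrative, and the `Execute`/`QuerySingle` calls assume Dapper-style extension methods:

```csharp
public Patient GetPatient(int patientKey)
{
    // Record who read this patient's record before returning any data.
    m_Connection.Execute(
        @"INSERT INTO AccessLog (UserKey, PatientKey, AccessDate)
          VALUES (@UserKey, @PatientKey, @AccessDate)",
        new { m_User.UserKey, PatientKey = patientKey, AccessDate = DateTime.UtcNow });

    return m_Connection.QuerySingle<Patient>(
        "SELECT * FROM Patient WHERE PatientKey = @patientKey",
        new { patientKey });
}
```

Because the logging call sits at the top of the repository method, no screen or page can read the record without leaving an audit trail.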

Performance Logging

When user experience is a feature, it is important to know how much time is being spent on a per-query basis. Merely tracking per-page performance isn’t enough, as one page may involve multiple queries. This is especially true of Entity Framework, as lazy-loading can hide database calls.

Explicit Logging in the Repository

Though it is tedious and easy to miss a query, one can simply wrap every query in a disposable timer. The pattern is as follows:
public class OperationTimer : IDisposable
{
    readonly object m_Context;
    readonly Stopwatch m_Timer;

    public OperationTimer(object context)
    {
        m_Context = context;
        m_Timer = Stopwatch.StartNew();
    }
    public void Dispose()
    {
        //Write to log here using timer and context
    }
}
And the usage:
using(new OperationTimer("Load employees"))
{
    //execute query here
} 

Chain

Chain exposes a set of events at the data source level. The one needed in this case is DataSource.ExecutionFinished. Here is an example:
static void DefaultDispatcher_ExecutionFinished(object sender, ExecutionEventArgs e)
{
    Debug.WriteLine($"Execution finished: {e.ExecutionDetails.OperationName}. Duration: {e.Duration.Value.TotalSeconds.ToString("N3")} sec. Rows affected: {(e.RowsAffected != null ? e.RowsAffected.Value.ToString("N0") : "<NULL>")}.");
}
You can also attach a handler to DataSource.GlobalExecutionFinished, which listens to events from all data sources.

Entity Framework

The built-in logging capabilities of Entity Framework don’t make it possible to time individual queries. You can work around this limitation using a custom IDbCommandInterceptor.
public class EFLoggerForTesting : IDbCommandInterceptor
{
    static readonly ConcurrentDictionary<DbCommand, DateTime> m_StartTime = new ConcurrentDictionary<DbCommand, DateTime>();

    public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        Log(command, interceptionContext);
    }

    public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<int> interceptionContext)
    {
        Log(command, interceptionContext);
    }

    public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<object> interceptionContext)
    {
        Log(command, interceptionContext);
    }

    private static void Log<T>(DbCommand command, DbCommandInterceptionContext<T> interceptionContext)
    {
        DateTime startTime;
        TimeSpan duration;

        m_StartTime.TryRemove(command, out startTime);
        if (startTime != default(DateTime))
            duration = DateTime.Now - startTime;
        else
            duration = TimeSpan.Zero;

        //Populate this from your logging context (e.g. ThreadLocal/AsyncLocal)
        string requestId = null;

        var parameters = new StringBuilder();
        foreach (DbParameter param in command.Parameters)
        {
            parameters.AppendLine(param.ParameterName + " " + param.DbType + " = " + param.Value);
        }

        string message;
        if (interceptionContext.Exception == null)
        {
            message = string.Format("Database call took {0} sec. RequestId {1} \r\nCommand:\r\n{2}",
                duration.TotalSeconds.ToString("N3"), requestId, parameters.ToString() + command.CommandText);
        }
        else
        {
            message = string.Format("EF Database call failed after {0} sec. RequestId {1} \r\nCommand:\r\n{2}\r\nError:{3} ",
                duration.TotalSeconds.ToString("N3"), requestId, parameters.ToString() + command.CommandText, interceptionContext.Exception);
        }

        Debug.WriteLine(message);
    }


    public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<int> interceptionContext)
    {
        OnStart(command);
    }

    public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        OnStart(command);
    }

    public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<object> interceptionContext)
    {
        OnStart(command);
    }

    private static void OnStart(DbCommand command)
    {
        m_StartTime.TryAdd(command, DateTime.Now);
    }
}
This doesn’t offer a way to capture contextual data, but you can work around that limitation by storing the context in a ThreadLocal or AsyncLocal variable as appropriate.
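A minimal sketch of the AsyncLocal approach is shown below. The RequestContext class is illustrative, not part of EF; because AsyncLocal flows with the logical call context, a value set at the start of a web request remains visible inside the interceptor, even across awaits.

```csharp
using System.Threading;

public static class RequestContext
{
    // AsyncLocal values flow with the logical call context, so each
    // request sees its own value even when calls hop between threads.
    private static readonly AsyncLocal<string> s_RequestId = new AsyncLocal<string>();

    public static string RequestId
    {
        get { return s_RequestId.Value; }
        set { s_RequestId.Value = value; }
    }
}
```

Set RequestContext.RequestId once at the start of each request (e.g. in middleware or a message handler), then read it inside the interceptor’s Log method.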

Permission Checks – Table Level

While permission checks can be done at the application level, it is often advantageous to also enforce them in the repository. This eliminates the possibility the permission check is forgotten on newly created screens/pages.

Repository Enforcement

The simplest way to implement this is a role check at the beginning of each relevant function. For example:
public int Insert(Employee employee)
{
    if (!m_User.IsAdmin)
        throw new SecurityException("Only admins may add employees");

    //...normal insert logic...
}

Database Enforced

A more sophisticated option would involve creating multiple connection strings. When the repository is created, a connection string is selected based on the user’s role. In this case, the connection string for non-admin users wouldn’t have INSERT privileges on the employee table.
Due to the complexity and maintenance headaches involved, this approach is not recommended except under very high security environments where multiple layers of defense are desired. Even then, it requires extensive automated testing to ensure every connection string has all the permissions it needs.
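If you do go down this road, the selection logic itself is simple; it is the care and feeding of the extra connection strings that is expensive. A hypothetical constructor (the connection string names and User class are illustrative):

```csharp
public EmployeeRepository(User user)
{
    m_User = user;

    // Pick the connection string that matches the user's role. The database
    // itself enforces the actual table-level permissions for each login.
    var connectionStringName = user.IsAdmin ? "HrDatabase.Admin" : "HrDatabase.ReadWrite";
    m_ConnectionString = ConfigurationManager
        .ConnectionStrings[connectionStringName].ConnectionString;
}
```

Note that every new table or permission change must now be applied to each database login, which is why the automated testing mentioned above is essential.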

Permission Checks – Column Level

Permission checks may also be needed at the column level. For example, you may want to prevent users from giving themselves admin privileges. Or you may want to block non-managers from seeing employee salaries.

Chain

Chain leverages its audit rules capabilities to implement column level permission checks. An anonymous function is passed to the RestrictColumn constructor along with the column name and list of restricted operations. (A table name can be optionally specified as well.)
Func<object, bool> IsAdminCheck = user => ((User)user).IsAdmin;

dataSource = dataSource.WithRules(
    new RestrictColumn("Users", "IsAdmin", OperationTypes.Insert|OperationTypes.Update, IsAdminCheck));
To prevent reading of a restricted column, pass in the OperationTypes.Select flag.

Dapper

The easiest way to do this in Dapper is to simply have multiple SQL statements. If the user lacks a specific privilege, you simply select the SQL statement that omits those columns.
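For example, a repository method might choose between two hand-written SQL statements based on the user’s role. This sketch assumes Dapper’s QuerySingle extension method; the Employee columns and m_User/m_ConnectionString fields are illustrative:

```csharp
public Employee Get(int employeeKey)
{
    // Managers see salary information; everyone else gets a projection without it.
    var sql = m_User.IsManager
        ? "SELECT EmployeeKey, FirstName, LastName, Salary FROM HR.Employee WHERE EmployeeKey = @employeeKey"
        : "SELECT EmployeeKey, FirstName, LastName FROM HR.Employee WHERE EmployeeKey = @employeeKey";

    using (var con = new SqlConnection(m_ConnectionString))
    {
        return con.QuerySingle<Employee>(sql, new { employeeKey });
    }
}
```

Since the restricted columns are never selected, they cannot leak out through serialization or logging.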

Entity Framework

For queries, there are a couple of options available.
  1. Manually create different projections (i.e. Select clauses) depending on the user’s role
  2. Perform the query normally. Then if the permission check fails, loop through the result set, setting the restricted properties to null/0.
For inserts, simply blank out the restricted properties as above.
Updates are trickier. When restricting writes to individual columns, you cannot attach entities. Rather, you need to re-fetch the original record, copy across the permitted values, and then save that object instead of the one passed in by the application code. (Essentially our “novice” pattern from the previous article.)
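That re-fetch pattern can be sketched as follows. The HrContext class, the Salary column, and the m_User field are illustrative assumptions, not part of EF itself:

```csharp
public void Update(Employee employee)
{
    using (var context = new HrContext())
    {
        // Re-fetch the original record, then copy across only the permitted values.
        var original = context.Employees.Single(e => e.EmployeeKey == employee.EmployeeKey);
        original.FirstName = employee.FirstName;
        original.LastName = employee.LastName;

        // Restricted column: only managers may change salaries.
        if (m_User.IsManager)
            original.Salary = employee.Salary;

        // The change tracker generates an UPDATE for only the modified columns.
        context.SaveChanges();
    }
}
```

The cost is an extra round trip per update, which is why this is essentially the “novice” pattern from the previous article.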

Mapping One Model to Multiple Tables

An important concept in data architecture is the idea that you don’t need a one-to-one mapping between tables and classes. In order to make the database work more efficiently or to address a particular business rule, you will often find it advantageous to map one class to multiple tables.
Say, for example, you were tracking baseball teams. You may have these tables:
Table         | Primary Key
------------- | -------------------
Team          | TeamKey
TeamSeasonMap | TeamKey + SeasonKey

If your application only understands the concept of a team in the context of a season, you may have one Team object that covers both tables.

Chain

In Chain, there isn’t a strong relationship between classes and tables. This means for updates you would write code such as this:
dataSource.Update("Team", myTeam).Execute();
dataSource.Update("TeamSeasonMap", myTeam).Execute();
At runtime it will determine which properties are applicable to which tables and generate the SQL accordingly.
Under this model, you would fetch the Team object from a view that joined both tables. (Chain doesn’t support joins directly and assumes they will always occur via views.)

Entity Framework

Entity Framework expects that multiple tables mapping to a single entity (entity splitting) all share exactly the same primary key. Since Team and TeamSeasonMap have different primary keys, EF cannot support this scenario directly.
  • For reads, you can perform the join and projection using EF’s normal LINQ syntax.
  • For updates, you will need to copy the model into a separate entity for each table.
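The update workaround can be sketched as below. The BaseballContext class and the Name/Division properties are illustrative; the point is that the combined Team model is copied into one tracked entity per underlying table:

```csharp
public void Update(Team team)
{
    using (var context = new BaseballContext())
    {
        // Copy the combined model into the entity backed by the Team table.
        var teamRow = context.Teams.Single(t => t.TeamKey == team.TeamKey);
        teamRow.Name = team.Name;

        // And into the entity backed by the TeamSeasonMap table.
        var mapRow = context.TeamSeasonMaps.Single(
            m => m.TeamKey == team.TeamKey && m.SeasonKey == team.SeasonKey);
        mapRow.Division = team.Division;

        // SaveChanges writes both tables within a single transaction.
        context.SaveChanges();
    }
}
```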

Caching

Generally speaking, caching is a repository concern. Since the repository knows when data is being altered, it is the best equipped to handle cache invalidation.

Chain

Caching is supported by Chain, but it needs to be applied on a query-by-query basis using appenders. Appenders are attached to operations before they are executed. In this case there are four appenders we care about:
  • .Cache(...)
  • .CacheAllItems(...)
  • .InvalidateCache(...)
  • .ReadOrCache(...)
They are best explained by means of an example repository. Here you can see the interplay between caching individual records and caching collections using `CacheAllItems`.
public class EmployeeCachingRepository
{

    private const string TableName = "HR.Employee";
    private const string AllCacheKey = "HR.Employee ALL";

    public IClass1DataSource Source { get; private set; }
    public CachePolicy Policy { get; private set; }

    public EmployeeCachingRepository(IClass1DataSource source, CachePolicy policy = null)
    {
        Source = source;
        Policy = policy;
    }

    protected string CacheKey(int id)
    {
        return $"HR.Employee EmployeeKey={id}";
    }

    protected string CacheKey(Employee entity)
    {
        return CacheKey(entity.EmployeeKey.Value);
    }

    public Employee Get(int id)
    {
        return Source.GetByKey(TableName, id).ToObject<Employee>().ReadOrCache(CacheKey(id), policy: Policy).Execute();
    }

    public IList<Employee> GetAll()
    {
        return Source.From(TableName).ToCollection<Employee>().CacheAllItems((Employee x) => CacheKey(x), policy: Policy).ReadOrCache(AllCacheKey, policy: Policy).Execute();
    }

    public Employee Insert(Employee entity)
    {
        return Source.Insert(TableName, entity).ToObject<Employee>().InvalidateCache(AllCacheKey).Cache((Employee x) => CacheKey(x), policy: Policy).Execute();
    }

    public Employee Update(Employee entity)
    {
        return Source.Update(TableName, entity).ToObject<Employee>().Cache(CacheKey(entity)).InvalidateCache(AllCacheKey).Execute();
    }

    public void Delete(int id)
    {
        Source.DeleteByKey(TableName, id).InvalidateCache(CacheKey(id)).InvalidateCache(AllCacheKey).Execute();
    }
}
As you can see, Chain gives you a lot of control over your invalidation logic at the cost of having to carefully specify everything.

Entity Framework

Entity Framework has two levels of caching. The first level is limited to the data context and is primarily concerned with ensuring the object graph doesn’t have duplicate entities that represent the same physical database record. Since this cache is destroyed along with the data context, it is not relevant to most caching scenarios.
In EF terminology, what we’re looking for is known as a “second level cache”. While available in EF 5, version 6 of Entity Framework didn’t ship with any sort of caching out of the box. For this, you’ll need to turn to third-party libraries such as EntityFramework.Cache or EFSecondLevelCache. As you can see from these example libraries, there is no standard pattern for adding second level caching to EF.
