Interview with Entity Modelling Tool Creator, Frans Bouma

Key Takeaways

  • 3rd party ORMs on .NET have been driven into a niche market and most of them lost that battle. If you compare the list of ORMs from, say, 2008, vs. today you'll see there aren't many left. 
  • Doing the change tracking inside the entity itself has many advantages, one being that you can have a stand-alone unit of work object.
  • Something every developer of an ORM with a LINQ provider has found out: with a LINQ provider you're never done. There are always issues popping up due to e.g. unexpected expressions in the tree.
  • Stored procedures can be very helpful in application code when you have to consume large sets of data in a piece of code: you don't have to first transport the data out of the database.
  • For basic CRUD, stored procedures can be a maintenance problem leading to an explosion of stored procedures.

Our first .NET interview of the year is with Frans Bouma of LLBLGen Pro. This tool has been around for almost as long as .NET itself, but being a commercial product it isn’t as well-known as the free alternatives.

InfoQ: LLBLGen Pro isn't just an ORM, it is an "entity modeling solution". What does that actually mean?

Frans Bouma: From the start LLBLGen Pro has consisted of two parts: the LLBLGen Pro Designer and the LLBLGen Pro Runtime Framework. You model your (abstract) entity model in the Designer, which converts that model into source code on one side and a relational database schema on the other side. The runtime framework fits between source and relational database schema and does what every ORM does: convert instances from one side into instances of the other side.

The two parts together form a convenient system to get from an abstract entity model to working source code, usable to access and work with a relational database.

InfoQ: Why did you design LLBLGen Pro to use not only its own ORM, but the ORMs from other vendors such as EF and NHibernate?

Frans Bouma: A couple of reasons, really. Our ORM, the LLBLGen Pro runtime framework, isn't a POCO framework: its entity classes inherit from base classes defined in the runtime. While this has many advantages, some people want to work with POCO classes and not have their entity classes derive from classes they don't own. Because of this, they wouldn't use our ORM. By opening our designer to other ORMs, we let the users of these frameworks model their entity models in the designer instead of doing it all by hand. If we had kept our designer solely for our own ORM, we'd have missed these customers.

Another reason was that for NHibernate and also Entity Framework there weren't really any designer tools worth using, while many users of these ORMs want to use a designer. By offering support for these frameworks in our designer, we let them use a full-featured modeling system and still use the ORM they want (or have) to use.

The elephant-in-the-room reason, of course, is that Microsoft pushes Entity Framework as 'the' framework to use and makes it look like there's no other ORM around. This has made EF the most widely used ORM on .NET. The side effect has been that 3rd party ORMs on .NET have been driven into a niche market, and most of them lost that battle. If you compare the list of ORMs from, say, 2008 with today's, you'll see there aren't many left. With support for EF, and also NHibernate and Linq to SQL, we can still exist as an ISV while competing with free ORMs, two of which are installed on every developer's machine.

InfoQ: The next version of Entity Framework, EF Core, has completely abandoned its designer in favor of only supporting code-first entities. A big reason for this was that the massive XML file it used didn’t play well with source control, especially when merging branches. How does the LLBLGen designer deal with these issues?

Frans Bouma: Merge conflicts in mapping files are actually the same as in a code-first code base; however, they tend to look scarier because the conflicts are in XML and not in a C# file. Merging conflicting edits to the same C# file tends to be easier because we understand the C# code in both edits and can reasonably judge which side to pick. In the XML of an EDMX file this is much harder, as things are tied to each other with complex-sounding elements whose purpose the reader doesn't know. For our project format I tried to keep things as simple and as straightforward as possible: if you read the XML of the project file, you can understand what each element does and what it's for. Furthermore, the ordering and naming of the model elements themselves are done in such a way that conflicts are kept to a minimum.

InfoQ: How does LLBLGen Pro Runtime Framework differ from other ORMs such as Entity Framework and NHibernate?

Frans Bouma: Every ORM has its unique set of features and a set of common features. One of the most prominent differences between LLBLGen Pro Runtime Framework and all the others is that it does the change tracking inside the entity class instances and therefore doesn't need a central context or session object (the old Scott Ambler design of an ORM). Doing the change tracking inside the entity itself has many advantages, one being that you can have a stand-alone unit of work object. You track work and changes to the in-memory entity graph with this stand-alone unit of work object, which you can then pass to the persistence core. The persistence core will have no problem determining what you want: there's no ambiguity about whether these entities are new, updated or to be deleted, because that information is inside the unit of work and the entities.

This design offers a friction-free decoupling of entity classes and persistence, which can't be provided with the Ambler model of an ORM: the central session/context class always has to be told what to do with a graph of entity class instances. A good example is the add/attach API mess in Entity Framework.
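To make the idea of in-entity change tracking concrete, here is a minimal C# sketch. It is only an illustration under stated assumptions: `TrackedCustomer`, `EntityState` and `SimpleUnitOfWork` are hypothetical names invented for this example and are not LLBLGen Pro classes. The point it shows is that when each entity records its own state, a stand-alone unit of work can work out which action applies without a central context being told.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: these types are NOT LLBLGen Pro's classes.
public enum EntityState { New, Unchanged, Changed }

public class TrackedCustomer
{
    private readonly HashSet<string> _changedFields = new HashSet<string>();
    private string _name;

    public int Id { get; private set; }
    public EntityState State { get; private set; }

    // isNew is true for an entity created in memory, false for one loaded from storage.
    public TrackedCustomer(int id, string name, bool isNew)
    {
        Id = id;
        _name = name;
        State = isNew ? EntityState.New : EntityState.Unchanged;
    }

    public string Name
    {
        get { return _name; }
        set
        {
            if (_name == value) return;            // same value: nothing to record
            _name = value;
            _changedFields.Add("Name");            // the entity remembers what changed
            if (State == EntityState.Unchanged) State = EntityState.Changed;
        }
    }

    public IEnumerable<string> ChangedFields { get { return _changedFields; } }
}

public class SimpleUnitOfWork
{
    private readonly List<TrackedCustomer> _toSave = new List<TrackedCustomer>();
    private readonly List<TrackedCustomer> _toDelete = new List<TrackedCustomer>();

    public void AddForSave(TrackedCustomer c) { _toSave.Add(c); }
    public void AddForDelete(TrackedCustomer c) { _toDelete.Add(c); }

    // Derives the planned actions purely from state carried by the entities.
    public List<string> Plan()
    {
        var actions = new List<string>();
        foreach (var c in _toSave)
        {
            if (c.State == EntityState.New)
                actions.Add("INSERT " + c.Id);
            else if (c.State == EntityState.Changed)
                actions.Add("UPDATE " + c.Id + " (" + string.Join(", ", c.ChangedFields) + ")");
            // Unchanged entities are skipped entirely.
        }
        foreach (var c in _toDelete) actions.Add("DELETE " + c.Id);
        return actions;
    }
}

public static class Demo
{
    public static void Main()
    {
        var existing = new TrackedCustomer(1, "Ann", false);
        existing.Name = "Anna";                    // flips Unchanged to Changed
        var uow = new SimpleUnitOfWork();
        uow.AddForSave(existing);
        uow.AddForSave(new TrackedCustomer(2, "Bob", true));
        uow.AddForDelete(new TrackedCustomer(3, "Cy", false));
        foreach (var action in uow.Plan()) Console.WriteLine(action);
    }
}
```

In this sketch the unit of work never asks the caller whether an entity should be added or attached; it reads that from the entity, which is the decoupling described above.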

Features aside, it's over 10 times faster than EF and NHibernate in entity scenarios which is also nice to have ;)

InfoQ: EF’s add/attach API is definitely a problem, especially since we’ve switched from rich clients that can hold a context open to REST style services where the objects are recreated from incoming JSON. Can you show a brief example of performing an insert/update in a service environment?

Frans Bouma: As change tracking is done inside the entities, this data is serialized/deserialized with them across the wire if you use them in a .NET client. This means that when you receive the entities back, you can simply persist the graph: new entities will be inserted, updated ones will be updated. No micro-management required. Here's an example of a method receiving a customer entity that might refer to order entities in its 'Orders' collection, which might contain references to other entities as well:

public void StoreChanges(CustomerEntity myCustomer)
{
   using(var adapter = new DataAccessAdapter())
   {
      adapter.SaveEntity(myCustomer);
   }
}

That's it.

Of course, you can also use a unit of work on the client, and send that over:

public void StoreChanges(UnitOfWork2 uow)
{
   using(var adapter = new DataAccessAdapter())
   {
      uow.Commit(adapter);
   }
}

This second example separates persistence from the unit of work management one wants in the application itself. The '2' suffix might be something people will raise an eyebrow over. Our framework supports two paradigms: selfservicing (which has the persistence methods on the entities themselves, like myCustomer.Save(), and supports lazy loading) and adapter (of which you see an example above). Adapter was introduced after selfservicing, and at the time I didn't want to tie the different interfaces and classes for Adapter to the name 'Adapter', so I used a numeric suffix. It is a decision I would make differently today, but alas, once released you stick with it to avoid putting massive breaking changes on the shoulders of your customers. So I kept it.

InfoQ: An increasingly important part of security is auditing. Where in the past it was enough to simply record who last touched a record, these days we often need a full history of changes and even a chronicle of who has been reading the data. Would you care to talk about how LLBLGen Pro handles authorization and auditing?

Frans Bouma: LLBLGen Pro Runtime Framework has built-in auditing and authorization support. This is done through auditor and authorizer classes, instances of which can be injected automatically into an entity instance at runtime. Alternatively, if you want to do it by overriding some methods instead, you can opt for that. The auditor object in an entity gets called by the framework on various actions, such as when a field is read or written, or when an entity is saved or deleted. The developer is free to do whatever is needed at that point, including creating new entities, e.g. entities mapped to tables for audit data. At the end of a transaction, the participating entities are asked whether they have entities to persist for audit information, which is the point where these newly created entities are saved.

Authorization works in the same way: the authorizer object in the entity gets a call from the framework when a given action needs authorization, e.g. a field is read or an entity is about to be materialized. The developer again is free in how to authorize the request, and simply has to return a boolean value which indicates the outcome of the authorization. If the authorization failed, the data read is seen as 'null' or 'void'.

This all works transparently to the developer: the authorizer and auditor classes (as well as validators) are written once, outside the entity classes, and can be injected through the built-in dependency injection feature, by other DI frameworks, or by overriding a method. When using the entity classes you won't have to worry about auditing or authorization on the data level at all.
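The hook pattern described above can be sketched in a few lines of C#. This is a simplified illustration, not the LLBLGen Pro API: `IAuditor`, `IFieldAuthorizer`, `EmployeeSketch` and the other names are assumptions made up for this example. It shows an entity that calls an injected auditor on field writes and asks an injected authorizer before field reads, returning null when the read is denied.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interfaces for illustration; LLBLGen Pro's real types differ.
public interface IAuditor
{
    void FieldWritten(string fieldName, object oldValue, object newValue);
}

public interface IFieldAuthorizer
{
    bool CanRead(string fieldName);   // true = allowed, false = denied
}

// Collects audit records in memory; a real auditor might create audit entities instead.
public class ListAuditor : IAuditor
{
    private readonly List<string> _records = new List<string>();
    public List<string> Records { get { return _records; } }

    public void FieldWritten(string fieldName, object oldValue, object newValue)
    {
        _records.Add(fieldName + ": " + oldValue + " -> " + newValue);
    }
}

// Denies reads of one sensitive field.
public class DenySalaryAuthorizer : IFieldAuthorizer
{
    public bool CanRead(string fieldName) { return fieldName != "Salary"; }
}

public class EmployeeSketch
{
    private readonly Dictionary<string, object> _values = new Dictionary<string, object>();

    // Written once outside the entity and injected, e.g. by a DI container.
    public IAuditor Auditor { get; set; }
    public IFieldAuthorizer Authorizer { get; set; }

    public void SetField(string name, object value)
    {
        object old;
        _values.TryGetValue(name, out old);          // old stays null if the field was unset
        _values[name] = value;
        if (Auditor != null) Auditor.FieldWritten(name, old, value);
    }

    public object GetField(string name)
    {
        // A denied read surfaces as null, mirroring the behaviour described in the text.
        if (Authorizer != null && !Authorizer.CanRead(name)) return null;
        object value;
        return _values.TryGetValue(name, out value) ? value : null;
    }
}

public static class Demo
{
    public static void Main()
    {
        var auditor = new ListAuditor();
        var employee = new EmployeeSketch { Auditor = auditor, Authorizer = new DenySalaryAuthorizer() };
        employee.SetField("Name", "Ann");
        employee.SetField("Salary", 5000);
        Console.WriteLine(employee.GetField("Name"));
        Console.WriteLine(employee.GetField("Salary") == null ? "(denied)" : "(visible)");
        Console.WriteLine(auditor.Records.Count + " audit records");
    }
}
```

The calling code only sets and gets fields; auditing and authorization happen behind those calls, which is the transparency the answer refers to.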

InfoQ: Does the LLBLGen designer or ORM include support for generating database tables?

Frans Bouma: The LLBLGen Pro Designer is of course capable of creating table DDL SQL for you: it's a key part of the model-first functionality. You model your entity model and then synchronize it with the relational model data in your project using a single button click. The designer will then update your relational model data based on the changes in the entity model. After that you can generate a DDL SQL update script to update your real database schema with the changes made.

It is done this way, using DDL SQL scripts, because in the real world where real DBAs rule over the database servers, migrations are tested first, using scripts. It might not be a big deal with a new project started from scratch, but once things are in production (the majority of the lifetime of an application!), migrations are carefully tested before being rolled out. Having changes in DDL SQL script form is ideal in those situations. This is also the reason why the runtime doesn't create tables for you: there's no need. You can generate a DDL SQL create script from the designer if you need to start from scratch. However, in general there are other elements that are part of the database schema too, which aren't creatable from a model: views, table-valued functions, stored procedures. So a fresh install of an app often starts with the DDL SQL create script created from the full schema with views and the other artifacts, created from e.g. SSMS.

InfoQ: One of the features unique to LLBLGen Pro Runtime Framework is QuerySpec. What prompted you to create this alternative to normal LINQ expression trees?

Frans Bouma: I spent most of 2008 writing a full LINQ provider for our runtime framework and the end result, I thought, was OK. However, as every developer of an ORM with a LINQ provider has found out: with a LINQ provider you're never done. There are always issues popping up due to e.g. unexpected expressions in the tree. The main reason is that the expression tree acts like an AST but there's no formal grammar it originates from. So there's no defined set of rules describing how input in a formal grammar is transformed into an AST, which you could then rewrite with visitors into the desired output; there's just an expression tree which means 'something'.

An additional problem with LINQ is that it doesn't map 1:1 to SQL. In some areas it does but in other areas you have to interpret the expression tree and transform it into SQL matching the intent of what the developer tries to achieve. The overall complexity of interpreting expression trees is overwhelming, which makes a LINQ provider a very complex system. Couple that with the infinite set of input trees and you get a piece of software that is very hard to get bug-free.

From a user's perspective LINQ is also complex once you get past the 'from x in name' queries: the user has a hard time figuring out what the SQL will look like (e.g. will it perform or do the right thing?) or the other way around: they know the SQL they want to execute but have a hard time converting that back to LINQ.

These reasons made me realize we needed an alternative query system. Our runtime already had one, our low-level API which has been the query API since v1.0 in 2003, but it's quite verbose. With extension methods I built a fluent query system which stays very close to SQL's flow and at the same time is easier to deal with at runtime. For the user, it solves the problem of how to write the query if they know the SQL (it maps 1:1 to SQL, with some higher-level methods that are optional, like an Any() call). For us, it solved the problem of having to deal with unexpected trees, as the system is straightforward. It took less than 2 months to write.

InfoQ: Can you give some examples of things one can do with QuerySpec that aren’t available in, say, EF’s LINQ provider?

Frans Bouma: One of the things that's not possible in LINQ is to provide a specific ON clause for a join: in LINQ the joins are always 'a=b' predicates. If you want to do an 'A JOIN B ON A.F > B.F' join clause, you can't. In QuerySpec you can: the On clause is simply a predicate expression, so it can be whatever you want. Additionally, you can do full joins, right joins and left joins without hard-to-grasp constructs.
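The equality-only limitation is visible in plain C#, independent of any ORM. The sketch below runs against in-memory arrays with LINQ to Objects; `JoinSketch` and its method names are made up for this illustration. The `join` clause requires the `equals` keyword, so a 'greater than' condition has to be written as a cross join filtered by `where`. Whether a given LINQ provider turns that second form into a `JOIN ... ON A.F > B.F` in SQL is provider-specific and not something this example demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Equi-join: the only shape the C# 'join' clause accepts ('equals' is mandatory).
    public static List<string> EquiJoin(int[] a, int[] b)
    {
        return (from x in a
                join y in b on x equals y
                select x + "=" + y).ToList();
    }

    // Non-equality condition: expressed as a cross join plus a 'where' filter,
    // because 'join y in b on x > y' does not compile.
    public static List<string> GreaterThanPairs(int[] a, int[] b)
    {
        return (from x in a
                from y in b
                where x > y
                select x + ">" + y).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var a = new[] { 1, 2, 3 };
        var b = new[] { 2, 3 };
        Console.WriteLine(string.Join(", ", JoinSketch.EquiJoin(a, b)));
        Console.WriteLine(string.Join(", ", JoinSketch.GreaterThanPairs(a, b)));
    }
}
```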

QuerySpec is still a high-level query API, so it's not like jOOQ where every SQL keyword is made available to you in .NET: it abstracts away the SQL statement details without the scoping and paradigm shift problems.

InfoQ: What are your thoughts on stored procedures versus generating queries solely in application code?

Frans Bouma: Now there's a can of worms if I ever saw one! More than 10 years ago, raging debates were held about the pros and cons of stored procedures, and as a developer of an ORM I of course participated in these debates, saying stored procedures are overrated and shouldn't be used. That's of course a bit of a blanket statement which fit the debate back then, but is too black-and-white to be real advice. Stored procedures can be very helpful in application code, namely when you have to consume large sets of data in a piece of code. Running that code as close to the data as possible gives a huge advantage: you don't have to first transport the data out of the database into a different realm, but can run the code directly on the data itself. Additionally, SQL, as a set-oriented language, is a better fit for expressing set-oriented logic than a bunch of imperative statements in C#.

That said, for basic CRUD, like insert X, update X, delete X, stored procedures can be a maintenance problem, as any action required by the application code has to be implemented in the stored procedure API: making changes to that API is difficult, as the stored procedure might be called by other code as well. This often leads to an explosion of stored procedures: the stored procedure needing the change is simply copied and altered, avoiding the breaking change for whatever code was using the original.

The dynamic nature of updates is another reason why stored procedures tend to be less ideal than SQL that has been generated on-the-fly: which fields are updated is only known at runtime, so a generated SQL statement that updates only those fields is more efficient than a generic stored procedure that simply updates every field. Graph fetches and eager loading are also hard to do with stored procedures, as these too are dynamic in nature.
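As a rough illustration of why runtime-generated SQL suits updates, the following C# sketch builds a parameterized UPDATE statement that touches only the columns reported as changed. `UpdateSqlSketch.BuildUpdate` is a hypothetical helper written for this example, not how LLBLGen Pro or any other ORM generates SQL, and it assumes the table and column names come from trusted mapping metadata (values would be passed as parameters, never concatenated).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UpdateSqlSketch
{
    // Builds an UPDATE that sets only the changed columns.
    // Returns null when nothing changed, so no statement needs to be executed.
    public static string BuildUpdate(string table, string keyColumn, IEnumerable<string> changedColumns)
    {
        var columns = changedColumns.ToList();
        if (columns.Count == 0) return null;

        // One "[Column] = @Column" assignment per changed column.
        var setClause = string.Join(", ", columns.Select(c => "[" + c + "] = @" + c));
        return "UPDATE [" + table + "] SET " + setClause
             + " WHERE [" + keyColumn + "] = @" + keyColumn;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Only 'Name' changed, so only 'Name' appears in the SET clause.
        Console.WriteLine(UpdateSqlSketch.BuildUpdate("Customer", "CustomerId", new[] { "Name" }));
        // Two changed fields produce a different statement at runtime.
        Console.WriteLine(UpdateSqlSketch.BuildUpdate("Customer", "CustomerId", new[] { "Name", "City" }));
    }
}
```

A generic update stored procedure would need a fixed parameter list covering every column, or one procedure per combination of columns, which is the maintenance problem described earlier.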

The LLBLGen Pro designer happily lets you include stored procedures in your model which are then generated as methods you can call. This works regardless of whether you work model first or database first. So you can model your entity model using the model first -> tables functionality and sync the stored procedures with the schema in your database at the same time.

InfoQ: Have you given any thought to support NoSQL style databases with the LLBLGen designer?

Frans Bouma: Yes, since v5 we have support for 'Derived Models', which are models defined on top of your entity model and which consist of derived elements you can see as Documents or DTOs. These derived elements are hierarchical, are derived from an entity, and can contain denormalized fields from e.g. related derived elements, similar to how one would design the documents for a document NoSQL database.

The designer generates .NET code for these derived models which can then be used with RavenDB, MongoDB or any other document database, like Microsoft's DocumentDB. The idea behind this is that the document models used in NoSQL databases are actually derived from abstract entity models. By defining them as such in the LLBLGen Pro designer, the changes made to the entity model flow through to the derived models defined on top of it. As a result, there's a theoretical basis for the document definitions used at runtime with the document databases: something that's not present elsewhere, as there's no schema (besides the POCO serialized into JSON).
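A hand-written C# sketch can show what 'a document derived from an entity, with denormalized fields' might look like. It does not reproduce the code the LLBLGen Pro designer generates; `Customer`, `Order`, `OrderDocument` and `DerivedModelSketch` are hypothetical names chosen for this example. The document is hierarchical (it embeds its lines) and carries the customer's company name as a denormalized field, so it can be stored and read without a join.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Entity model (normalized): an order references its customer and owns its lines.
public class Customer { public int Id; public string CompanyName; }
public class OrderLine { public string Product; public int Quantity; }
public class Order
{
    public int Id;
    public Customer Customer;
    public List<OrderLine> Lines = new List<OrderLine>();
}

// Derived element (document/DTO): hierarchical, with a denormalized customer field.
public class OrderLineDocument { public string Product; public int Quantity; }
public class OrderDocument
{
    public int OrderId;
    public string CustomerCompanyName;   // denormalized from the related Customer entity
    public List<OrderLineDocument> Lines;
}

public static class DerivedModelSketch
{
    // Projects an entity graph onto the derived document shape.
    public static OrderDocument ToDocument(Order order)
    {
        return new OrderDocument
        {
            OrderId = order.Id,
            CustomerCompanyName = order.Customer.CompanyName,
            Lines = order.Lines
                         .Select(l => new OrderLineDocument { Product = l.Product, Quantity = l.Quantity })
                         .ToList()
        };
    }
}

public static class Demo
{
    public static void Main()
    {
        var order = new Order { Id = 10, Customer = new Customer { Id = 1, CompanyName = "Acme" } };
        order.Lines.Add(new OrderLine { Product = "Widget", Quantity = 3 });

        var doc = DerivedModelSketch.ToDocument(order);
        Console.WriteLine(doc.OrderId + " / " + doc.CustomerCompanyName + " / " + doc.Lines.Count + " line(s)");
    }
}
```

Because the document type is defined in terms of the entity model, a renamed or removed entity field shows up as a compile error in the projection rather than as a silent mismatch in stored documents.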

At the moment the entity model still requires a relational mapping but this will change in the near future.

About the Interviewee

Frans Bouma is the creator and lead developer of LLBLGen Pro, a leading ORM and entity modeling solution for .NET. He has been a professional software engineer for over 22 years. Since 2003 he has worked full time on LLBLGen Pro and another project of his company Solutions Design: ORM Profiler. Before that he worked on various projects using a wide range of technologies, spanning from VB to C++. In his spare time he likes to take in-game screenshots and program the necessary tools for that: from shaders using HLSL to cinematic tools using C++ and x86/x64 assembler.
