When database administrators think of high-performance data loading, they think of bulk operations, a feature noticeably lacking in Entity Framework. But that doesn’t have to be the case. We recently spoke with Jonathan Magnan of ZZZ Projects about their new offerings.
InfoQ: Developers can already tell Entity Framework to upload a bunch of records all at once. So why are the Bulk Operations needed at all?
Jonathan Magnan: Simple: for HUGE performance gains.
Imagine asking someone to give you a book one page at a time (Entity Framework) instead of giving you the whole book (Bulk Operations). One technique is obviously way faster than the other: Bulk Operations outperforms Entity Framework by far.
ZZZ Projects offers two kinds of bulk operations via the Entity Framework Extensions Library. Both drastically improve performance over Entity Framework’s SaveChanges method.
BulkSaveChanges
The first is our main feature, the BulkSaveChanges method, which upgrades the SaveChanges method. When saving thousands of entities, you can expect it to be at least 10 to 15 times faster. This method supports all kinds of associations and entity types (TPC, TPH, and TPT).
Bulk Operations
The second is the Bulk Operations methods (BulkDelete, BulkInsert, BulkUpdate and BulkMerge), which increase performance even further and allow many settings to be customized, such as which primary key to use.
InfoQ: In terms of the underlying SQL, how do Bulk Operations differ from normal Entity Framework operations?
Jonathan: Entity Framework performs a round trip to the database for every record it saves. If you have 1,000 entities to update, you will have 1,000 round trips to the database, each executing an update statement, and you’ll sit there for a few seconds. With the Entity Framework Extensions Library, on the other hand, it’ll be done in the blink of an eye.
We can summarize the standard workflow for SQL Server as follows:
- Create a temporary table in SQL Server.
- Bulk Insert data with .NET SqlBulkCopy into the temporary table.
- Perform a SQL statement between the temporary table and the destination table.
- Drop the temporary table from the SQL Server.
The number of hits to the database is drastically reduced.
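The four steps above can be sketched outside of .NET. The following is a minimal illustration, not the library's implementation: it uses Python's built-in sqlite3 module in place of SQL Server, and a batched executemany call stands in for SqlBulkCopy. The table names (products, staging), the price column, and both function names are hypothetical.

```python
import sqlite3


def bulk_update_prices(conn, rows):
    """Update many prices through a staging table instead of row by row.

    rows is an iterable of (id, new_price) pairs. All names are
    hypothetical; SQLite stands in for SQL Server.
    """
    cur = conn.cursor()
    # 1. Create a temporary table.
    cur.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, price REAL)")
    # 2. Load every row into it in one batch (stand-in for SqlBulkCopy).
    cur.executemany("INSERT INTO staging (id, price) VALUES (?, ?)", rows)
    # 3. Run a single set-based statement between staging and destination.
    cur.execute(
        """
        UPDATE products
        SET price = (SELECT s.price FROM staging AS s WHERE s.id = products.id)
        WHERE id IN (SELECT id FROM staging)
        """
    )
    updated = cur.rowcount
    # 4. Drop the temporary table.
    cur.execute("DROP TABLE staging")
    conn.commit()
    return updated


def make_demo_db(n):
    """Create an in-memory database with n products priced at 1.0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")
    conn.executemany(
        "INSERT INTO products (id, price) VALUES (?, ?)",
        ((i, 1.0) for i in range(1, n + 1)),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db(1000)
    changed = bulk_update_prices(conn, [(i, 2.5) for i in range(1, 501)])
    print("rows updated:", changed)
```

The point of the pattern is that the destination table is touched by one UPDATE statement regardless of how many rows are staged, rather than by one statement per entity.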
InfoQ: Do you have any benchmark comparing normal and Bulk Operations for Entity Framework?
Jonathan: The benchmark shows rounded numbers, since it is pretty obvious that Bulk Operations will always be much faster than performing many individual operations.
| Number of entities | SaveChanges | BulkSaveChanges | BulkOperations |
|---|---|---|---|
| 1,000 | 1,000ms | 90ms | 70ms |
| 2,000 | 2,000ms | 150ms | 110ms |
| 5,000 | 5,000ms | 350ms | 220ms |
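To make the rounded figures above easier to compare, the speedup over SaveChanges can be computed directly. This is only arithmetic on the numbers quoted in the interview, not a new measurement; the dictionary layout and the speedups function name are illustrative.

```python
# Rounded timings in milliseconds, copied from the benchmark figures above.
timings_ms = {
    1000: {"SaveChanges": 1000, "BulkSaveChanges": 90, "BulkOperations": 70},
    2000: {"SaveChanges": 2000, "BulkSaveChanges": 150, "BulkOperations": 110},
    5000: {"SaveChanges": 5000, "BulkSaveChanges": 350, "BulkOperations": 220},
}


def speedups(row):
    """Return how many times faster each bulk method is than SaveChanges."""
    base = row["SaveChanges"]
    return {
        name: round(base / ms, 1)
        for name, ms in row.items()
        if name != "SaveChanges"
    }


if __name__ == "__main__":
    for entities, row in timings_ms.items():
        print(entities, speedups(row))
```

On these figures, BulkSaveChanges comes out roughly 11 to 14 times faster than SaveChanges, and the Bulk Operations methods roughly 14 to 23 times faster, with the ratio growing as the number of entities increases.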
In a scenario with millions of records, Bulk Operations is the only viable solution. Compared to the performance you get with Entity Framework, it means hours or even days of time saved!
| Number of entities | BulkSaveChanges | BulkOperations |
|---|---|---|
| 100,000 | 7s | 4.5s |
| 1,000,000 | 75s | 45s |
| 10,000,000 | 750s | 450s |
In a real-world scenario, the performance gap between Entity Framework and Bulk Operations grows even larger as the number of columns, indexes and triggers and the server load increase.
Furthermore, speed is not the only important factor. Spamming your SQL Server with thousands of calls is never a good idea. Even if your own application isn’t suffering much from the slow saves, you may degrade the performance of other applications.
InfoQ: What are the downsides of using bulk operations? For example, is there a minimum number of rows needed to make it useful?
Jonathan: The downside is a second startup cost. Our library, like Entity Framework, needs to gather and cache information about your entities and their relations the first time you use it.
Even with one row, we do as well as Entity Framework. We change our saving strategy depending on the number of rows you need to save.
InfoQ: Do you plan on contributing your Entity Framework Bulk Operations to the main Entity Framework branch on CodePlex?
Jonathan: Unfortunately, no. SqlBulkCopy was created more than a decade ago, and the .NET Framework still doesn’t support bulk delete, update and merge. We’re offering these methods and many more features on our website, http://www.zzzprojects.com.
InfoQ: What are your thoughts about Microsoft turning over Entity Framework to the .NET Foundation?
Jonathan: I think it’s a great move by Microsoft, since the open source community is expanding very fast. As a programmer, it’s always interesting to see how a project evolves and how people at Microsoft write code. It allows people who frequently use Entity Framework to easily share their code, suggestions and ideas. It’s an innovative turn that we hope can only be positive for Microsoft and the .NET community.