Opinions about ORMs vary widely. Some see it as an essential tool for simplifying data access while others claim it greatly increases complexity while robbing applications of much needed performance. InfoQ has interviewed Sam Saffron and Rob Conery who have found a middle ground in what's known as a Micro ORM.
Sam Saffron is a developer for Stack Overflow and has been working with SQL Server since version 7.
Rob Conery is a former Microsoft employee and co-founder of Tekpub.
InfoQ: For high performance scenarios do you believe that it is better to hand-write the SQL? Or we should we be looking at ways to have the ORM optimize the SQL much like a compiler optimizes machine code?
Rob Conery: Absolutely. If you have a site like StackOverflow, every single point of optimization becomes incredibly important. This is assuming that you've already optimized the overall response time of the site as much as possible (using a CDN, GZip, etc). One of the top issues with ORMs is that they tend to mangle SQL calls routinely - and getting your ORM to write precisely what you need your database to read is a bit of an art. If the SQL is really important - why let an ORM write it? If you work directly with the SQL then you and your DBA can optimize the call to precisely represent what it is you need, when you need it. If you use an ORM it's likely you don't have a DBA or they'll be completely in the dark.
Sam Saffron: High performance is a wide question, you could have a "high performance" scenario where you are selecting a single record from a database using a trivial WHERE clause, abstractions can usually handle these scenarios fine (if they are designed properly).
However, stuff get much hairier when you need to leverage database vendor specific features, such as hints, ansi sql extensions and so forth. There is no ORM on the market that allows you the same fidelity as SQL for a specific DB.
Additionally, complex queries often get totally mangled and messed up by ORMs. Performance tweaking SQL is an art, I don't anticipate any ORM getting to the state it can consistently out perform an expert. Also "compiled by ORM" SQL is what DBA nightmares are made of.
InfoQ: With stored procedures the “N+1 selects problem” can be addressed by returning multiple result sets and having the data access logic handle the parent/child relationships. For those using your ORM, what recommendations do you have for handling this?
Rob: SPs are fine if you have the chops/personnel to maintain them - but having multiple result sets isn't the best way to handle the N+1 issue in my mind (you can also use a SQL call to do this if you like - an SP isn't necessarily needed). The reader will keep the connection open longer and for a high-perf site, that can spell tons of trouble. A better solution is to query what you need, when you need it. It used to be the case that a connection was rather expensive - that's not the case anymore since .NET 2.0. I don't want to make light of another connection, but if you balance your benchmarks with both solutions (multiple results vs. second query) you might be surprised at just how close they are. So to get back to your question - no I don't see this as a solution to N+1. The parent/child idea is an OO-focus; I say "put your head in the DB" and understand joins and how they work - and the data that comes back. It's likely you could just do a Left Outer and get the things back that you need.
Sam: Dapper has a very flexible Mult-Mapper, it allows you to map a single row returned from your db to multiple objects which have a parent child relationship. For example: you could select all orders with their owners in a single query and have it map to a Order which has an Owner property that is a User. The multi-mapper has null handling and allows for a very flexible way of splitting up rows.
Additionally we support passing in an list of numbers, strings and so forth, automatically converting it to a multi parameterized query. Eg. cnn.Query("select * from Posts where Ids in @Ids", new {Ids = new[]
{1,3,4}); // selects records 1,3 and 4 in one go.
Note, parameterized SQL supports a fairly high number of incoming params (on SQL Server the upper bound is around 3000)
We also have a grid reader, that allows you to process multiple recordsets returned from the db, which can reduce round trips.
On top of that I would recommend a good profiler like our mvc mini profiler http://code.google.com/p/mvc-mini-profiler/ , N+1 happens, you need to be able to catch it.
InfoQ: For libraries such as ORMs there is always a challengein balancing simplicity with features, too much in either direction makes thelibrary difficult to use. Where do you see your ORM compared to the others?
Rob: The "features" are the things that always end up killing developers and loading up your issue list. I'm aggressively pushing people to rethink the way they access the database and the amount of abstraction cruft they have to get through to just pull data out of the database. SQL is the best DSL for working with data - learn it and love it and do yourself a favor by not fluffing up the experience with features that will ultimately cause you a headache. Front-end developers learned that if they embraced HTML/CSS/Javascript they can create an incredibly compelling UI - why can't we do the same for a database? Also - we're at the point now with C# that we can make the tough stuff look very simple. Just because a tool like Massive simple doesn't mean it doesn't have a kick.
Sam: Dapper is only focused on the job of mapping parametrized SQL to objects efficiently. I intend support this task fairly comprehensively. Many other ORMs, do not solve this problem completely or efficiently. I see Dapper as a very good building block for fancier ORMs.
InfoQ: Long term are you looking more towards simplifying your ORM or would you rather work on increasing its feature set?
Rob: I've mused with adding a few features here and there - but honestly I like it the way it is. For instance - just the other day I was thinking it would be nice to return a typed result set - a "Query<T>" if you will. But I stopped myself, remembering that type conversion was SubSonic's number 1 issue - and still is. Guids, tinyints, bits, int64 and longs - it's a complete headache. There's a joy when working with dynamic languages - types are kicked to the curb and told to shut up and get out of the way - which removes a lot of friction. Less friction, more fun. This is usually the point where developers of statically-typed languages get snotty and say "well - those other languages are just toys". All you have to do is look at the work going on in those communities (Ruby/Python) to see that the toys, apparently, are leading the way when it comes to building highly-satisfying business sites.
So no - I'm not looking to add any features just yet. My main thinking is that "this is a single file" - if you want some features, layer them on in your own app and have a good time!
Sam: I would like to keep it lean in the core, there are a few features I still need to build (multi mapping for 3,4 objects and multi recordset support) however once those are done I do not see Dapper getting too many new features.
That said, I would like to build a vendor specific DSL for querying Sql Server that uses dapper as its query execution engine.
InfoQ: Some people are expressing concern that ORMs mix too much storage logic with the business logic. What are your thoughts about that claim?
Rob: ORMs mix too much everything with what should be a solved problem. To this very day no one is sure where to put their "business logic". Does it go into the model? Do I pop my model into its own project? A service layer? Controllers? If you sit down a group of .NET devs and ask this question you will start a fight. You can complicate your life all you like - the chances are that the app you're building could be 10 times simpler if you focus with razor-precision on the problem at hand rather than all the "what-ifs". For instance - I am also building up a migration tool (Manatee) and it was recommended to me that I use Enums provided by System.Data to describe the data types for the columns to be created. The reasoning here was that "it would be platform independent". For 99% of .NET developers - database platform independence is not a concern (they use SQL Server), but if I wanted Manatee to be "correct" I should build that feature in. I think this is insanity. Adding code to for a what-if is what got us here in the first place - what-ifs aren't defined very well by definition and that's where all the bugs tend to congregate. If you want to support Oracle (for instance) it would take less code and time just to make a "Manatee_Oracle" page (it's all of 120 lines of code).
There is no answer - .NET developers have thrashed on this issue for a long time and will continue to do so.
Sam: There is an impedance mismatch between Databases and the applications object model. It is unavoidable. In general ORMs try to solve this problem and introduce tons of hooks and factories and identity maps, and so forth, to compensate and keep everyone happy. This approach often leads to business logic leaking into ORMs and making things significantly harder to maintain. In the worst cases you have bits of business logic "trapped" in a 3rd party ORM you can not debug into.
InfoQ: Where do you see ORM technology five years from now?
Rob: Probably right where it is now. The tools will version and make things a bit easier from the UI side, but all in all we have to use databases and programmers like to write code to solve problems - which will most likely perpetuate the problem (impedance mismatch issues). Also - the larger platforms out there (like .NET, Java) - their developer base is using technology that is 4-5 years behind the curve. They will be struggling with many of the issues we're trying to solve today (and failing at). It's almost like a black hole - or as Ted Neward somewhat inappropriately coined it: "The Vietnam of Computer Science". He wrote that article 7 years ago - and we're still dealing with all the same issues so I'm not all that bullish on a change for the big platforms.
For other platforms (Rails/Python) - that will be interesting. ActiveRecord/ActiveRelation are pretty powerful - but they have all the same problems that other ORMs do (N+1, etc). They're a far sight easier to use - but there really isn't much more you can do to improve this situation in my mind - other than to stop falling into the wORM hole (hey - that's catchy!) - and embracing SQL at a reasonable level.
Sam: I don't know, much of the recent innovation was driven by new language features in C# (such as dynamics and simpler IL generation with expressions), I would assume that additional new language features may drive some more innovation.
My hope is that in 5 years all the big player ORMs in the C# space are open source. Also, I hope that we build a more modular ORM stack that would allow people to mix and match different ORM pieces from different libraries. (EG. use dapper + FancyQuery + FancyMigration)
The web has Rack. It would be nice if ORMs had a common base library for query execution and mapping.
About the Author
Jonathan Allen has been writing news report for InfoQ since 2006 and is currently the lead editor for the .NET queue. If you are interested in writing news or educational articles for InfoQ please contact him at jonathan@infoq.com.