Tuesday, August 18, 2009

A Pet Shop Anti-Benchmark?

Anyone who has been involved with the many "PetShop"-related dramas over the years must have found it hard not to develop antibodies to the whole idea. However, like a fool who just doesn't know how to stop, I now return to this concept hoping to revive it yet again. No, don't run, please hear me out...there's a bit of a twist to it - I actually think I may have had a bit of an idea!

As far as I can see, historically all the problems have really sprung from the same root: that different players tried to use PetShop implementations to demonstrate that their platforms were faster/better.

"Well, duh", you might say to that :-)

Before I start sounding like I'm heaping blame to the left and right on PetShop contributors, I want to humbly note that if blame is to be handed out, then I am one of those to be blamed. I wrote one such PetShop using Pragmatier (the O/R Mapper I had written - I think it may have been the first O/R Mapping-based PetShop for .NET, and it was right around the time of the theserverside drama linked at the top of this post) and we tested it in a big lab and published a report ( http://www.devx.com/vb2themax/Article/19867 ) - all, of course, with the express hope that it would show potential customers what a splendid and fast product we offered.

So I want to underline that it seems very human to be driven towards implementing a PetShop (it takes a bit of time) because one hopes it will demonstrate something positive in terms of the competitive benefits of one's framework. Indeed, that's why I did it.

But while we could go on for a long while detailing why this approach has so many drawbacks that it may ultimately be doomed to collapse in a black hole of bad PR, I am suddenly more intrigued by the notion that perhaps a PetShop benchmark could be very useful after all - if used in exactly the opposite way from how it has been used so far:

What about a PetShop benchmark that compared the performance of a persistence layer using different O/RMs versus a plain ADO one, in order to show how small the difference is between frameworks (and between any framework and plain ADO) in terms of overall application performance? The goal of such a site would not be to compete about who is actually fastest, but to clearly demonstrate how much of the overall application execution time is eaten up by the O/RMs on average (and of course show the bounds given by the worst and best numbers as well) - and, if my prediction turns out to be correct, to show that this average is convincingly small.
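
To make the idea slightly more concrete, here is a rough sketch (in C#, since the whole discussion is about .NET) of the kind of number such a site would report: the share of total request time spent inside the persistence layer, for a plain ADO implementation versus a mapped one. The repository classes and the sleep-based timings are purely hypothetical stand-ins, not a real benchmark.

    // Illustrative only: the repositories and timings below are made-up placeholders.
    using System;
    using System.Diagnostics;
    using System.Threading;

    interface IOrderRepository
    {
        void LoadOrdersFor(int customerId); // the persistence call being timed
    }

    class AdoOrderRepository : IOrderRepository
    {
        // Stand-in for a raw ADO.NET query
        public void LoadOrdersFor(int customerId) { Thread.Sleep(2); }
    }

    class MappedOrderRepository : IOrderRepository
    {
        // Stand-in for the same query issued through some O/R mapper
        public void LoadOrdersFor(int customerId) { Thread.Sleep(3); }
    }

    static class AntiBenchmark
    {
        // Stand-in for everything else a request does: rendering, validation, network hops etc.
        static void RestOfRequest() { Thread.Sleep(40); }

        static void Measure(string name, IOrderRepository repository)
        {
            var total = Stopwatch.StartNew();
            var persistence = new Stopwatch();
            for (int i = 0; i < 50; i++)
            {
                persistence.Start();
                repository.LoadOrdersFor(42);
                persistence.Stop();
                RestOfRequest();
            }
            total.Stop();
            Console.WriteLine("{0}: persistence is {1:P0} of total request time",
                name, (double)persistence.ElapsedTicks / total.ElapsedTicks);
        }

        static void Main()
        {
            Measure("Plain ADO ", new AdoOrderRepository());
            Measure("O/R mapper", new MappedOrderRepository());
        }
    }

The interesting output would not be which line wins, but that both percentages come out small and similar - which is exactly the "anti" point the benchmark would be making.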

Of course, one could then go on to study the detailed results to see that mapper X makes the application go marginally faster than mapper Y - or the results could be anonymized, but I don't think they would have to be: I seriously think that if done in a realistic way, the effect on a viewer would be the very opposite of an urge to look up who was actually fastest. (*)

The overview would show (I think) that of the total application execution time, all the interesting mappers take up a similar, pretty small percentage. If so, the conclusion would be that it doesn't matter whether mapper X or Y is the fastest - they are all fast enough, and the question instead becomes which mapper fits your style of work and existing tool sets, has good documentation, etc.

Now, as far as I can see, the only mapper makers who wouldn't want to contribute their mapper to such a benchmark are the ones who 1) are so horribly slow that they would be off the chart, or 2) have speed as their main selling point, since the benchmark would primarily show that selling point to be largely irrelevant.

The rest of the mappers would benefit from jointly demonstrating that speed isn't really the differentiating factor between mappers (and almost never a reason to discourage use of mappers in general), and that the client can thus have the luxury of making their choice based on things like nice documentation, API suitability, or relevant feature sets...and mapper makers in their turn will have the benefit of being able to convincingly demonstrate just why there is credible room for several players on the market (they all cater to different tastes and prioritize features differently).

In a sense, the idea might be better described as an "anti-benchmark" than a real benchmark, since (if I am right) it would be used mainly to demonstrate that the benchmark itself is meaningless (which, in a paradoxical way, is still very meaningful to demonstrate).

But most importantly, such a benchmark could give us a conclusive answer to the age-old question: Does the speed of a (non-dog-slow) O/RM matter?

(*) Further against anonymous results: there would of course still be a small segment of applications where any small performance benefit really would be the deciding factor, and why shouldn't those users be able to find out who is fastest, and why shouldn't the fastest framework get at least those customers coming to it? Seems fair to me. See above for why I think any framework spending a large effort on extreme performance optimization can probably use those customers, especially if a benchmark such as this one were to go live...

2 comments:

  1. Hey Mats, it's Mike formerly from Avanade Netherlands (I'm the American). Didn't know you had a new blog up....looks good!

    I saw your posting on Ayende regarding that ORMBattle situation. What a mess that is...Alex seems like a very bright guy, but he's got to understand the point I think you make in this blog posting...or at least the point you appear to be hinting at. The point being that there are no really useful benchmarks for an ORM.

    My opinion is that comparing speed differences between ORMs is useless because you are always comparing apples to oranges. Modern ORMs all implement different features, so you can't compare speed in tests unless you can also point out why the speed differs, which with a mature ORM will almost always be "because of feature X". In Oren's case, the O(N) performance comes from dirty checks that have to be performed on session flushes. The dirty checks are O(N) because NHibernate's default behavior is not to be notified of data changes in the entities...because NHibernate has the great feature of supporting POCOs by default.

    So, that's where the performance difference comes from. However, this cannot be explained adequately in Alex's tables and charts, so it goes unnoticed by the casual observer...hence Frans' and Oren's frustration, I think.

    I agree with them on this. Raw performance numbers are useless without a complete list of features. For example, I could write my own stripped-down version of ADO.NET that makes calls straight to the database, call it an ORM, and be at the top of Alex's scores!

    As for your point here - which I take to be that in the context of a real application, it is possible (even likely) that the choice of an ORM will not impact performance very much - I would guess this too. I could choose any of the many ORMs available, and unless one had some kind of O(N^2) (or worse) bug somewhere, I don't suspect that the application's performance would differ from what it would be if I chose another ORM on the list and implemented my application logic similarly. The bottlenecks would, as always, be in the processes outside of the .NET space...network speed, disk speed, DB indexing, etc...

    ReplyDelete
  2. Hi Mike,

    How are you doing, good to hear from you! :-)

    I agree with everything you say, except possibly that in the specific case of these tests a very barebones O/RM might not do so well (you need a few features to do well in some of the tests) - but in principle you are absolutely right there too.

    ReplyDelete
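
As a footnote to the dirty-check point Mike makes above, here is a very rough sketch of what snapshot-based dirty checking amounts to (hypothetical types, nothing like NHibernate's actual implementation). Because POCO entities never notify the session when they change, a flush has to compare every tracked entity against the snapshot taken when it was loaded, so the cost grows with the number of entities in the session.

    // Illustrative only: hypothetical types, not NHibernate's real code.
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // A POCO entity - no change-notification plumbing, which is the whole point.
    class Product
    {
        public int Id;
        public string Name;
        public decimal Price;
    }

    class Session
    {
        // For every loaded entity, keep a snapshot of its state as of load time.
        readonly Dictionary<Product, object[]> snapshots = new Dictionary<Product, object[]>();

        public Product Load(int id, string name, decimal price)
        {
            var entity = new Product { Id = id, Name = name, Price = price };
            snapshots[entity] = new object[] { entity.Name, entity.Price };
            return entity;
        }

        // Flush has to walk every tracked entity and compare it to its snapshot,
        // because the POCOs never tell the session that they changed - hence O(N).
        public IEnumerable<Product> FindDirtyOnFlush()
        {
            foreach (var pair in snapshots)
            {
                var current = new object[] { pair.Key.Name, pair.Key.Price };
                if (!current.SequenceEqual(pair.Value))
                    yield return pair.Key;
            }
        }
    }

    class Demo
    {
        static void Main()
        {
            var session = new Session();
            for (int i = 0; i < 1000; i++) session.Load(i, "Bone " + i, 9.95m);
            var ball = session.Load(1000, "Ball", 4.50m);
            ball.Price = 3.95m; // the session is not told about this change

            // Only one entity changed, but the flush still compares all 1001 snapshots.
            Console.WriteLine("Dirty entities found on flush: " + session.FindDirtyOnFlush().Count());
        }
    }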