After carrying out a number of benchmarks, Microsoft concluded that .NET offers better performance and a better cost/performance ratio than WebSphere. IBM rebutted Microsoft’s findings and carried out other tests showing WebSphere to be superior to .NET. Microsoft responded by rejecting some of IBM’s claims as false and repeating the tests on different hardware with different results.
Summary
Microsoft has benchmarked .NET and WebSphere and published the benchmark source code, run rules and use rules, along with a findings report at wholoveswindows.com entitled Benchmarking IBM WebSphere 7 on IBM Power6 and AIX vs. Microsoft .NET on HP BladeSystem and Windows Server 2008. This benchmark shows a much higher transactions per second (TPS) rate and a better cost/performance ratio when running WebSphere 7 on Windows Server 2008 than on AIX 5.3, and even better results when running .NET on Windows Server 2008 than when running WebSphere on the same OS. The cost/performance ratio for the application benchmark used is:
IBM Power 570 with WebSphere 7 and AIX 5.3 | HP BladeSystem C7000 with WebSphere 7 and Windows Server 2008 | HP BladeSystem C7000 with .NET and Windows Server 2008 |
$32.45 | $7.92 | $3.99 |
IBM has rebutted Microsoft’s benchmark, calling some of its claims false, and performed a different benchmark with different results. The benchmark and its findings were published in Benchmarking AND BEATING Microsoft’s .NET 3.5 with WebSphere 7! (PDF). The source code of the benchmark was not published. The results show WebSphere as a better-performing middle tier than .NET, with 36% more TPS for one application benchmark and from 176% to 450% better throughput for one of IBM’s standard benchmarks.
Microsoft responded to IBM and defended its claims and benchmarking results with Response to IBM’s Whitepaper Entitled Benchmarking and Beating Microsoft .NET 3.5 with WebSphere 7 (PDF). Microsoft also re-ran its benchmark, modified to include a test flow similar to the one used by IBM, on different hardware (a single multi-core server). It found that WebSphere is indeed better than .NET when using IBM’s test flow, but only slightly, by between 3% and 6%, not by the margin IBM reported. Microsoft adds that these later findings do not change the original ones, since the benchmark was run on a different hardware configuration. In the end, Microsoft invites IBM to “an independent lab to perform additional testing”.
Microsoft Testing .NET Against WebSphere
Microsoft has conducted a series of tests comparing WebSphere/Java against .NET on three different platforms. The details of the benchmarks performed and the test results were published in the whitepaper entitled Benchmarking IBM WebSphere® 7 on IBM® Power6™ and AIX vs. Microsoft® .NET on Hewlett Packard BladeSystem and Windows Server® 2008 (PDF).
Platforms tested:
- IBM Power 570 (Power 6) running IBM WebSphere 7 on AIX 5.3
  - 8 IBM Power6 cores at 4.2GHz
  - 32 GB RAM
  - AIX 5.3
  - 4 x 1 GB NICs
- Hewlett Packard BladeSystem C7000 running IBM WebSphere 7 on Windows Server 2008
  - 4 Hewlett Packard ProLiant BL460c blades
  - One Quad-Core Intel® Xeon® E5450 (3.00GHz, 1333MHz FSB, 80W) Processor/blade
  - 32 GB RAM/blade
  - Windows Server 2008/64-bit/blade
  - 2 x 1 GB NICs/blade
- Hewlett Packard BladeSystem C7000 running .NET on Windows Server 2008
  - Same as the previous one but the applications tested run on .NET instead of WebSphere.
Three tests were performed on each platform:
- Trade Web Application Benchmarking
  The applications tested were IBM’s Trade 6.1 and Microsoft’s StockTrader 2.04. This series of tests evaluated the performance of complete data-driven web applications running on the platforms listed above. Each web page accessed involved one or, usually, more operations serviced by classes in the business layer and ending with synchronous database calls.
- Trade Middle Tier Web Services Benchmarking
  This benchmark was intended to measure the performance of the Web Service layer executing operations which ended up in database transactions. The test was similar to the Web Application one, but operations were counted individually.
- WS Test Web Services Benchmarking
  This test was like the previous one, but with no business logic and no database access. It was based on the WSTest workload initially devised by Sun and augmented by Microsoft. The services tier offered 3 operations: EchoList, EchoStruct and GetOrder. Having no business logic, the test measured only the raw performance of the Web Service software.
Two database configurations were used, one for the all-IBM platform and another for the other two: IBM DB2 V9.5 Enterprise Edition, with IBM DB2 V9.5 JDBC drivers for data access, and SQL Server 2008 Enterprise Edition. Two databases were set up for each configuration, running on HP BL680c G5 blades:
- 4 Quad-Core Intel Xeon CPUs at 2.4 GHz (16 cores in each blade)
- 64 GB RAM
- 4 x 1GB NICs
- IBM DB2 9.5 Enterprise Edition 64-bit or Microsoft SQL Server 2008 64-bit
- Microsoft Windows Server 2008 64-bit, Enterprise Edition
- 2 4GB HBAs for fiber/SAN access to the EVA 4400 storage
The storage was provided by an HP StorageWorks EVA 4400 Disk Array:
- 96 15K drives total
- 4 logical volumes consisting of 24 drives each
  - Database server 1: Logical Volume 1 for logging
  - Database server 1: Logical Volume 2 for database
  - Database server 2: Logical Volume 3 for logging
  - Database server 2: Logical Volume 4 for database
The Web Application benchmark used 32 client machines running test scripts. Each machine simulated hundreds of clients, each with a 1-second think time. The tests used an adapted version of IBM’s Trade 6.1 application on systems under test (SUT) #1 and #2 and Microsoft’s StockTrader application on SUT #3.
For the Web Service and WSTest benchmarks, Microsoft used 10 clients with a 0.1s think time. For WSTest, the databases were not accessed. Microsoft has created a WSTest-compliant benchmark for WebSphere 7 and JAX-WS and another in C# for .NET using WCF.
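The link between think time, throughput and the number of simulated clients can be estimated with the interactive response time law, N = X × (R + Z), where N is the number of concurrent users, X the throughput, R the response time and Z the think time. The following is only a rough sketch and not part of Microsoft’s test kit: the function name is made up for illustration, and the response time is assumed to be negligible next to the 1-second think time.

```python
import math

def users_needed(target_tps: float, think_time_s: float,
                 response_time_s: float = 0.0) -> int:
    """Interactive response time law: N = X * (R + Z).

    response_time_s defaults to 0, which is an assumption made here
    for illustration; a real run would use the measured response time.
    """
    return math.ceil(target_tps * (response_time_s + think_time_s))

# Figures taken from the article: 32 client machines, 1 s think time,
# and the 12,576 TPS reported for .NET in the Trade Web Application test.
total_users = users_needed(12_576, think_time_s=1.0)
users_per_machine = math.ceil(total_users / 32)

print(total_users, users_per_machine)  # prints "12576 393"
```

Under these assumptions each client machine would have to simulate roughly 400 users, which is consistent with the “hundreds of clients” per machine mentioned above.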
Microsoft’s whitepaper contains more details on how the tests were conducted including the DB configuration, DB access used, caching configuration, test scripts, tuning parameters used and others.
Conclusion
The benchmarking results including the costs/performance ratio are shown in the following table:
Metric | IBM Power 570 with WebSphere 7 and AIX 5.3 | HP BladeSystem C7000 with WebSphere 7 and Windows Server 2008 | HP BladeSystem C7000 with .NET and Windows Server 2008 |
Total Middle-Tier System Cost | $260,128.08 | $87,161.00 | $50,161.00 |
Trade Web Application Benchmark | 8,016 TPS | 11,004 TPS | 12,576 TPS |
Cost/Performance | $32.45 | $7.92 | $3.99 |
Trade Middle Tier Web Service Benchmark | 10,571 TPS | 14,468 TPS | 22,262 TPS |
Cost/Performance | $24.61 | $6.02 | $2.25 |
WSTest EchoList Test | 10,536 TPS | 15,973 TPS | 22,291 TPS |
Cost/Performance | $24.69 | $5.46 | $2.25 |
WSTest EchoStruct Test | 11,378 TPS | 16,225 TPS | 24,951 TPS |
Cost/Performance | $22.86 | $5.37 | $2.01 |
WSTest GetOrder Test | 11,009 TPS | 15,491 TPS | 27,796 TPS |
Cost/Performance | $23.63 | $5.63 | $1.80 |
According to Microsoft’s benchmarking results, running WebSphere on HP BladeSystem with Windows Server 2008 delivers roughly 37% to 52% more throughput, depending on the test, at a cost/performance ratio more than four times lower than running WebSphere on IBM Power 570 with AIX 5.3. The .NET/Windows Server 2008 configuration is more efficient still: its cost/performance ratio is between roughly one half and one third of that of WebSphere/Windows Server 2008, and roughly 8 to 13 times lower than that of WebSphere/Power 570/AIX. The cost/performance ratio is so high for the first platform because the price of the entire middle tier is over $250,000 while its throughput is lower than that of the other platforms.
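The cost/performance figures in the table are simply the total middle-tier system cost divided by the measured throughput, that is, dollars per TPS. A minimal sketch that recomputes them from the published numbers is shown below; the dictionary layout, short platform labels and function name are illustrative choices, not anything taken from Microsoft’s whitepaper.

```python
# Total middle-tier system cost in USD, from the results table.
COST = {
    "Power 570 / WebSphere 7 / AIX 5.3": 260_128.08,
    "HP C7000 / WebSphere 7 / Windows 2008": 87_161.00,
    "HP C7000 / .NET / Windows 2008": 50_161.00,
}

# Measured throughput in TPS per test, in the same platform order.
TPS = {
    "Trade Web Application": (8_016, 11_004, 12_576),
    "Trade Middle Tier Web Service": (10_571, 14_468, 22_262),
    "WSTest EchoList": (10_536, 15_973, 22_291),
    "WSTest EchoStruct": (11_378, 16_225, 24_951),
    "WSTest GetOrder": (11_009, 15_491, 27_796),
}

def cost_per_tps(system_cost: float, tps: float) -> float:
    """Dollars of middle-tier cost per transaction per second."""
    return round(system_cost / tps, 2)

for test, rates in TPS.items():
    ratios = [cost_per_tps(c, r) for c, r in zip(COST.values(), rates)]
    print(f"{test}: {ratios}")
```

Running it reproduces the published ratios, for example $32.45, $7.92 and $3.99 for the Trade Web Application benchmark.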
Microsoft’s benchmarking whitepaper (PDF) contains an appendix with complete details of the hardware and software costs. The benchmarking tests used, including source code, are published on the StockTrader website.
IBM’s Rebuttal
In another paper, Benchmarking AND BEATING Microsoft’s .NET 3.5 with WebSphere 7! (PDF), IBM rejected Microsoft’s benchmark and created another one showing that WebSphere performs better than .NET.
Microsoft had said that StockTrader is similar to IBM’s Trade application:
Microsoft created an application that is functionally equivalent to the IBM WebSphere Trade application, both in terms of user functionality and middle-tier database access, transactional and messaging behavior.
IBM rejected Microsoft’s claim:
The application claims to be “functionally equivalent” to the IBM WebSphere Trade 6.1 sample application. It is not a “port” of the application in any sense. Little, if any, of the original application design was ported. Microsoft has made this an application that showcases the use of its proprietary technologies. A major indication of this is the fact that the .NET StockTrader application is not a universally accessible web application since it can only be accessed by using Internet Explorer, and not by other web browsers.
Furthermore, IBM said that Trade was not designed to benchmark WebSphere’s performance but rather to
serve as a sample application illustrating the usage of the features and functions contained in WebSphere and how they related to application performance. In addition, the application served as a sample which allowed developers to explore the tuning capabilities of WebSphere.
IBM had other complaints regarding Microsoft’s benchmark:
Microsoft created a completely new application [StockTrader] and claimed functional equivalence at the application level. The reality is that the Microsoft version of the application used proprietary SQL statements to access the database, unlike the original version of Trade 6.1 which was designed to be a portable and generic application.
They employed client side scripting to shift some of the application function to the client.
They tested Web Services capabilities by inserting an unnecessary HTTP server between the WebSphere server and the client.
And If that was not enough, they failed to properly monitor and adjust the WebSphere application server to achieve peak performance.
IBM’s Competitive Project Office (CPO) team ported StockTrader 2.0 to WebSphere, creating CPO StockTrader and claiming: “we did a port that faithfully reproduced Microsoft’s application design. The intent was to achieve an apples-to-apples comparison.” So, Trade 6.1 was ported by Microsoft from WebSphere to .NET under the name StockTrader and ported again by IBM back to WebSphere under the name CPO StockTrader. IBM benchmarked CPO StockTrader against StockTrader and obtained better results for WebSphere than for .NET.
IBM also reported using Friendly Bank, an application intended to benchmark WebSphere against .NET. In this test, WebSphere outperforms .NET several times over.
In their StockTrader vs. CPO StockTrader benchmark, IBM used scripts simulating user activity: “login, getting quotes, stock buy, stock sell, viewing of the account portfolio, then a logoff” and running in stress mode without think times. 36 users were simulated, enough to drive each server at maximum throughput and utilization. The data returned was validated and errors were discarded.
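To make the shape of such a stress run concrete, here is a small, hypothetical sketch of a closed-loop driver: 36 virtual users each repeat the scripted flow with no think time, every response is validated, and failed responses are kept out of the transaction count. It is not IBM’s test script, which was not published; `execute` is a stub standing in for a real request to the application under test, and treating “errors were discarded” as “excluded from the count” is an assumption.

```python
import threading

# Scripted user flow as described in IBM's paper.
FLOW = ["login", "quote", "buy", "sell", "portfolio", "logoff"]

def execute(op: str) -> str:
    """Hypothetical stand-in for a request to the application under test."""
    return f"{op}:ok"

def is_valid(op: str, response: str) -> bool:
    """Validate the returned data (trivial check for the stub above)."""
    return response == f"{op}:ok"

def run_user(iterations: int, counts: dict, lock: threading.Lock) -> None:
    valid = errors = 0
    for _ in range(iterations):
        for op in FLOW:  # stress mode: no think time between steps
            if is_valid(op, execute(op)):
                valid += 1
            else:
                errors += 1  # counted separately, not as a transaction
    with lock:
        counts["valid"] += valid
        counts["errors"] += errors

def run_stress(users: int = 36, iterations: int = 10) -> dict:
    counts = {"valid": 0, "errors": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=run_user, args=(iterations, counts, lock))
        for _ in range(users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts

print(run_stress())  # prints {'valid': 2160, 'errors': 0}
```

Dividing the valid count by the elapsed wall-clock time would give the TPS figure; a real driver would also need warm-up and a fixed measurement window, which are omitted here.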
The front end was implemented with WebSphere 7/Windows Server 2008 in one case and .NET 3.5 with IIS 7/Windows Server 2008 in the other. The back end database was DB2 8.2 and SQL Server 2005, both on Windows Server 2003.
The hardware used for testing was:
Performance Testing Tool Hardware
- X345 8676 Server
- 2 x 3.06 GHz Intel Processor with Hyper Thread Technology
- 8 GB RAM
- 18.2 GB 15K rpm SCSI Hard Disk Drive
- 1 GB Ethernet interface
Application Server Hardware
- IBM X3950 Server, 8 x 3.50 GHz Intel Xeon Processors with Hyper Thread Technology, 64 GB RAM
Database Server Hardware
- X445 8670 Server, 8 x 3.0 GHz Intel Xeon Processors with Hyper Thread Technology, 16 GB RAM
- UltraSCSI 320 Controller, EXP 300 SCSI Expansion Unit, 14 x 18.2 GB 15K rpm Hard Disk Drives configured as 2 RAID arrays, one for logs and one for the database. Each array comprises 7 hard disks in a RAID 0 configuration.
The Ethernet Network Backbone
- The isolated network hardware comprises 3 x 3Com SuperStack 4950 switches and one 3Com SuperStack 4924 switch running at 1 GB.
The software and hardware configuration for the Friendly Bank benchmark was similar to the StockTrader one.
IBM’s whitepaper contains information about the Friendly Bank application, but does not point to the source code. It also mentions that the application was initially designed for .NET Framework 1.1 and was just recompiled on .NET 3.5 without being updated to use the latest technologies.
Microsoft Response to IBM’s Rebuttal
Microsoft has responded to IBM’s rebuttal in yet another whitepaper, Response to IBM’s Whitepaper Entitled Benchmarking and Beating Microsoft .NET 3.5 with WebSphere 7 (PDF). In this document, Microsoft defends its original benchmarking results and affirms that IBM made some false claims in its rebuttal document, Benchmarking AND BEATING Microsoft’s .NET 3.5 with WebSphere 7!, and that IBM failed to use an appropriate benchmarking procedure. More has been posted at wholoveswindows.com.
Basically, Microsoft said the following claims are false:
- IBM claim: The .NET StockTrader does not faithfully reproduce the IBM Trade application functionality.
  Microsoft response: this claim is false; the .NET StockTrader 2.04 faithfully reproduces the IBM WebSphere Trade application (using standard .NET Framework technologies and coding practices), and can be used for fair benchmark comparisons between .NET 3.5 and IBM WebSphere 7.
- IBM claim: The .NET StockTrader uses client-side script to shift processing from the server to the client.
  Microsoft response: this claim is false, there is no client-side scripting in the .NET StockTrader application.
- IBM claim: The .NET StockTrader uses proprietary SQL.
  Microsoft response: the .NET StockTrader uses typical SQL statements coded for SQL Server and/or Oracle; and provides a data access layer for both. The IBM WebSphere 7 Trade application similarly uses JDBC queries coded for DB2 and/or Oracle. Neither implementation uses stored procedures or functions; all business logic runs in the application server. Simple pre-prepared SQL statements are used in both applications.
- IBM claim: The .NET StockTrader is not programmed as a universally accessible, thin-client Web application. Hence it runs only on IE, not in Firefox or other browsers.
  Microsoft response: In reality, the .NET StockTrader Web tier is programmed as a universally accessible, pure thin client Web application. However, a simple issue in the use of HTML comment tags causes issues in Firefox; these comment tags are being updated to allow the ASP.NET application to properly render in any industry standard browser, including Firefox.
- IBM claim: The .NET StockTrader has errors under load.
  Microsoft response: This is false, and this document includes further benchmark tests and Mercury LoadRunner details proving this IBM claim to be false.
Also, Microsoft complained that IBM had developed Friendly Bank for .NET Framework 1.1 years ago using obsolete technologies:
IBM’s Friendly Bank benchmark uses an obsolete .NET Framework 1.1 application that includes technologies such as DCOM that have been obsolete for many years. This benchmark should be fully discounted until Microsoft has the chance to review the code and update it for .NET 3.5, with newer technologies for ASP.NET, transactions, and Windows Communication Foundation (WCF) TCP/IP binary remoting (which replaced DCOM as the preferred remoting technology).
Microsoft faulted IBM for not providing the source code of the CPO StockTrader and Friendly Bank applications, and reiterated that all the source code for Microsoft’s benchmark applications involved in this case had been made public.
Microsoft also noted that IBM had used a modified test script which “included a heavier emphasis on buys and also included a sell operation”. Microsoft re-ran its benchmark using IBM’s modified test script flow, which includes the Buy and Sell operations in addition to Login, Portfolio and Logout, on a single 4-core application server, affirming that
these tests are based on IBM’s revised script and are meant to satisfy some of these IBM rebuttal test cases as outlined in IBM’s response paper. They should not be considered in any way as a change to our original results (performed on different hardware, and different test script flow); as the original results remain valid.
The test was carried out on:
Application Server(s) | Database(s) |
1 HP ProLiant BL460c; 1 Quad-core Intel Xeon E5450 CPU (3.00 GHz); 32 GB RAM; 2 x 1GB NICs; Windows Server 2008 64-bit; .NET 3.5 (SP1) 64-bit; IBM WebSphere 64-bit | 1 HP ProLiant DL380 G5; 2 Quad-core Intel Xeon E5355 CPUs (2.67 GHz); 64 GB RAM; 2 x 1GB NICs; Windows Server 2008 64-bit; SQL Server 2008 64-bit; DB2 V9.7 64-bit |
The result of the test shows similar performance for WebSphere and .NET.
One of IBM’s complaints had been that Microsoft inserted an unnecessary HTTP web server in front of WebSphere reducing the number of transactions per second. Microsoft admitted that, but added:
The use of this HTTP Server was fully discussed in the original benchmark paper, and is done in accordance with IBM’s own best practice deployment guidelines for WebSphere. In such a setup, IBM recommends using the IBM HTTP Server (Apache) as the front end Web Server, which then routes requests to the IBM WebSphere Application server. In our tests, we co-located this HTTP on the same machine as the Application Server. This is equivalent to the .NET/WCF Web Service tests, where we hosted the WCF Web Services in IIS 7, with co-located IIS 7 HTTP Server routing requests to the .NET application pool processing the WCF service operations. So in both tests, we tested an equivalent setup, using IBM HTTP Server (Apache) as the front end to WebSphere/JAX-WS services; and Microsoft IIS 7 as the front end to the .NET/WCF services. Therefore, we stand behind all our original results.
Microsoft performed yet another test, WSTest, without the intermediary HTTP web server, on a single quad-core server like the one used in the previous test.
Both tests performed by Microsoft on a single server show WebSphere holding a slight performance advantage over .NET, but not as large as IBM claimed in its paper. Microsoft also remarked that IBM did not comment on the middle-tier cost comparison, which greatly favors Microsoft.
Microsoft continued to challenge IBM to
meet us [Microsoft] in an independent lab to perform additional testing of the .NET StockTrader and WSTest benchmark workloads and pricing analysis of the middle tier application servers tested in our benchmark report. In addition, we invite the IBM competitive response team to our lab in Redmond, for discussion and additional testing in their presence and under their review.
Final Conclusion
Generally, a benchmark consists of
- a workload
- a set of rules describing how the workload is to be processed (the run rules)
- a process trying to ensure that the run rules are respected and results are interpreted correctly
A benchmark is usually intended to compare two or more systems in order to determine which one is better for performing certain tasks. Benchmarks are also used by companies to improve their hardware/software before it goes to their customers by testing different tuning parameters and measuring the results or by spotting some bottlenecks. Benchmarks can also be used for marketing purposes, to prove that a certain system has better performance than the competitor’s.
In the beginning, benchmarks were used to measure the hardware performance of a system, like the CPU processing power. Later, benchmarks were created to test and compare applications, like SPECmail2001, and even application servers, like SPECjAppServer2004.
There is no perfect benchmark. The workload can be tweaked to favor a certain platform, or the data can be misinterpreted or incorrectly extrapolated. To be convincing, a benchmark needs to be as transparent as possible. The workload definition should be public, and if possible the source code should be made available for those interested to look at. A clear set of run rules is mandatory so other parties can repeat the same tests and see the results for themselves. The way results are interpreted and their meaning must be disclosed.
We are not aware of a response from IBM to Microsoft’s last paper. It would be interesting to see their reaction. Probably the best way to clear things up is for IBM to make the source code of their tests public, so anybody interested could run them and see for themselves where the truth lies. Until then we can only speculate on the correctness and validity of these benchmarks.