Tim Bray, in his discussions of grid computing before it became such a hot topic, pointed out how advances in RAM and networking hardware were making it possible to build RAM clusters that were faster than disk clusters.
[M]emory is several orders of magnitude faster than disk for random access to data (even the highest-end disk storage subsystems struggle to reach 1,000 seeks/second). Second, with data-center networks getting faster, it’s not only cheaper to access memory than disk, it’s cheaper to access another computer’s memory through the network. As I write, Sun’s Infiniband product line includes a switch with 9 fully-interconnected non-blocking ports each running at 30Gbit/sec; yow! The Voltaire product pictured above has even more ports; the mind boggles. (If you want the absolute last word on this kind of ultra-high-performance networking, check out Andreas Bechtolsheim’s Stanford lecture.)

Tim also pointed out the truth of the second part of Gray's statement: "For random access, disks are irritatingly slow; but if you pretend that a disk is a tape drive, it can soak up sequential data at an astounding rate; it’s a natural for logging and journaling a primarily-in-RAM application."
Now flash forward two years and we find that the trend has continued: RAM and networking keep advancing rapidly while disk improvements remain slow. Bill McColl talked about massive memory systems becoming available for parallel computing:
Memory is the new disk! With disk speeds growing very slowly and memory chip capacities growing exponentially, in-memory software architectures offer the prospect of orders-of-magnitude improvements in the performance of all kinds of data-intensive applications. Small (1U, 2U) rack-mounted servers with a terabyte or more of memory will be available soon, and will change how we think about the balance between memory and disk in server architectures. Disk will become the new tape, and will be used in the same way, as a sequential storage medium (streaming from disk is reasonably fast) rather than as a random-access medium (very slow). Tons of opportunities there to develop new products that can offer 10x-100x performance improvements over the existing ones.

Dare Obasanjo pointed out how not paying attention to the mantra can have detrimental effects, a la Twitter's issues. Commenting on Twitter's content management-like implementation, Obasanjo said "The problem is that if you naively implement a design that simply reflects the problem statement then you will be in disk I/O hell. It won't matter if you are using Ruby on Rails, Cobol on Cogs, C++ or hand coded assembly, the read and write load will kill you." In other words, push the random-access operations into RAM and only use disk for sequential operations.
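To make that division of labor concrete, here is a minimal sketch (the class is hypothetical and simplified; recovery, compaction and error handling are omitted) of an application that serves all random access from RAM and touches the disk only through an append-only journal, in the spirit of Gray's "pretend that a disk is a tape drive" remark:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Random access is served from RAM; the disk only ever sees sequential appends.
public class JournaledKeyValueStore {
    private final Map<String, String> inMemory = new ConcurrentHashMap<>();
    private final BufferedWriter journal;

    public JournaledKeyValueStore(Path journalFile) throws IOException {
        // Open the journal in append mode: every write is a sequential disk operation.
        this.journal = Files.newBufferedWriter(journalFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public void put(String key, String value) throws IOException {
        inMemory.put(key, value);                  // random access handled in RAM
        journal.write(key + "\t" + value + "\n");  // durability via sequential append
        journal.flush();
    }

    public String get(String key) {
        return inMemory.get(key);                  // reads never touch the disk
    }
}
```

On restart, such a store would rebuild its in-memory map by replaying the journal sequentially, again reading the disk at its streaming rate rather than seeking.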
Tom White, a committer on Hadoop Core and a member of the Hadoop Project Management Committee, went into more detail on the "disk is the new tape" part of Gray's quote. In discussing the MapReduce programming model, White pointed out why disk is still viable as an application data storage medium for tools like Hadoop:
In essence MapReduce works by repeatedly sorting and merging data that is streamed to and from disk at the transfer rate of the disk. Contrast this to accessing data from a relational database that operates at the seek rate of the disk (seeking is the process of moving the disk's head to a particular place on the disk to read or write data). So why is this interesting? Well, look at the trends in seek time and transfer rate. Seek time has grown at about 5% a year, whereas transfer rate at about 20%. Seek time is growing more slowly than transfer rate - so it pays to use a model that operates at the transfer rate. Which is what MapReduce does.

While it remains to be seen whether Solid State Drives (SSDs) will change the seek/transfer ratio, many commenters on White's discussion thought they might be a leveling factor in the RAM/hard drive debate.
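A back-of-the-envelope calculation shows why operating at the transfer rate wins. The figures below are assumed, round numbers (roughly 10 ms per seek and 100 MB/s sustained transfer for a 1 TB data set of 1 KB records), not measurements of any particular drive:

```java
// Back-of-the-envelope comparison of seek-bound vs transfer-bound disk access.
// All drive characteristics here are assumed, round numbers for illustration only.
public class SeekVersusStream {
    public static void main(String[] args) {
        double seekSeconds = 0.010;          // ~10 ms per random seek
        double transferBytesPerSec = 100e6;  // ~100 MB/s sustained sequential transfer

        long recordBytes = 1_000;            // 1 KB records
        long recordCount = 1_000_000_000L;   // ~1 TB data set in total

        // Updating 1% of the records in place, one seek per record (RDBMS-style access)
        long randomUpdates = recordCount / 100;
        double randomSeconds =
                randomUpdates * (seekSeconds + recordBytes / transferBytesPerSec);

        // Rewriting the whole data set as a sequential stream (sort/merge, MapReduce-style access)
        double streamSeconds = (double) recordBytes * recordCount / transferBytesPerSec;

        System.out.printf("Seek-bound update of 1%% of records: %,.0f s (~%.1f hours)%n",
                randomSeconds, randomSeconds / 3600);
        System.out.printf("Transfer-bound rewrite of all data:  %,.0f s (~%.1f hours)%n",
                streamSeconds, streamSeconds / 3600);
    }
}
```

With these assumptions, seeking to update even 1% of the records takes roughly ten times longer than streaming and rewriting the entire data set, which is the asymmetry the MapReduce model is built around.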
Nati Shalom gave a well-reasoned discussion of how memory and disk play into database deployment and usage for MySQL. Shalom highlighted the limitations of database clustering and database partitioning as means of providing performance and scale, saying "The fundamental problems with both database replication and database partitioning are the reliance on the performance of the file system/disk and the complexity involved in setting up database clusters." His proposed solution was to go with an In-Memory Data Grid (IMDG), backed by technologies like Hibernate 2nd level cache or GigaSpaces Spring DAO, to provide Persistence as a Service for your applications. Shalom explained that IMDGs
provide object-based database capabilities in memory, and support core database functionality, such as advanced indexing and querying, transactional semantics and locking. IMDGs also abstract data topology from application code. With this approach, the database is not completely eliminated, but put it in the *right* place.

The primary benefits listed for an IMDG over direct RDBMS interaction were:
- Relies on memory, which is significantly faster and more concurrent than file systems
- Data can be accessed by reference
- Data manipulation is performed directly on the in-memory objects
- Reduced contention for data elements
- Parallel aggregated queries
- In-process local cache
- Avoid Object-Relational Mapping (ORM)
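As a rough, product-agnostic illustration of this pattern (the class and DAO names below are hypothetical; a real IMDG such as GigaSpaces adds indexing, querying, transactional semantics and eviction on top of the idea), the sketch keeps objects in memory, serves reads and writes by reference, and pushes the relational database behind the grid via asynchronous write-behind:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch of the IMDG idea: the application reads and mutates objects in
// memory, while the relational database sits behind the grid and is updated asynchronously.
// "Order" and "OrderDao" are hypothetical stand-ins, not the API of any particular product.
public class InMemoryOrderGrid {
    private final Map<Long, Order> grid = new ConcurrentHashMap<>();
    private final OrderDao dao;                                        // persistence behind the grid
    private final ExecutorService writeBehind = Executors.newSingleThreadExecutor();

    public InMemoryOrderGrid(OrderDao dao) {
        this.dao = dao;
    }

    // Reads are served from memory; a miss falls through to the database once.
    public Order get(long id) {
        return grid.computeIfAbsent(id, dao::load);
    }

    // Writes update the in-memory object immediately and persist in the background.
    public void put(Order order) {
        grid.put(order.id(), order);
        writeBehind.submit(() -> dao.save(order));
    }
}

record Order(long id, String customer, double total) {}

interface OrderDao {
    Order load(long id);
    void save(Order order);
}
```

The point of the sketch is only the placement of responsibilities: the application works against in-memory objects, while the database is reduced to a background persistence channel, which is what putting it in the "right place" amounts to.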