InfoQ

Amazon Brings Virtualized Storage to the Cloud with Elastic Block Store

Posted by Scott Delap on Aug 21, 2008 12:04 PM

In April of this year, Amazon CTO Werner Vogels announced the development of persistent storage for Amazon EC2. The lack of it has long been an Achilles' heel of the EC2 platform: server instances start up with the contents of their parent image, and after a server failure or restart the disk reverts to the original image definition. Today Amazon moved to address this issue with the release of Elastic Block Store (EBS). Vogels outlines how the offering rounds out Amazon's suite of services for common storage patterns:
We had to make sure that the infrastructure storage solutions we were going to develop would be highly effective for developers by addressing the most common patterns first. That analysis led us to three top patterns:
  • Key-Value storage. The majority of the Amazon storage patterns were based on primary key access leading to single value or object. This pattern led to the development of Amazon S3.
  • Simple Structured Data storage. A second large category of storage patterns were satisfied by access to simple query interface into structured datasets. Fast indexing allows high-speed lookups over large dataset. This pattern led to the development of Amazon SimpleDB. A common pattern we see is that secondary keys to objects stored in Amazon S3 are stored in SimpleDB, where lookups result in sets of S3 (primary) keys.
  • Block storage. The remaining bucket holds a variety of storage patterns ranging from special file systems such as ZFS to applications managing their own block storage (e.g. cache servers) to relational databases. This category is served by Amazon EBS which provides the fundamental building block for implementing a variety of storage patterns.

Amazon has also provided details on pricing, durability, and performance. Highlights include:

  • Volumes can be between 1GB and 1TB in size.
  • Volumes behave like raw unformatted block devices.
  • A volume can only be accessed from within its own availability zone, similar to a SAN in a data center.
  • A volume can only be attached to one EC2 instance at a time.
  • One EC2 instance can have several attached volumes.
  • Volumes can have snapshots backed up to S3. Snapshots are incremental: only data that has changed since the previous snapshot is saved.
  • Because data is replicated, the annual rate of complete volume failure is expected to be 0.1% to 0.5%, depending on volume size, compared to about 4% for commodity hard disks.
  • Pricing is $0.10 per allocated GB per month and $0.10 per million I/O requests.
Given this pricing, it is estimated that a medium-sized database with 100 GB of storage would cost $10 per month for storage and roughly $26 per month in I/O request charges. A tutorial is available for running MySQL with EBS. RightScale has written an overview that analyzes the specifications further and includes a number of best practices and formulas for cost estimation. Regarding I/O rates, they share the following practical experience:

...As a point of reference, our main database server is pretty busy and chugs along at an average of 17 transactions per second, which should total to around $4.40 per month. But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7. That would end up costing over $250 per month! As far as I can tell, for most situations the EBS transaction costs will be in the noise, but you can make it expensive if you’re not careful...
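
The arithmetic behind these figures is easy to reproduce. The following is a minimal sketch in Python, not an official Amazon or RightScale calculator: the function names are hypothetical, a 30-day month is assumed, and the rate of 100 I/O requests per second used for the 100 GB database is an assumption chosen because it is consistent with the $26 estimate, not a number given in this article.

```python
# Rough monthly EBS cost estimate based on the launch prices quoted above.
# All names here are illustrative; the 30-day month is an assumption.

STORAGE_PRICE_PER_GB_MONTH = 0.10    # USD per allocated GB per month
IO_PRICE_PER_MILLION = 0.10          # USD per million I/O requests
SECONDS_PER_MONTH = 30 * 24 * 3600   # 2,592,000 seconds in a 30-day month


def monthly_storage_cost(allocated_gb):
    """Charge for allocated capacity, whether or not it is actually used."""
    return allocated_gb * STORAGE_PRICE_PER_GB_MONTH


def monthly_io_cost(requests_per_second):
    """Charge for a sustained average I/O request rate over one month."""
    total_requests = requests_per_second * SECONDS_PER_MONTH
    return total_requests / 1_000_000 * IO_PRICE_PER_MILLION


if __name__ == "__main__":
    # 100 GB database; 100 requests/s is an assumed average rate.
    print(f"storage, 100 GB:   ${monthly_storage_cost(100):.2f}")
    print(f"I/O at 100 req/s:  ${monthly_io_cost(100):.2f}")
    # The two workloads described in the RightScale quote.
    print(f"I/O at 17 req/s:   ${monthly_io_cost(17):.2f}")
    print(f"I/O at 1000 req/s: ${monthly_io_cost(1000):.2f}")
```

Under these assumptions the sketch gives $10.00 for storage and about $25.92, $4.41, and $259.20 for the three I/O rates, in line with the "$26", "around $4.40", and "over $250" figures cited above. Storage is billed on allocated capacity, so the I/O term is the only part that scales with how hard the disks are driven.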

Finally, GigaOM provides a business analysis of the new offering, arguing that traditional data centers should be worried.
