We had to make sure that the infrastructure storage solutions we were going to develop would be highly effective for developers by addressing the most common patterns first. That analysis led us to three top patterns:
- Key-Value storage. The majority of the Amazon storage patterns were based on primary key access leading to a single value or object. This pattern led to the development of Amazon S3.
- Simple Structured Data storage. A second large category of storage patterns was satisfied by access to a simple query interface into structured datasets. Fast indexing allows high-speed lookups over large datasets. This pattern led to the development of Amazon SimpleDB. A common pattern we see is that secondary keys to objects stored in Amazon S3 are stored in SimpleDB, where lookups result in sets of S3 (primary) keys.
- Block storage. The remaining bucket holds a variety of storage patterns, ranging from special file systems such as ZFS to applications managing their own block storage (e.g. cache servers) to relational databases. This category is served by Amazon EBS, which provides the fundamental building block for implementing a variety of storage patterns.
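The secondary-key pattern described in the second bullet can be sketched in a few lines. This is only an illustration under stated assumptions: the two classes below are in-memory stand-ins, not the S3 or SimpleDB APIs, and every name in it (`ObjectStore`, `AttributeIndex`, `find_objects`, the sample keys) is hypothetical.

```python
from collections import defaultdict


class ObjectStore:
    """In-memory stand-in for a key-value store such as S3 (not the real API)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key):
        return self._objects[key]


class AttributeIndex:
    """In-memory stand-in for a structured index such as SimpleDB (not the real API)."""

    def __init__(self):
        # Maps (attribute, value) to the set of primary keys that carry it.
        self._index = defaultdict(set)

    def add(self, attribute, value, primary_key):
        self._index[(attribute, value)].add(primary_key)

    def lookup(self, attribute, value):
        # A lookup yields a set of primary keys, returned sorted for stable output.
        return sorted(self._index.get((attribute, value), set()))


def find_objects(store, index, attribute, value):
    """Resolve a secondary-key query to objects: index lookup, then key-value gets."""
    return [store.get(key) for key in index.lookup(attribute, value)]


store = ObjectStore()
index = AttributeIndex()

# Hypothetical sample data: each object is stored once under its primary key,
# and its searchable attribute is recorded separately in the index.
for key, owner, body in [
    ("photos/001.jpg", "alice", b"photo-1"),
    ("photos/002.jpg", "bob", b"photo-2"),
    ("photos/003.jpg", "alice", b"photo-3"),
]:
    store.put(key, body)
    index.add("owner", owner, key)

print(index.lookup("owner", "alice"))
print(find_objects(store, index, "owner", "alice"))
```

The point of the split is that the large objects live only in the key-value store, while the index holds nothing but small attribute-to-key mappings.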
Amazon has also provided details on pricing, durability, and performance. Highlights include:
- Volumes can be between 1GB and 1TB in size.
- Volumes behave like raw unformatted block devices.
- A volume can be accessed only from within its own availability zone, much like a SAN in a data center.
- A volume can only be attached to one EC2 instance at a time.
- One EC2 instance can have several attached volumes.
- Volumes can be backed up to S3 as snapshots. Snapshots are incremental, so only data that has changed since the previous snapshot is stored.
- Due to data replication, the expected annual rate of complete volume failure is 0.1% to 0.5%, depending on volume size, compared to 4% for commodity hard disks.
- Pricing is $0.10 per allocated GB per month and $0.10 per million I/O requests.
...As a point of reference, our main database server is pretty busy and chugs along at an average of 17 transactions per second, which should total to around $4.40 per month. But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7. That would end up costing over $250 per month! As far as I can tell, for most situations the EBS transaction costs will be in the noise, but you can make it expensive if you’re not careful...
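The arithmetic behind those two figures is easy to reproduce. The sketch below assumes a 30-day month and that each transaction or write counts as exactly one billable I/O request, which is a simplification; the function and constant names are illustrative and not part of any AWS tool.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month
PRICE_PER_MILLION_IO = 0.10            # USD, from the pricing listed above
PRICE_PER_GB_MONTH = 0.10              # USD per allocated GB per month


def monthly_io_cost(requests_per_second, price_per_million=PRICE_PER_MILLION_IO):
    """Monthly I/O charge for a sustained request rate.

    Assumes one billable I/O request per transaction or write.
    """
    total_requests = requests_per_second * SECONDS_PER_MONTH
    return total_requests / 1_000_000 * price_per_million


def monthly_storage_cost(allocated_gb, price_per_gb=PRICE_PER_GB_MONTH):
    """Monthly charge for allocated capacity, whether or not it is used."""
    return allocated_gb * price_per_gb


print(f"Database server, 17 requests/s:     ${monthly_io_cost(17):.2f}")    # $4.41
print(f"Monitoring server, 1000 requests/s: ${monthly_io_cost(1000):.2f}")  # $259.20
```

Under these assumptions the I/O charge scales linearly with the sustained request rate, so a workload of 1000 requests per second costs roughly 59 times as much as one at 17 requests per second, independent of how much storage is allocated.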
Finally, GigaOM provides a business analysis of the new offering, noting that traditional data centers should be worried.