For the last couple of years what we've been doing at Amazon is opening up our own computing infrastructure to outside developers. So the same very reliable, very scalable technologies that we use for our own applications are now available to developers for things like storage and messaging and computing and database storage as well. They are all open and accessible to developers on basically a pay-as-you-go basis.
We have SimpleDB for structured storage, we have S3 for more block storage, EC2 is the compute cloud, SQS for messaging. We have the Flexible Payment Service for point-to-point money transfer and DevPay is a way to take other services and put your own business model around those services.
3. Which services are you finding are the most popular right now?
Interestingly enough, what happens is that developers start with one and then they realize that, although you can get totally good value out of an individual service, you can start stacking them up and having them play off against each other. And now we're starting to see neat cloud-level architectures happen, where we hear from entrepreneurs that they've built an entire website in the cloud where they'll literally say: "We've got this very very complex system put together and we don't own any of our own servers." We love to hear that from developers where their capital costs are essentially zero and everything that they build is in the cloud and scalable. If their business takes off and they need lots of servers they are going to pay accordingly; if their business stays at a moderate level then they don't incur any cost for services they reserved but didn't use.
A couple of different ways. First, there is a common authentication service that goes across all the different services. So once you create your Amazon developer account, you use the same private key, public key mechanism to access those services. The services are running inside the same datacenter so there is no charge for bandwidth between services inside the datacenter. So a good example is if you have data stored in S3, you're going to pull over to EC2 for processing, do all that processing, send it back to S3 -- that bandwidth back and forth doesn't cost us, so we don't charge developers for that bandwidth. So then a common payment mechanism is another common aspect of the services, and then finally what happens is developers will then put these together.
A very common architecture is to use the Simple Queuing Service as the messaging between different parts of a scalable app. So a very common one I talk about all the time is Podango and their podcast processing. So they have a number of different functional units running on different EC2 instances and then each different kind of functional unit, be it a transcoding or assembly or different kinds of processing, each of those is driven by a separate queue and there is one or more EC2 instances pulling off of each queue. If a queue is getting too busy, if it's taking too much time to do work they can simply ramp up the number of EC2s working on a given queue, makes it very easy to scale. They can be functional at a very very low level of processing -- if they only have a few podcasts per hour they can process that; if they get thousands or tens of thousands, the queues are going to get a little bit bigger, the system automatically senses how busy the queues are and then scales up in response.
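As a rough illustration of the queue-driven scaling described above, here is a minimal sketch in Python. It does not call SQS or EC2; the queue names, the function names `workers_needed` and `plan_fleet`, and all of the numbers are hypothetical and chosen only for illustration. It simply assumes that each kind of work has its own queue and that the number of workers per queue is sized from how deep that queue currently is.

```python
import math


def workers_needed(queue_depth, jobs_per_worker_per_sec, target_drain_secs,
                   min_workers=1, max_workers=20):
    """Return how many workers it takes to drain a queue within the target time.

    All parameters are illustrative assumptions, not values from the interview.
    """
    if queue_depth <= 0:
        # Keep a minimum number of workers so a small trickle of jobs still gets served.
        return min_workers
    capacity_per_worker = jobs_per_worker_per_sec * target_drain_secs
    needed = math.ceil(queue_depth / capacity_per_worker)
    # Clamp to the configured fleet limits.
    return max(min_workers, min(max_workers, needed))


def plan_fleet(queue_depths, **sizing):
    """Size each functional unit independently from its own queue depth."""
    return {name: workers_needed(depth, **sizing)
            for name, depth in queue_depths.items()}


if __name__ == "__main__":
    # Hypothetical queue depths for three functional units.
    depths = {"transcode": 600, "assemble": 12, "publish": 0}
    print(plan_fleet(depths, jobs_per_worker_per_sec=2.0, target_drain_secs=60))
```

Because each queue is sized on its own, a backlog in one stage adds workers only for that stage, which is the property the answer above highlights.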
5. How did Amazon, which started off as an online web store, get into the cloud computing service?
That's a good question. We actually saw that developers would be a good new business for us. Our first main business, of course, is the retail site. The second business is other companies listing and selling their own products in the Amazon Marketplace. Then we saw... the first web service we rolled out over five years ago was access to the Amazon product catalogue, and we saw that developers really took to that service right away, brought a lot of energy and innovation and creativity to that and we realized pretty quickly, I'd say within weeks of rolling out that service, that developers would be a great new customer base for us. Since then we've been looking into ways that we can provide developers with these infrastructure services, so they can build their own applications.
Right now there are two separate storage services. There is Amazon S3, the Simple Storage Service, and the concept there is unindexed block storage. So you store a block of data into S3, anywhere from 1 byte up to 5 gigabytes at a time. We don't scan, index or otherwise look inside the data that you have there. When you store the data you assign a key; you then retrieve it by key as well. So that is your block-structured storage model. The other one is called SimpleDB; it's a more complex model where you store items that are essentially rows, and each item has any number of attribute/value pairs associated with it. In that case we do index each of the attribute values that you put in there, you can then do keyed retrieval off of those and you can say: "Give me all items with attributes within a certain range or attributes equal to a certain value."
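To make the contrast between the two storage models concrete, here is a small in-memory sketch in Python. It is not the S3 or SimpleDB API: the class names `BlobStore` and `ItemStore` and their methods are hypothetical, and the item model is simplified to one value per attribute. The only details taken from the answer above are the opaque keyed blocks of 1 byte to 5 gigabytes, and the items whose attribute values are all indexed for equality and range queries.

```python
from collections import defaultdict


class BlobStore:
    """Opaque key -> bytes storage; the contents are never inspected or indexed."""

    MAX_OBJECT_BYTES = 5 * 1024 ** 3  # the 5 gigabyte limit quoted in the interview

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        if not 1 <= len(data) <= self.MAX_OBJECT_BYTES:
            raise ValueError("object must be between 1 byte and 5 gigabytes")
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class ItemStore:
    """Items with arbitrary attribute/value pairs; every attribute is indexed."""

    def __init__(self):
        self._items = {}
        # attribute name -> value -> set of item names holding that value
        self._index = defaultdict(lambda: defaultdict(set))

    def put(self, name, **attributes):
        # Drop index entries from any earlier version of this item.
        for attr, value in self._items.get(name, {}).items():
            self._index[attr][value].discard(name)
        self._items[name] = dict(attributes)
        for attr, value in attributes.items():
            self._index[attr][value].add(name)

    def equal(self, attr, value):
        """Names of items whose attribute equals the given value."""
        return sorted(self._index[attr].get(value, set()))

    def in_range(self, attr, low, high):
        """Names of items whose attribute lies in the inclusive range [low, high]."""
        return sorted(name
                      for value, names in self._index[attr].items()
                      if low <= value <= high
                      for name in names)


if __name__ == "__main__":
    items = ItemStore()
    items.put("ep1", minutes=12, show="news")
    items.put("ep2", minutes=45, show="news")
    items.put("ep3", minutes=30, show="music")
    print(items.equal("show", "news"))
    print(items.in_range("minutes", 20, 50))
```

The blob store can only answer "give me the object with this key", while the item store can answer questions about attribute values, which is the distinction drawn in the answer.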
7. Can you give us examples of domains where we can use one or the other?
One common use case actually is where people will use S3 to store large blocks of data and they'll use the simple database as the index into that data, so they'll put the metadata for the block data into the SimpleDB.
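The pattern in this answer, large blocks in one store and their metadata in a queryable index, can be sketched as follows. This is an in-memory illustration only, with plain dictionaries standing in for S3 and SimpleDB; the names `blob_store`, `metadata_index`, `store`, `find_keys` and `fetch_matching` are hypothetical.

```python
blob_store = {}      # stands in for S3: key -> opaque bytes
metadata_index = {}  # stands in for SimpleDB: key -> attribute dictionary


def store(key, data, **metadata):
    """Write the block under its key and record its metadata under the same key."""
    blob_store[key] = data
    metadata_index[key] = {**metadata, "size": len(data)}


def find_keys(**criteria):
    """Query the metadata only; return the keys whose attributes all match."""
    return sorted(key for key, attrs in metadata_index.items()
                  if all(attrs.get(name) == value
                         for name, value in criteria.items()))


def fetch_matching(**criteria):
    """Look up keys in the index first, then fetch just those blocks by key."""
    return [blob_store[key] for key in find_keys(**criteria)]


if __name__ == "__main__":
    store("video/1", b"aaaa", kind="video", owner="ann")
    store("audio/1", b"bb", kind="audio", owner="ann")
    store("video/2", b"c", kind="video", owner="bob")
    print(find_keys(kind="video"))
    print(fetch_matching(owner="ann", kind="audio"))
```

The search never touches the block data itself; only the keys returned by the index are fetched, so the large objects stay unindexed as described.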
Good question, I haven't really looked at it in depth. It's really interesting I think to look at several different companies trying to solve similar problems and say: "We need to build something very very scalable, very available, straightforward to use" and coming I think to relatively the same... I wouldn't say that, feature for feature, they compare, but if you look at them at a granular level then you see there are a lot of similarities between the two approaches. We think that there's a huge marketplace out there for these kinds of services and I am pretty sure that we are going to have competition against any one of the given services.
We don't have a specific date or a timeframe to do that, but certainly as we travel all around the world we do hear from people in all different places that say: "We want EC2 and we want it near us." And so we are really taking all those different things into account as we make our plans. One concrete example I can say is that, based on the number of trips I made to the UK last year, I heard from people in the UK and from other parts of Europe as well, but primarily from the UK that they said: "We love S3, but we want some storage that is closer to us if possible" and so late last year we rolled out Amazon S3 in Europe and we've been really happy with how that's been going.
I talked before about the authentication that happens as part of each different web service request. So each request that comes into a service, be that an anonymous request or something that was signed with a private key and is available only to an individual user, those each run through a common authentication service. That's to check, say, "Is this person actually authorized to make a request to the service?" So it turned out that over time the actual access pattern, the number of authenticated requests that we received, stepped up a little bit more than we were expecting, and while we had a number of different monitoring systems in place to check the request levels and the final results, that one piece of authentication we didn't have a monitoring system in place for, and we were caught a little bit unaware when that system became overloaded. So we learned that there was a piece of monitoring that we didn't have that we needed and we put that into place and so that certainly will not happen again. We've also heard quite a bit from our customers that say we need to do a little bit more to be open and transparent if possible when things do go wrong. So we are putting some things into place as well to make sure that we can communicate better. One thing we are talking about is putting up a service dashboard that will show developers the actual realtime status of each service so that they have no question, they can go to that dashboard and there will just be a single point they can look at and just see how each service is actually doing in real time.
11. And what can you say about what we can expect from Amazon in the future?
Not a whole lot unfortunately. One thing I can say, this sounds cliché, but it's 100% true, is that we are very focused on actually listening to customers and taking what we hear from our customers and turning that into services as quickly as possible. That sometimes can take the form of entire brand-new services and it can often also take the form of additional features for particular services. So S3 in Europe is a great thing I can point to, and I can point to individual meetings I had with customers last year and say: "Because I heard from you and you and you and you", all those inputs put together with some market research turned into a new service, and I can point to other individual features and say: "These services happened because we heard from customers" and one of the disciplines we do within my team is, as we travel the world, we have our conference presentations and our developer meetings. We listen a lot and we take a lot of notes and before we end the day we send a trip report back to the company which then actually goes directly to the whole developer relations team and to the leaders of all the individual service teams and that gives them a direct ear into what's happening out in the field. So we do our best to listen to that feedback and get it back into the company as efficiently as possible.
It's actually somewhat of a common misconception that we launched this simply because we had servers sitting around and we wanted to find something for them to do. It's always been the case that we launched this specifically because we wanted to bring something of value to developers. It's been going really well for us. We don't give out individual numbers but in Amazon metrics we always said we wanted things to be up and to the right, where we look at a graph day by day of service usage and of revenue and we want to see that graph nicely ticking upward and we are really happy with where it's going, we love the fact that we've got all these developers out in the community building great things. Our community is about 330,000 developers now and we just love to see all the amazing different kinds of things that they build.
That is actually the first time I've heard that particular request, but if that turns out to be the best way to solve a customer problem and if it's a problem that we think enough customers have, I would have to say that is something we could do. I am not committing to anything, of course, but the customer input and the customer feedback is... I know this totally sounds cliché but we're a customer-driven company and the things that we hear from our customers that say: "This is what we need and this is how we need something in order to do our job better", those are the most valuable inputs we can possibly get as a company and if we get developers coming to us and saying, "I need a stored procedure model in order to use SimpleDB more effectively", we would look and say: "OK, how do we best meet that set of customer needs?" Now, to date I haven't heard people asking for that particular thing but it's still really early in the life of that service and if we do hear those requests we'll certainly make sure that those get directly to the service team.
I think there is already a company that has done that. There is a company called Enomaly that has tools to shift between the various different kinds of virtual environments and I am pretty sure they have a way to convert, and I can't tell you for sure which way they go, but I do understand that they have some conversion tools that go either from EC2 to VMware or the other way around, I don't know enough about their product.