According to William Vambenepe
Events/alerts/notifications have been a central concept in IT management at least since the first SNMP trap was emitted, and probably even long before that. And yet they are curiously absent from all the Cloud management APIs/protocols.
And yet the bulk of cloud management APIs today are based on polling. According to George Reese such polling approach:
... results in an incredibly inefficient use of CPU power... [at] the cloud provider as well as wasted bandwidth on both ends. We certainly do a number of optimizations to make sure we are polling as infrequently as we can get away with... [but] the bottom line remains, however, that most of our calls are wasteful.
To solve this problem, Reese proposed an event-driven API through which "the cloud provider notifies us of changes in resources we care about" He also noted some of the challenges his proposal is going to face:
- An event-driven API demands a level of standardization that just doesn't exist in the cloud world today. You can't have every consumer designing its own callback API, and even supporting a cross-cloud system with every cloud provider defining its own callback protocol is problematic.
- You can't provide data via the callbacks because providing data requires reverse authentication and complicates the entire process.
- In the end, the consumer can't fully trust the callback API. It still needs to make some calls on its own to verify the cloud provider is actually working properly.
Reese proposed some simple solutions to deal with those challenges:
- The introduction of a standardized callback format (should be very simple, something like [consumer-base]/[asset class]/[id])
- Integration of event callbacks into the existing cloud APIs.
Based on his experience with the WS-Notification family of specifications, Vambenepe strongly supports the idea of simple APIs. In his opinion, a cloud-centered eventing protocol can be made simpler by focusing on fewer use cases (cloud scenarios only). The following elements, according to him, can be used as a foundation for a cloud eventing implementation:
- Types of event. The client should be able to specify what kind of resource information he is interested in.
How do you describe the changes you care about? Is there an agreed-upon set of states for the resource and you are only notified on state transitions? Can you indicate the minimum severity level for an event to be emitted?...
In Vambenepe’s opinion, WS-Topics represents the best approach to solve these problems. - Event formats. Cloud events should adhere to a standard event model defining:.
How is the event metadata captured (e.g. time stamp of observation, which may not be the same as the time at which the notification message was sent)? If the event payload is a representation of the new state of the resource, does it indicate what field changes (and what the old value was)?
- Subscription creation. A standard subscription mechanism has to be created, considering the following questions:
Do you get to change what filter the subscription carries? Can you change the delivery endpoint?... Who sets the expiration period? Can the provider set a max duration? Can you renew a subscription...? What if your subscription is being killed because your deliver endpoint is down?... Do you provide a separate... subscription management... endpoint (different from the event delivery endpoint) when you subscribe?
- Delivery mechanism. There are many options for event delivery ranging from permanently opened HTTP connections (similar to COMET long polling) to HTTP callback URL to XMPP, to AMQP, to email.
- Security. Depending on the delivery mechanism, different security implementations might be required.
- Throttling. An eventing implementation should ensure that neither resource, nor consumers are overwhelmed by the amount of the emitted/received event.
As InfoQ has previously reported, asynchronous APIs are very important for cloud computing. Introduction of a full-fledged eventing protocol can solve this problem and more. Hopefully, authors of this specification will learn from WS-Notification and create the one that will be simpler without sacrificing coverage and reach.