Designing, implementing, and maintaining APIs for the Web is more than a challenge; for many companies, it is an imperative. This series takes the reader on a journey from determining the business case for APIs to a design methodology, meeting implementation challenges, and taking the long view on maintaining public APIs on the Web over time. Along the way there are interviews with influential individuals and even a suggested reading list on APIs and related topics.
This InfoQ article is part of the series “Web APIs From Start to Finish”. You can subscribe to receive notifications via RSS.
A lot of (virtual) ink has been spilled on the theory of using hypermedia in your APIs, but what about the practice? At the recent API Craft 2014 conference, many people shared stories of using hypermedia, but sheepishly admitted that they hadn't talked about it publicly. Many of us resolved to share more of our stories. I'd like to share with you four of my own.
In this article, we'll talk about four different real-world implementations of hypermedia: how you may already be using hypermedia through image links, how GitHub uses the Link header for pagination, using hypermedia in constrained systems like iOS, and how Balanced uses hypermedia principles to build their product. Each of these scenarios is different and showcases a different aspect of using hypermedia in your API designs.
Thumbnail Images
While much of the writing on hypermedia treats it as the fundamental organizing principle of your entire API, I have a little secret to share with you: it doesn't have to be that way. You can gain some of the advantages of hypermedia without doing an entire overhaul of your API. I'd like to share two cases that I see most often: thumbnail images and pagination.
You may be using some hypermedia without even realizing it. I see this most often in APIs which contain images, often thumbnails. Consider this (partial) API response, from Twitter:
GET https://api.twitter.com/1.1/users/show.json?screen_name=rsarver

{
  "name": "Ryan Sarver",
  "profile_image_url": "http://a0.twimg.com/profile_images/1777569006/image1327396628_normal.png",
  "created_at": "Mon Feb 26 18:05:55 +0000 2007",
  "location": "San Francisco, CA",
  "profile_image_url_https": "https://si0.twimg.com/profile_images/1777569006/image1327396628_normal.png",
  "utc_offset": -28800,
  "id": 795649,
  "lang": "en",
  "followers_count": 276334,
  "protected": false,
  "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/113854313/xa60e82408188860c483d73444d53e21.png",
  "verified": false,
  "time_zone": "Pacific Time (US & Canada)",
  "description": "Director, Platform at Twitter. Detroit and Boston export. Foodie and over-the-hill hockey player. @devon's lesser half",
  "profile_background_image_url": "http://a0.twimg.com/profile_background_images/113854313/xa60e82408188860c483d73444d53e21.png",
Nobody would suggest that Twitter has a hypermedia API, but this response does indeed contain links, and therefore, is employing hypermedia. One could include these images inline using a data URI, but instead, links are employed.
By including links, an API client is allowed to have a choice as to which information it would like to download. The links inform the client of which options are available and where to find them. In other words, this is just as 'real' of hypermedia as any other usage.
To make these benefits a bit more clear, let's consider a slightly different response:
GET https://api.example.com/profile

{
  "name": "Steve",
  "picture": {
    "large": "https://somecdn.com/pictures/1200x1200.png",
    "medium": "https://somecdn.com/pictures/100x100.png",
    "small": "https://somecdn.com/pictures/10x10.png"
  }
}
By introducing hypermedia here, we don't include all three versions of the profile image. We tell our clients that there are three possible images available, and we tell the client where it can find each image. Our client is now able to make a choice about what it wants to do, based on what it's trying to accomplish in the moment. It also does not have to download all three versions if it only wants one. We've made our payload smaller, we've increased client flexibility, and we've increased discoverability.
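As a rough sketch of how a client might act on such a response, the Ruby below picks a picture link based on the width it needs to display. The `picture_url_for` helper and its table of pixel widths are assumptions made for illustration: the response does not state any sizes, so the widths here are simply read off the example file names.

```ruby
require 'json'

# The example response from above, as the client would receive it.
body = <<~JSON
  {
    "name": "Steve",
    "picture": {
      "large": "https://somecdn.com/pictures/1200x1200.png",
      "medium": "https://somecdn.com/pictures/100x100.png",
      "small": "https://somecdn.com/pictures/10x10.png"
    }
  }
JSON

# Returns the link for the smallest variant that is at least as wide as
# needed, falling back to the largest variant the server offers.
def picture_url_for(profile, needed_width)
  # Assumed pixel widths, smallest first (hypothetical: taken from the
  # file names in the example, not from anything the API promises).
  widths = { "small" => 10, "medium" => 100, "large" => 1200 }

  links = profile.fetch("picture")
  available = widths.select { |name, _| links.key?(name) }
  name, _width = available.find { |_, width| width >= needed_width } || available.to_a.last
  links.fetch(name)
end

profile = JSON.parse(body)
puts picture_url_for(profile, 64)
```

Only the chosen image is ever downloaded; the other two links cost the client nothing beyond a few bytes of JSON.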
What I'm getting at here is that you may already be deploying a teeny bit of hypermedia; you just never thought about it before. And you didn't need to design your whole API around hypermedia to gain the benefit in this one case.
Pagination
Pagination is another area where a tiny bit of hypermedia can considerably simplify client code. Let's take GitHub as a real-world example of this. In their documentation, GitHub talks about one of the constraints of their API:
Different API calls respond with different defaults. For example, a call to list GitHub’s public repositories provides paginated items in sets of 30, whereas a call to the GitHub Search API provides items in sets of 100
It's easier to communicate these defaults when they are included inline in the response. Let's examine how GitHub actually implements this.
When you make a request to a paginated resource, such as their search resource:
GET "https://api.github.com/search/code?q=addClass+user:mozilla"
It will return a Link header:
Link: <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"
The Link header, defined in RFC 5988, gives us, well, links. The links consist of a URL and a link relation, which is where the rel comes from. Because we're on the first page of the results, GitHub shows us that we have a next and last option.
If we jump further into the results, for example to page 14, we get a different set of headers:
Link: <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=15>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=1>; rel="first", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=13>; rel="prev"
Now we can see that there are prev and first links, too.
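To make the shape of these headers concrete, here is a minimal Ruby sketch that turns a Link header value into a hash keyed by link relation. The `parse_link_header` helper is hypothetical and deliberately simplified: it assumes the `<url>; rel="name"` form shown above, with one relation per link, and ignores any other parameters the RFC allows.

```ruby
# Parses a Link header value into a hash of relation name => URL.
# A simplified sketch: it only understands `<url>; rel="name"` pairs.
def parse_link_header(value)
  value.to_s.scan(/<([^>]+)>\s*;\s*rel="([^"]+)"/).each_with_object({}) do |(url, rel), links|
    links[rel] = url
  end
end

# The header from the first page of results.
header = '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next", ' \
         '<https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"'

links = parse_link_header(header)
puts links["next"]
puts links.key?("prev")
```

On the first page the hash has no `prev` or `first` entry, so a client can decide which navigation options to offer just by checking which keys are present.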
So where's the advantage here? Well, client code is easier. Consider Ruby. If we wanted to get the next page of search results in the traditional manner, we'd do this:
require 'net/http'
require 'uri'

url = "https://api.github.com/search/code"
per_page = 15
current_page = 1
next_page = current_page + 1 # pages are one-based

# Pass the raw query; encode_www_form takes care of the escaping.
query = "addClass user:mozilla"

uri = URI(url)
uri.query = URI.encode_www_form([["q", query], ["page", next_page.to_s], ["per_page", per_page.to_s]])
puts Net::HTTP.get(uri)
With hypermedia, we instead do this:
# response is the Net::HTTPResponse from the previous request.
next_url = response["Link"][/<([^>]+)>;\s*rel="next"/, 1]
puts Net::HTTP.get(URI(next_url))
Much easier, and far less error-prone. Furthermore, if GitHub decides to change the default to ten per page, and disallows fifteen per page, the second bit of code will not need to change. The first one will, and until you've fixed the bug, your users will be stranded.
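Taken one step further, a client can walk an entire result set without ever knowing the page size or the page-numbering scheme. The following is a sketch under stated assumptions: `each_page` is a hypothetical helper, the `fetch` callable stands in for the HTTP layer, and the `api.example.com` URLs are made up so the example runs without a network.

```ruby
# Walks a paginated collection by following rel="next" links until the
# server stops offering one. `fetch` is any callable that takes a URL
# and returns [body, link_header_value].
def each_page(start_url, fetch)
  url = start_url
  while url
    body, link_header = fetch.call(url)
    yield body
    # Pull out the rel="next" target, or nil when there is none.
    url = link_header.to_s[/<([^>]+)>\s*;\s*rel="next"/, 1]
  end
end

# A stand-in for the network: three fake pages keyed by URL.
pages = {
  "https://api.example.com/items" =>
    ["page 1", '<https://api.example.com/items?page=2>; rel="next"'],
  "https://api.example.com/items?page=2" =>
    ["page 2", '<https://api.example.com/items>; rel="first", <https://api.example.com/items?page=3>; rel="next"'],
  "https://api.example.com/items?page=3" =>
    ["page 3", '<https://api.example.com/items>; rel="first"']
}

seen = []
each_page("https://api.example.com/items", ->(url) { pages.fetch(url) }) { |body| seen << body }
puts seen.inspect
```

Against a real server, the callable could wrap `Net::HTTP.get_response(URI(url))` and return the response body together with `response["Link"]`. Nothing in the loop depends on how many items the server puts on a page.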
iOS hypermedia
One growth area I see for hypermedia is iOS. Here's why: in order to make changes to an iOS app, you have to go through Apple's approval process. But if you use hypermedia, the server can change the behavior of the client. Long ago, some friends and I did this on a project. Names changed to protect the innocent. ;)
The application we were working on was a podcast application. As such, we served large audio and video files. At the time, Apple had a restriction on audio: you could not serve high-quality audio over the GSM connection. However, we devised a scheme to get around this: links.
When the app was under review, we would have our server serve up the podcast with low-quality audio links. The review would pass, and then after the fact, we would change the server to serve up high-quality audio. Our customers would then get a free upgrade to higher-than-technically-allowed files. Sneaky!
Once we had this idea, however, we applied it to other aspects of the application. For example, some podcasts would also broadcast live, and allow you to dial a phone number to call into the show. In the UI, we would make the app make a request to the server, asking if the show was currently live or not. If it was, we would display a button which would let you click to call in. The app would poll this endpoint as long as you were on the screen, and once the show was over, it would disable the button. This kind of server-driven interaction is hypermedia's bread and butter, but would be impossible if we had to redeploy a new client to change the state of the button.
In a third instance, the people who ran the podcast could enter information about when the podcast would air. Something like "Each Friday at noon" would appear on an information screen. If the app had not fetched this information from our server, then any time they wanted to change when the show was released, users would have needed an app update, with all of the 'waiting for review' that entails, in order to see correct information. Because the profile was server-driven, as soon as the show operator clicked 'save,' all the apps would essentially update with that new information.
What does this have to do with hypermedia? Well, two things. First of all, the mentality that the server dictates the possibilities and the client displays those possibilities is central to the hypermedia way of API design. All three of these instances are great examples of this principle in action.
Second, the way we implemented this was through links. Upon app startup, the app would fetch a configuration XML file from the server, which would provide a link to the RSS feed, a link to the 'find out if we're on the air' feed, and a link to where the profile information was. The app would then use these links as appropriate. To implement the RSS switch, we'd just change the link from 'low quality' to 'high quality', and now you're fetching an entirely different feed by default.
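The original app was an iOS client, but the idea is easy to sketch in Ruby. Everything in the configuration document below is invented for illustration: the element names, the `rel` values, and the `podcast.example.com` URLs are assumptions, not the format we actually shipped. The sketch uses REXML, which is bundled with standard Ruby installations.

```ruby
require 'rexml/document'

# A hypothetical configuration document, as fetched on app startup.
config_xml = <<~XML
  <config>
    <link rel="feed" href="https://podcast.example.com/feeds/low-quality.rss"/>
    <link rel="live-status" href="https://podcast.example.com/live"/>
    <link rel="profile" href="https://podcast.example.com/profile"/>
  </config>
XML

# Returns a hash of rel => href for every <link> in the document.
def links_from_config(xml)
  doc = REXML::Document.new(xml)
  links = {}
  doc.elements.each("config/link") do |link|
    links[link.attributes["rel"]] = link.attributes["href"]
  end
  links
end

links = links_from_config(config_xml)
puts links["feed"]
```

The client only ever asks for the link named `feed`. Switching every installed app to the high-quality feed means changing one `href` on the server, with no new build and no review.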
There's a lot of fruitful ground to be explored here. Clients that react to what a server says without updates are crucial in any area where you cannot often update the client. In situations like iOS or embedded devices, this constraint is obvious, but your users probably do not update your client as often as you'd like...
Case study: Balanced
Finally, I'd like to mention some information about hypermedia at Balanced, my previous employer. Balanced's API is hypermedia enabled, and follows the JSON API standard that I co-author. My previous examples were talking about adding a sprinkling of hypermedia to responses, but Balanced is fully hypermedia-driven. This has led to good effects, but also to some challenges.
On the good side, new features are able to be rolled out without breaking older clients. For example, the 1.1 release of the API was the first to completely follow the JSON API spec. After 1.1 was released, Balanced launched a Push to Card feature, which was entirely new. Because of hypermedia, they did not need to release this feature as API version 1.2: older clients simply ignored the new feature, and new clients were able to use it. This makes operations significantly easier, as having many different versions complicates both deployment and development. This trend will continue as Balanced continues to add more features to their API.
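The "older clients simply ignored the new feature" behavior can be sketched in a few lines of Ruby. This is not Balanced's actual payload: the `links` layout, the relation names (including `push_to_card`), and the URLs are all hypothetical, chosen only to show a client that acts on the links it understands and skips the rest.

```ruby
require 'json'

# Keeps only the links whose relation names this client understands.
def known_actions(resource, understood)
  resource.fetch("links", {}).select { |rel, _| understood.include?(rel) }
end

# Relations an older client was written against (hypothetical names).
old_client_rels = ["debits", "refunds"]

# A response from a newer server that has since gained another link.
response = JSON.parse(<<~JSON)
  {
    "href": "/customers/CU123",
    "links": {
      "debits": "/customers/CU123/debits",
      "refunds": "/customers/CU123/refunds",
      "push_to_card": "/customers/CU123/push_to_card"
    }
  }
JSON

puts known_actions(response, old_client_rels).keys.inspect
```

The older client keeps working unchanged, while a newer client that lists `push_to_card` among its understood relations picks up the feature from the same response.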
Hypermedia enthusiasts often talk about the positive sides, but it's not all positive. In the interest of balance (pun very intended), I'd like to mention one of the downsides. When a customer reports in with a support issue, the first question you need to ask them is "Who are you?" In many cases, that would be "What's your customer ID?" Since Balanced uses hypermedia, they don't have a customer ID: they have a customer URL. Occasionally, customers would be confused when we'd ask for their 'customer URL.' This kind of thing will change over time as more people understand hypermedia APIs, but because we're in the early days, anyone who creates an API that's fully hypermedia-driven needs to be willing to help educate their users on how to use it. In Balanced's case, this meant providing clients for many different languages up front, because many people don't know how to develop good hypermedia clients yet. While it's a good idea to give your customers pre-built clients anyway, in this case, Balanced had to, whereas with a more conventional API, they could have made a business decision to focus development efforts elsewhere.
Conclusion
As you can see, hypermedia can take many different forms, and doesn't have to be the sole organizing principle of your API. First, we talked about how you may be using hypermedia without realizing it, via image links. Then, we talked about GitHub, and their pagination example. Next, we went over how a server-driven client doesn't need to be updated as often, which helps in constrained environments like iOS. Finally, we talked about a company that has used hypermedia as a competitive advantage, but not without a drawback or two.
I hope that these real-world implementations of hypermedia help you realize that hypermedia doesn't have to be all or nothing, and that you may already be doing it in some cases. I'm excited to see hypermedia spring up in more and more APIs, and to hear people talk about their successes and failures with the technique.
About the Author
Steve Klabnik is a Rails committer, Rust contributor, author of "Rails 4 in Action," "Designing Hypermedia APIs," and "Rust for Rubyists".