The use of the REST architectural style is steadily gaining momentum for both public facing Web services and enterprise integration. However, one aspect of a service oriented architecture does not yet receive sufficient attention: Service Discovery.
In this article, I will describe how existing Web technology can be leveraged to enable Service Discovery for RESTful Web services.
Looking for a Service Discovery Solution
A client that wishes to interact with a service needs an initial URI to enter the application provided by the service. How can a client get hold of such an entry URI? There are three alternatives:
- Hard-coding or configuring the URI into the client.
- Creating a dedicated HTTP-based solution (a service registry Web application).
- Leveraging some existing service discovery mechanism.
At present, hard coding or configuring the start URI into the client seems to be the prevalent approach. However, the coupling introduced by this practice is a disadvantage because it limits the ability of the service provider to relocate the implementation or to apply certain forms of load balancing.
Providing an HTTP-based service registry by creating a dedicated Web application and associated standard media types is a viable option but should only be the last resort if no existing technology can be leveraged.
Using an existing, already deployed technology is obviously the best solution, and we need not look very far to identify a large-scale, ubiquitous technology that provides a suitable solution for service discovery: the Domain Name System (DNS). DNS is a distributed database primarily known for looking up the IP address that corresponds to a given host name, but it is also used for other retrieval tasks such as looking up the mail server responsible for a given domain (DNS MX records).
This article describes how to leverage standard DNS functionality for service discovery.
Service Discovery Requirements
What are the desired capabilities for a service discovery solution? The common scenario is a client that queries the discovery mechanism for a list of services that implement a certain functionality and then picks one of the returned services for interaction. Usually, the client has the intention to limit the query to some context, for example, a client might want to ask not for all shopping services, but for the shopping services at www.examplebooks.com.
We can break this apart into a list of requirements. A service discovery mechanism should enable the client to:
- Obtain a list of all services of a certain kind.
- Obtain the entry URI of a specific service instance.
- Obtain additional meta data about a service.
- Specify the discovery perimeter of interest ("Show me all search engines in the domain example.org").
We'll see shortly that the existing Domain Name System supports all of these requirements. However, we must take a detour first and introduce the notion of service types to RESTful systems.
Service Types in RESTful Systems
One of the requirements of service discovery is to enable the discovery of a service based on desired capabilities. A client would typically ask for a list of services of a specific kind and then select one of the available services either based on some additional meta data or at random.
Unfortunately, the notion of service types is not yet present in the field of REST Web services and in order to enable type-based service discovery we must first introduce a means to specify service types.
What is the appropriate place to define a service type and its name? Given that the only documentation to be used in RESTful systems is the specification of media types and link relations, the natural place to specify a service type is within a media type specification.
How would that work? Let's look at two practical examples concerning blogging services and search services.
Examples of Defining REST Web Service Types
Atom Publishing Protocol
The Atom Publishing Protocol specification defines a set of media types and link relations that can be used to implement a publishing interface. The original use case for the specification was weblog publishing, but the Atom Publishing Protocol can be used to create a service interface for any system that manages items and organizes them into item collections.
Defining a service type for Atom Publishing Protocol services would be as easy as specifying a name for that service type inside the Atom Publishing Protocol specification. Clients could then use that name to query service discovery mechanisms for services that implement the Atom Publishing Protocol. For the purpose of this article, let's define the service name _atom_http for Atom-enabled services. (Do not worry about the underscore or the format for now; it is a DNS convention which I'll explain later.)
OpenSearch
The OpenSearch specification states "The OpenSearch description document format can be used to describe a search engine so that it can be used by search client applications." We need not look far to see the service type implied here: search engines. Turning that into a service type name like the one for Atom Publishing Protocol services yields something like _search_http.
Adding this service type name to the OpenSearch specification would be sufficient to use this name for DNS-based service discovery.
The Domain example.org
I'll use the service types _atom_http and _search_http in a practical example system to illustrate the ideas discussed in this article. The diagram below shows the domain example.org. There are two hosts, mars.example.org and neptune.example.org, on which several Atom Publishing Protocol services and search services are running.
On the host mars.example.org there are two Atom services that provide access to a knowledge base and a news feed. A Search service provides retrieval access to the news archive. The machine neptune.example.org hosts a couple of blogs and an associated search service provides retrieval access to the entries in the hosted blogs.
According to the stated requirements, a service discovery mechanism should be able to answer the following questions:
- Which services are there that support the Atom Publishing Protocol?
- Which services are there that support search as specified by the OpenSearch specification?
- What is the location of any given service?
- What meta data is available for any given service?
In addition, it should be possible to limit the answer to the domain example.org.
A (Very) Brief Introduction to DNS
In the early days of the Internet, host names were mapped to IP addresses in a static file (hosts.txt) located on every host. When the number of hosts on the Internet grew, updating this file became a scalability problem. DNS was created to solve this problem by decentralizing the administration of the database. Decentralized administration is achieved by delegation, which means that the owner of a certain segment of the index path (a domain) can delegate the ownership of portions of that domain to other organizations, thereby also delegating administrative responsibility (and burden).
DNS is a path-indexed, distributed database that has the following major capabilities:
- Distribution - The DNS database is distributed across the many nameserver hosts of the Internet.
- Delegation - Owners of a certain segment of the index path can delegate ownership of portions of their domain.
- Typed Resource Records - DNS can associate several kinds of data records with a given domain. Resolvers specify in their queries which kind of data record they are interested in.
- Caching - In order to improve performance, DNS has built-in caching.
- Fault tolerance - DNS servers can be replicated to provide continuous service even if individual name servers are not operating.
The administrative owner of a domain operates nameservers that store information and answer DNS queries about the domain. Nameservers typically have complete information about some part(s) of the domain called zone(s).
On startup, a nameserver loads the information about one or more zones from a file (or from other nameservers) and is then able to answer queries about these zones.
When a DNS client (a resolver) queries DNS for information about a certain domain, it contacts its configured "go-to" nameserver, which in turn queries other nameservers until the requested resource record has been located and is sent back to the client.
Leveraging DNS for Service Discovery
DNS can associate various kinds of so-called Resource Records with a given domain. In addition to the more familiar ones like A records (for address lookup) or MX records (for mail server lookup), DNS defines the resource record types SRV (for service location), TXT (for arbitrary text data) and PTR (for expressing a reference to another domain). A combination of the latter three is used to provide service discovery capabilities with DNS.
SRV Resource Records
SRV Resource Records are used to express at which host and port within a zone a certain service is accessible. An SRV line in a zone configuration file looks like this:
_ldap._tcp.example.org. IN SRV 0 0 389 venus.example.org.
The configuration line above states that a service of the type _ldap is available at port 389 on the host venus.example.org.
Note: The two fields filled with zero are the priority and the weight of the record; they can be used for load distribution purposes. The symbolic names _tcp and _udp are a convention specified by the SRV Resource Record specification. They identify the intended protocol and are simply placed between service type and domain.
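To make the field layout concrete, here is a minimal Python sketch that splits an SRV line into its parts. The names SrvRecord and parse_srv_line are my own, chosen for illustration; they are not part of any DNS library, and the sketch only handles the simple one-line form shown above.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    name: str       # service type domain, e.g. _ldap._tcp.example.org.
    priority: int   # lower values are tried first
    weight: int     # relative share among records of equal priority
    port: int       # port on which the service listens
    target: str     # host that provides the service


def parse_srv_line(line: str) -> SrvRecord:
    """Parse a zone file line of the form '<name> [IN] SRV <prio> <weight> <port> <target>'."""
    fields = line.split()
    # The class field ("IN") may be omitted, so locate the record type first.
    type_index = fields.index("SRV")
    priority, weight, port, target = fields[type_index + 1:type_index + 5]
    return SrvRecord(fields[0], int(priority), int(weight), int(port), target)


record = parse_srv_line("_ldap._tcp.example.org. IN SRV 0 0 389 venus.example.org.")
print(record.target, record.port)
```

A real client would receive these fields from its resolver rather than parse zone file text; the sketch is only meant to show which value is which.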
Using resolver software such as nslookup, we can ask DNS for the LDAP services available at example.org:
$ nslookup -q=srv _ldap._tcp.example.org.
Server: xxxx
Address: ip
_ldap._tcp.example.org service = 0 0 389 venus.example.org.
This call to nslookup issues a DNS query for all available SRV records for _ldap._tcp.example.org. (The command line option -q=srv tells the resolver to ask specifically for SRV records.)
If there were more LDAP services available, the zone configuration would look like this:
_ldap._tcp.example.org. IN SRV 0 0 389 venus.example.org.
_ldap._tcp.example.org. IN SRV 0 0 389 mercury.example.org.
_ldap._tcp.example.org. IN SRV 0 0 389 earth.example.org.
For this configuration, the above query would yield:
$ nslookup -q=srv _ldap._tcp.example.org.
Server: xxxx
Address: ip
_ldap._tcp.example.org service = 0 0 389 venus.example.org.
_ldap._tcp.example.org service = 0 0 389 mercury.example.org.
_ldap._tcp.example.org service = 0 0 389 earth.example.org.
This reply states that there are three LDAP servers in the example.org domain and gives the location (host and port) of each.
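When several SRV records come back, the client has to choose one. The following Python sketch shows one simplified reading of the selection rules in the SRV specification (RFC 2782): prefer the lowest priority value, then choose among the remaining records at random in proportion to their weights. The function name pick_target and the tuple layout are assumptions made for this example.

```python
import random

# Each record is (priority, weight, port, target), mirroring the SRV fields.
LDAP_RECORDS = [
    (0, 0, 389, "venus.example.org."),
    (0, 0, 389, "mercury.example.org."),
    (0, 0, 389, "earth.example.org."),
]


def pick_target(records):
    """Pick one record: lowest priority first, then weighted at random."""
    lowest = min(priority for priority, _, _, _ in records)
    candidates = [r for r in records if r[0] == lowest]
    weights = [r[1] for r in candidates]
    if sum(weights) == 0:
        # All weights are zero, as in the example zone: treat the servers as equal.
        return random.choice(candidates)
    return random.choices(candidates, weights=weights, k=1)[0]


priority, weight, port, target = pick_target(LDAP_RECORDS)
print(f"connect to {target} on port {port}")
```

With the example zone, where every priority and weight is zero, each of the three hosts is equally likely to be chosen.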
DNS Service Discovery (DNS-SD)
SRV records have a significant limitation regarding the stated service lookup requirements: they cannot be used to configure named instances of a service type and they only support a single service for any given host and port combination. It is, for example, not possible to configure SRV records for two search services that run in the same Web application and therefore share a common host and port. In addition, SRV records do not support the configuration of meta data for a specific service instance.
The DNS Service Discovery specification has been developed to overcome these limitations. DNS-SD combines SRV, PTR, and TXT resource records to meet all the requirements for service lookup. DNS-SD uses the three record types in the following way:
- PTR - used to map service types to named service instances.
- SRV - used to provide location and port for service instances.
- TXT - used to provide additional meta data about service instances.
Configuring the List of Service Instances - PTR Records
DNS-SD combines service type domain names with PTR records to map service type names to service instance names. This enables the retrieval of the list of all instances of a given service type.
The following zone configuration lines illustrate this idea:
; Note that all names are relative to example.org; for example,
; _atom_http._tcp is really _atom_http._tcp.example.org.
_atom_http._tcp PTR KnowBase._atom_http._tcp
_atom_http._tcp PTR News._atom_http._tcp
_atom_http._tcp PTR JimBlog._atom_http._tcp
_atom_http._tcp PTR MaryBlog._atom_http._tcp
_atom_http._tcp PTR SallyBlog._atom_http._tcp
_search_http._tcp PTR NewsSearch._search_http._tcp
_search_http._tcp PTR BlogSearch._search_http._tcp
The left-hand side of each PTR line gives a service type domain name, and the right-hand side gives a corresponding instance of that type.
DNS Service Discovery specifies the following service instance naming convention:
<Instance>.<ServiceType>.<Protocol>.<Domain>
For example, the full instance name for Jim's Blog service would be
JimBlog._atom_http._tcp.example.org
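The naming convention is easy to apply mechanically. The Python sketch below builds and splits instance names; the helper names are hypothetical, and the sketch assumes that the instance label itself contains no dots, which keeps the splitting trivial.

```python
def build_instance_name(instance, service_type, protocol, domain):
    """Join the four parts into <Instance>.<ServiceType>.<Protocol>.<Domain>."""
    return ".".join([instance, service_type, protocol, domain])


def split_instance_name(full_name):
    """Split a full instance name into (instance, service type, protocol, domain)."""
    # Drop a trailing root dot, then split off the first three labels only,
    # so that the remaining domain (e.g. example.org) stays in one piece.
    instance, service_type, protocol, domain = full_name.rstrip(".").split(".", 3)
    return instance, service_type, protocol, domain


name = build_instance_name("JimBlog", "_atom_http", "_tcp", "example.org")
print(name)
print(split_instance_name(name))
```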
A query for all services of type _atom_http._tcp at example.org would now be expressed as (note the use of ptr instead of srv in the -q option):
$ nslookup -q=ptr _atom_http._tcp.example.org
Server: xxxx
Address: ip
_atom_http._tcp.example.org name = MaryBlog._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = SallyBlog._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = KnowBase._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = News._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = JimBlog._atom_http._tcp.example.org.
DNS-SD interprets the PTR query as a query for a list of instances of the service type specified by the lookup domain (here: _atom_http._tcp.example.org). The right-hand side of the result therefore provides the retrieved service instance names.
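To illustrate the mapping that the PTR records express, here is a small Python sketch that reads the PTR lines of the example zone and builds an index from service type to instance names. It works on zone file text rather than on live DNS replies, and the function name index_instances is my own invention for this example.

```python
from collections import defaultdict

ORIGIN = "example.org"

# PTR lines from the example zone; names are relative to ORIGIN.
PTR_LINES = """\
_atom_http._tcp    PTR  KnowBase._atom_http._tcp
_atom_http._tcp    PTR  News._atom_http._tcp
_atom_http._tcp    PTR  JimBlog._atom_http._tcp
_atom_http._tcp    PTR  MaryBlog._atom_http._tcp
_atom_http._tcp    PTR  SallyBlog._atom_http._tcp
_search_http._tcp  PTR  NewsSearch._search_http._tcp
_search_http._tcp  PTR  BlogSearch._search_http._tcp
"""


def index_instances(zone_text, origin):
    """Map each fully qualified service type to its list of instance names."""
    index = defaultdict(list)
    for line in zone_text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[1] != "PTR":
            continue  # skip blank lines, comments, and other record types
        service_type, _, instance = fields
        index[f"{service_type}.{origin}"].append(f"{instance}.{origin}")
    return dict(index)


instances = index_instances(PTR_LINES, ORIGIN)
for name in instances["_atom_http._tcp.example.org"]:
    print(name)
```

Looking up a key in this index corresponds to the PTR query shown above: one service type in, a list of instance names out.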
Using the service instance names the system setup diagram of the domain example.org now looks like this:
Configuring Service Instance Data - SRV and TXT Records
The DNS-SD specific use of PTR records enables the DNS client to obtain a list of service instances. Service instances in turn are described using SRV and TXT records. The SRV records provide information about the host and port of a given service instance and the TXT records provide additional meta data pertaining to that instance.
DNS-SD specifies a generic key-value format for TXT records (key1=val1,key2=val2,...) and defers the specification of the keys to be used with a given service type to the type's specification.
In the case of HTTP-based services such as the examples _atom_http and _search_http at least a path key is necessary to construct the entry URI of a service instance. Here is the example for Jim's Blog service:
JimBlog._atom_http._tcp SRV 0 0 80 neptune.example.org.
                        TXT path=/blogs/jim
This configuration expresses that the service instance JimBlog._atom_http._tcp.example.org can be accessed on neptune.example.org at port 80. The TXT record specifies a path parameter which the client must use for constructing the service's entry http URI. Both the path parameter and the fact that the URI must be an http URI must be part of the specification of the service type. Hence, Atom services using https would need a different service type, for example _atom_https.
Instance information about Jim's blog service can be retrieved like this:
$ nslookup -q=any JimBlog._atom_http._tcp.example.org.
Server: xxxx
Address: ip
JimBlog._atom_http._tcp.example.org service = 0 0 80 neptune.example.org.
JimBlog._atom_http._tcp.example.org text = "path=/blogs/jim"
From this reply, the client can construct the entry URI of the service according to the rules defined by the service type specification. For the given example, the resulting service URI would be:
http://neptune.example.org:80/blogs/jim
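The construction rule can be sketched in a few lines of Python. The function entry_uri is hypothetical, and the sketch assumes that the TXT data reaches the client as a list of key=value strings and that a missing path key means the root path; an actual service type specification would have to define both points.

```python
def entry_uri(target, port, txt_strings, scheme="http"):
    """Build the entry URI of a service instance from its SRV and TXT data."""
    # Turn ["path=/blogs/jim", ...] into {"path": "/blogs/jim", ...}.
    attributes = dict(item.split("=", 1) for item in txt_strings if "=" in item)
    path = attributes.get("path", "/")  # assumption: default to the root path
    host = target.rstrip(".")           # drop the trailing root dot of the DNS name
    return f"{scheme}://{host}:{port}{path}"


print(entry_uri("neptune.example.org.", 80, ["path=/blogs/jim"]))
```

The scheme is a parameter here only to show where a service type such as the hypothetical _atom_https would differ.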
The following diagram shows the service hosts mars.example.org and neptune.example.org and the nameserver host nameserver.example.org. The large text artifact shows the complete zone configuration file for example.org.
At the top there is a SOA record which indicates that this nameserver is authoritative for the given zone (SOA = "start of authority") and below that we see a couple of lines that define the nameserver (NS record) for the zone and the IP-addresses (A-record) for the hosts used in the example.
The rest of the file contains the configuration for the complete example system as discussed in detail for Jim's blog service. The configuration entries for Jim's blog service are highlighted to show how they refer to one another and eventually to the service instance on neptune.example.org.
Complete Interaction Example
I have discussed how DNS-SD uses PTR, SRV, and TXT resource records to enable DNS-based service discovery. To summarize, the following examples show what a complete lookup process looks like.
1. Retrieve list of Atom Publishing Protocol services at example.org.
$ nslookup -q=ptr _atom_http._tcp.example.org
Server: xxxx
Address: ip
_atom_http._tcp.example.org name = MaryBlog._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = SallyBlog._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = KnowBase._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = News._atom_http._tcp.example.org.
_atom_http._tcp.example.org name = JimBlog._atom_http._tcp.example.org.
2. Retrieve location and meta data of desired instance.
$ nslookup -q=any MaryBlog._atom_http._tcp.example.org.
Server: xxxx
Address: ip
MaryBlog._atom_http._tcp.example.org service = 0 0 80 neptune.example.org.
MaryBlog._atom_http._tcp.example.org text = "path=/blogs/mary"
3. Construct the service entry URI based on the definition of the service type _atom_http.
http://neptune.example.org:80/blogs/mary
4. Access the service via HTTP to obtain initial application state.
$ curl http://neptune.example.org:80/blogs/mary
200 Ok
Content-Type: application/atomsvc+xml
<service> ... </service>
In the example shown above, the client picks any of the service instances returned by the first query. This is only appropriate if the client does not need to differentiate between the instances, for example if the services provide a lookup or transformation functionality. However, in most cases the instances will be significant to the client based on the data they operate on. In such scenarios I expect the service instance configuration to specify the desired service instance name and let the client late bind to the service based on the instance name.
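The first three lookup steps, including the choice between picking any instance and binding to a named one, can be sketched in Python as follows. To keep the sketch self-contained, the RECORDS table is a hypothetical in-memory stand-in for the example.org zone and query() stands in for a resolver call; a real client would send PTR, SRV, and TXT queries to DNS and then fetch the resulting URI over HTTP.

```python
import random

# Hypothetical stand-in for the example.org zone: (name, record type) -> values.
RECORDS = {
    ("_atom_http._tcp.example.org", "PTR"): [
        "MaryBlog._atom_http._tcp.example.org",
        "JimBlog._atom_http._tcp.example.org",
    ],
    ("MaryBlog._atom_http._tcp.example.org", "SRV"): [(0, 0, 80, "neptune.example.org")],
    ("MaryBlog._atom_http._tcp.example.org", "TXT"): ["path=/blogs/mary"],
    ("JimBlog._atom_http._tcp.example.org", "SRV"): [(0, 0, 80, "neptune.example.org")],
    ("JimBlog._atom_http._tcp.example.org", "TXT"): ["path=/blogs/jim"],
}


def query(name, record_type):
    """Stand-in for a resolver call; a real client would query DNS here."""
    return RECORDS.get((name.rstrip("."), record_type), [])


def discover(service_type, domain, instance=None):
    """Return the entry URI of one instance of the given service type."""
    type_name = f"{service_type}._tcp.{domain}"
    # Step 1: list all instances of the service type (PTR query).
    instances = query(type_name, "PTR")
    if not instances:
        raise LookupError(f"no instances of {type_name}")
    if instance is None:
        chosen = random.choice(instances)      # any instance will do
    else:
        chosen = f"{instance}.{type_name}"     # late binding by instance name
        if chosen not in instances:
            raise LookupError(f"unknown instance {chosen}")
    # Step 2: fetch location (SRV) and meta data (TXT) of the chosen instance.
    priority, weight, port, target = query(chosen, "SRV")[0]
    attributes = dict(item.split("=", 1) for item in query(chosen, "TXT"))
    # Step 3: construct the entry URI as defined for _atom_http services.
    return f"http://{target}:{port}{attributes['path']}"


print(discover("_atom_http", "example.org", instance="MaryBlog"))
```

Passing an instance name corresponds to the late-binding scenario described above, where only the name is configured in the client and the location is resolved at run time.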
Overall Advantages of Using DNS
Applying an already existing technology, especially one that has been ubiquitous for over a decade, has numerous advantages, such as:
- A wide range of well tested implementations is available, many open sourced.
- Knowledgeable developers and administrators are widely available.
- Existing ubiquitous use protects investment in the technology.
DNS itself has at least the following specific advantages:
- Reliability through replication
- Built-in caching
- Delegation of administrative responsibility and burden
- Easy to configure
- Well supported (DNS-SD is the basis of Apple's Bonjour Protocol)
Summary
Service Discovery is an essential aspect of a service-oriented architecture because it avoids early binding of clients to particular service instances. Removing such coupling provides greater flexibility for reconfiguration of the overall system.
Service Discovery can be easily introduced to systems of RESTful Web services by leveraging standard DNS mechanisms as specified by DNS-SD. DNS based service discovery is readily available to anyone in any system environment given the ubiquitous availability of DNS nameserver and resolver implementations.
By applying DNS to enable Service Discovery you gain the performance and reliability of an Internet technology that has been deployed successfully at global scale for over a decade.