
Opinion: SOA doesn’t need a Common Information Model

Loose coupling is not just about using a common syntax and protocols; it is also about creating and managing a set of shared semantics. This week, Dave Linthicum provided a set of recommendations on service construction, centering on the idea of an abstraction layer versus a common schema:

1.    You need to face the data first and define a common data or abstraction layer so that the services are not bound to a particular schema, but enjoy the use of the data nonetheless. I would not push a common schema as much as an abstraction layer.
2.    The abstracted or common model should be tested like any other component.
3.    Don't focus as much on force fitting a data model as agreement across the service domains, and leverage a schema mapping layer to provide choices in the future and agility down at the data layer.
David’s experience shows that relying on a common set of schemas may prove inflexible when designing service interfaces because “it will prevent these services to evolve separately”.

SOA is about creating assets which can be reused in different contexts, often in contexts unknown at design time, and maximizing the benefits of SOA amounts to maximizing the reuse of your services by the largest possible set of consumers. However, it would be naïve to think that consumers will always be in a position to adopt the point of view of the provider, or that the provider and the consumer can always adopt the same point of view. Even if this were true today, over time the consumer and provider may not be able to evolve towards a newer version of the interface at the same pace.

Even though mediation is not explicit in the W3C’s Web Services Architecture, SOA practitioners have long used it systematically to achieve a higher level of loose coupling and to enable consumers and providers to evolve separately. Whichever mediation mechanism you use (publish/subscribe, orchestration, polymorphic interfaces…), it will always involve transformations from the consumer schema to the provider schema and back. These transformations may be performed by a coordinator, or locally in the consumer or provider service container.

Since these transformations are inevitable, the question becomes: how can you make them as painless as possible, both from a design and a runtime perspective? Incidentally, if you were to use a common information model independent of both the provider and consumer interfaces and still wanted to achieve loose coupling, you would incur the cost of two transformations, not to mention that you would still need to transform your message format into a data set consumable by the provider and consumer implementations.
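As a rough illustration (the schemas and field names here are made up), routing a message through a provider- and consumer-independent common information model costs two transformations where a direct mapping costs one:

```python
# Hypothetical consumer-side representation of an order.
consumer_msg = {"orderId": "42", "qty": 3}

def consumer_to_provider(msg):
    # Direct route: a single transformation between the two interfaces.
    return {"purchase_order_number": msg["orderId"], "quantity": msg["qty"]}

def consumer_to_common(msg):
    # First hop: consumer schema -> common information model.
    return {"order.identifier": msg["orderId"], "order.line.quantity": msg["qty"]}

def common_to_provider(common):
    # Second hop: common information model -> provider schema.
    return {"purchase_order_number": common["order.identifier"],
            "quantity": common["order.line.quantity"]}

# Both routes yield the same provider message, but the common-model
# route performs two transformations instead of one.
direct = consumer_to_provider(consumer_msg)
via_cim = common_to_provider(consumer_to_common(consumer_msg))
assert direct == via_cim
```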

The first step towards more manageable transformations is to capture the semantics of the information contained in your messages and to derive consumer and provider interfaces from these semantics. This is what Dave calls an “abstraction layer” and others call a canonical data model or an ontology. In this abstraction layer, the structure is less important than the normalization of the semantics. This problem is not new: back in 1998, David Webber introduced the concept of bizcodes to normalize diverging names in XML formats and deal elegantly with localization. More recently, UN/CEFACT has developed a set of standards to help with the management of semantics and data formats: the Core Component Technical Specification. One of its concepts is the notion of “context”, whereby you can manage the common parts of a schema across 8 dimensions (for instance, it helps manage the commonality between a purchase order in the automotive industry in Germany and a purchase order in the semiconductor industry in the USA).
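One minimal way to picture such an abstraction layer (the identifiers and field names below are invented for illustration, loosely in the spirit of bizcodes and CCTS core components) is a dictionary of stable semantic identifiers, with each interface declaring which identifier every one of its fields carries:

```python
# A tiny, hypothetical "abstraction layer": each business concept gets a
# stable semantic identifier, independent of any one schema's field names.
ONTOLOGY = {
    "bie:OrderIdentifier": "Unique identifier of a purchase order",
    "bie:OrderQuantity": "Number of units ordered",
}

# Each interface binds its own field names to the shared semantics.
# The structures differ; the semantics are normalized.
CONSUMER_BINDING = {"orderId": "bie:OrderIdentifier", "qty": "bie:OrderQuantity"}
PROVIDER_BINDING = {"purchase_order_number": "bie:OrderIdentifier",
                    "quantity": "bie:OrderQuantity"}

# A simple governance check: every bound field must trace back to the ontology.
for binding in (CONSUMER_BINDING, PROVIDER_BINDING):
    assert set(binding.values()) <= set(ONTOLOGY)
```

The point of the sketch is that the two interfaces keep their own structures and names; only the meaning of each field is normalized against the shared dictionary.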

Semantics have to be managed precisely, under strict governance processes, and tested, as Dave points out. Traceability to physical artifacts such as service interface definitions or database schemas is key to developing a successful ontology.

Here are some details about using an ontology in a Service Oriented Architecture:
  a)    Service interfaces need to be designed from the ontology (with their own structure, but using the semantics of the ontology, and only those)
  b)    For arbitrary consumers and providers, the design and implementation of transformations is greatly facilitated by the traceability established earlier between the ontology and the service interfaces; and if both were designed from the same ontology, some of the transformations can be automated or directly inferred.
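The inference mentioned in (b) can be sketched as follows (the bindings and field names are hypothetical): when both interfaces are annotated with identifiers from the same ontology, a field-to-field mapping falls out of a simple join on those identifiers.

```python
# Hypothetical bindings for two independently designed interfaces
# that were both derived from the same ontology.
CONSUMER_BINDING = {"orderId": "bie:OrderIdentifier", "qty": "bie:OrderQuantity"}
PROVIDER_BINDING = {"purchase_order_number": "bie:OrderIdentifier",
                    "quantity": "bie:OrderQuantity"}

def infer_mapping(source, target):
    """Join the two bindings on their shared semantic identifiers."""
    by_semantics = {sem: field for field, sem in target.items()}
    return {src: by_semantics[sem] for src, sem in source.items()
            if sem in by_semantics}

def transform(msg, mapping):
    """Rename fields according to the inferred mapping."""
    return {mapping[k]: v for k, v in msg.items() if k in mapping}

mapping = infer_mapping(CONSUMER_BINDING, PROVIDER_BINDING)
provider_msg = transform({"orderId": "42", "qty": 3}, mapping)
# provider_msg == {"purchase_order_number": "42", "quantity": 3}
```

Real transformations also involve structural and type conversions, but the traceability to a shared ontology is what makes even those mappings tractable to derive.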

If you want to start playing with these concepts, Protégé is an ontology editor developed at Stanford University.
