Fred Cummins, an EDS fellow and SOA veteran, wrote an essay last week on "Data Management for SOA". He looks at how two key tenets of service design, "loose coupling" and "autonomy", relate to enterprise data in the context of achieving reuse and enabling change.
Even though Fred acknowledges that these tenets are essential to deliver the value of SOA:
The value of SOA comes from the ability to integrate [services] in multiple business contexts, and the ability to optimize and adapt them with minimal impact on their users.
he also notes:
However, this decoupling and autonomy conflicts with the use of shared databases.
Those who focus on data management have, for decades, driven the industry toward consolidation of databases under a philosophy that tighter coupling means greater efficiency and consistency.
The data gurus are struggling with reconciling [Enterprise] Data Management with the loose coupling of SOA.
He points to Jill Dyché, who recommends:
start with the [master] data. This sounds counterintuitive, since SOA is about offering standardized business processes as services, but the concept of data as a service is actually more viable for companies just beginning to think about SOA.
and Dana Gardner, who observes in "SOA and compute clouds point to rethinking data entirely: roles and permissions, not rows and tables" that:
much of an enterprise's data is no longer controlled by the IT organization
But Fred dismisses these two considerations, which he finds only marginally relevant to SOA. He proposes that:
The data that must remain the primary focus of attention for SOA are the data produced, consumed and managed by business systems that represent the past, present or future state of the enterprise. From a business perspective, the concerns are not a matter of distributed storage but how the data are validated, managed and protected.
Fred points out that he is in agreement with Steve Karlovitz:
As a single entry point to all enterprise data stores, the implementation of a data service layer has many benefits.
- Data access can now be performed in a centralized manner.
- The various business rules will be referenced for how the data transformation will occur.
- With a single entry point, issues such as optimization and transformation can be addressed.
- [It] ensur[es] data integrity and security
- an organization will dramatically reduce time to market on new features
But how can this data service layer be designed? Fred sees three possibilities:
- the data service layer is a data access facility that supports database access by all applications using a canonical view of a shared database, similar to an object-relational transformation facility,
- data from heterogeneous application databases is replicated and integrated in an enterprise database with a canonical data schema,
- access to heterogeneous databases is provided through requests expressed as queries on a canonical, virtual database.
The first is essentially a shared database. Fred points out that it is impractical because:
many services will continue to use legacy or COTS systems that incorporate their own databases [and] heterogeneity of service unit implementation technologies is fundamental to SOA agility
The second approach leverages the concept of an "operational data store" that is now common in Enterprise Data Management and Business Intelligence groups. There he still sees some issues:
there will be delays in the updates from various sources, so achieving a fully consistent view may still be difficult. This replicated data should be used only for queries; it would be very difficult to manage updates. The master data, "the single version of the truth," is still in the source databases and must be controlled by their owners.
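A minimal sketch may make the staleness issue concrete. The class names (`SourceDatabase`, `OperationalDataStore`) and the sample customer record are hypothetical and do not come from Fred's essay; the sketch only assumes that the replica is refreshed periodically and is never written to directly.

```python
from dataclasses import dataclass, field


@dataclass
class SourceDatabase:
    """Hypothetical source system that owns its master data."""
    name: str
    records: dict = field(default_factory=dict)

    def update(self, key, value):
        # Updates are only ever applied at the owning source.
        self.records[key] = value


class OperationalDataStore:
    """Query-only replica that integrates data copied from several sources."""

    def __init__(self, sources):
        self._sources = sources
        self._replica = {}

    def refresh(self):
        # Periodic copy; until the next refresh the replica may be stale.
        for source in self._sources:
            for key, value in source.records.items():
                self._replica[(source.name, key)] = value

    def query(self, source_name, key):
        return self._replica.get((source_name, key))

    def update(self, *args, **kwargs):
        # The replica never accepts writes; the source owner stays in control.
        raise PermissionError("replica is query-only; update the owning source")


crm = SourceDatabase("crm", {"cust-1": "Acme Corp"})
ods = OperationalDataStore([crm])
ods.refresh()

crm.update("cust-1", "Acme Corporation")  # master data changes at the source
stale = ods.query("crm", "cust-1")        # replica still holds the old value
ods.refresh()
fresh = ods.query("crm", "cust-1")        # replica catches up after the refresh
```

Between the source update and the next refresh, a query against the replica returns the old value, which is the consistency gap the quote describes.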
The third approach is well aligned with Enterprise Information Integration capabilities. Fred notes that:
EII did not gain much market acceptance when it was introduced several years ago, but with SOA, its time has come.
He advises, however, that:
While some EII tools support updates to the heterogeneous databases, updates should still be controlled by the service units that own those databases.
This comment can be correlated with a recommendation to build service interfaces along Object Lifecycles, which was also echoed here.
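The third approach, combined with the advice on updates, could be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular EII product works: `ServiceUnit`, `VirtualDatabase`, the field names and the mapping functions are all hypothetical. Queries are expressed against a canonical view and fanned out to the owners, while updates are delegated to the owning service unit rather than written directly.

```python
class ServiceUnit:
    """Hypothetical service unit that owns one database and its local schema."""

    def __init__(self, name, rows, to_canonical):
        self.name = name
        self._rows = rows                  # rows in the unit's own local format
        self._to_canonical = to_canonical  # maps a local row to the canonical view

    def canonical_rows(self):
        return [self._to_canonical(row) for row in self._rows]

    def apply_update(self, row):
        # The owner is the only place where its data is changed.
        self._rows.append(row)


class VirtualDatabase:
    """Answers canonical queries by fanning out to the owning service units."""

    def __init__(self, units):
        self._units = {unit.name: unit for unit in units}

    def query(self, predicate):
        return [
            row
            for unit in self._units.values()
            for row in unit.canonical_rows()
            if predicate(row)
        ]

    def request_update(self, owner, row):
        # No direct write: the request is delegated to the owning unit.
        self._units[owner].apply_update(row)


billing = ServiceUnit(
    "billing",
    [{"CUST_NO": 7, "NM": "Acme"}],
    lambda r: {"customer_id": r["CUST_NO"], "name": r["NM"], "source": "billing"},
)
support = ServiceUnit(
    "support",
    [{"id": 7, "open_tickets": 2}],
    lambda r: {"customer_id": r["id"], "open_tickets": r["open_tickets"], "source": "support"},
)

vdb = VirtualDatabase([billing, support])
customer_7 = vdb.query(lambda row: row["customer_id"] == 7)
```

Here no data is copied: each query reads the sources at request time, so the heterogeneous databases stay where they are and each remains under the control of its owner.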
Fred concludes:
Data management for SOA should be approached as requiring an enterprise logical data model, mechanisms for federation and sharing of data among relatively autonomous service units, and a data management plan that defines responsibilities, flows, master data stores, latency of updates, synchronization strategies and accountability for data integrity and protection.
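One way to picture such a data management plan is as a small, checkable data structure. The sketch below is only an assumption about what a plan entry might record; the `DataEntityPlan` fields, the sample entities and the `find_conflicting_masters` check are hypothetical and are not taken from the essay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataEntityPlan:
    """One hypothetical plan entry for a business entity."""
    entity: str               # name in the enterprise logical data model
    master_store: str         # where the "single version of the truth" lives
    owner: str                # service unit accountable for integrity
    consumers: tuple          # service units that receive the data
    max_latency_seconds: int  # acceptable delay before consumers see updates
    sync_strategy: str        # e.g. "event", "batch", "on-demand"


def find_conflicting_masters(plans):
    """Return entities that are assigned more than one master store."""
    masters = {}
    for plan in plans:
        masters.setdefault(plan.entity, set()).add(plan.master_store)
    return sorted(entity for entity, stores in masters.items() if len(stores) > 1)


plans = [
    DataEntityPlan("Customer", "crm_db", "sales", ("billing", "support"), 60, "event"),
    DataEntityPlan("Invoice", "billing_db", "billing", ("sales",), 3600, "batch"),
    DataEntityPlan("Customer", "billing_db", "billing", ("sales",), 60, "event"),
]
conflicts = find_conflicting_masters(plans)
```

In this sample, `Customer` is claimed by two master stores, which is the kind of accountability gap a plan of this sort is meant to surface.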
It has often been said that BPM and SOA are the "two faces of the same coin". However, as SOA implementations mature, people are realizing that data, and their relational nature, need to be considered as well, perhaps even more so than business processes. Creating services that "validate, manage and protect" data is not easy: it requires technologies and a precise methodology that are relatively new to most IT organizations. In particular, it brings Enterprise Data Management to the core of your Service Oriented Architecture. Ultimately, the Object Lifecycle seems to be emerging as a unifying concept for data, services and business processes.