Building a Data Maturity Model for Data Governance

In five blog entries spanning five days, the Data Governance Blog provides a quick-start guide for developing a Data Maturity Model. What distinguishes this approach from earlier approaches to building maturity models for data governance is that it advocates a proprietary model customized for a given organization instead of applying standard models that attempt to be universal. The first entry of the five-part series focuses on defining scope and establishing a baseline. When building a maturity model for your data, it makes sense to target a subset of your enterprise data first. Once the scope of the data is defined, a baseline needs to be established. From the article:

What is the lowest maturity level of data in your dataset? Your answer could be something along the lines of, "Unreviewed, unmodeled, no metadata, have no idea what it is", "It is in our corporate data model but no information other than field name, not in metadata", or "It's in our model, we have some old definition that we can no longer consider reliable".

This may take some time to do, but it is important to establish this baseline. The main things to capture are:

1. Is it in your datamodel?
2. Do you have metadata for it?
3. Do you trust the information in the metadata, if it is in there?
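The three baseline questions can be recorded per field and used to find the lowest maturity present in the in-scope data. The following is a minimal sketch, not something taken from the blog series: the `FieldStatus` record, the scoring rule (one point per "yes"), and the field names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldStatus:
    """Hypothetical record of the three baseline answers for one data field."""
    name: str
    in_data_model: bool
    has_metadata: bool
    metadata_trusted: bool


def baseline_score(field: FieldStatus) -> int:
    """Count the 'yes' answers; trust only counts when metadata exists."""
    trusted = field.has_metadata and field.metadata_trusted
    return int(field.in_data_model) + int(field.has_metadata) + int(trusted)


def lowest_maturity(fields: List[FieldStatus]) -> FieldStatus:
    """Return the field with the lowest score, i.e. the baseline candidate."""
    return min(fields, key=baseline_score)


# Illustrative field names only; replace with your own in-scope data.
fields = [
    FieldStatus("customer_id", True, True, True),
    FieldStatus("legacy_code", False, False, False),
    FieldStatus("region", True, True, False),
]
print(lowest_maturity(fields).name)
```

A simple count like this is only a starting point; the description of the lowest level still has to be written in your organization's own words.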

On the second day, the concept of a natural progression in the data maturity model is introduced:

What we are looking for here is if there are any natural progressions you can see in the data as it stands today. Starting from your lowest level, what is the next step-up in maturity that you already see? If your first level was "Unmodeled, No metadata, no idea what it is", the next step you see in your data could be, "It's in our datamodel but we have no supporting information on it"... Rather than creating a data maturity model and forcing your data to fit into it, we are letting the various stages the data is already in define the maturity path.
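One way to spot these natural progressions is to count which combinations of characteristics actually occur in the data today. The sketch below assumes each field is described by three hypothetical boolean keys (`in_model`, `has_metadata`, `trusted`); neither the keys nor the ordering rule come from the blog series.

```python
from collections import Counter
from typing import Dict, List, Tuple

State = Tuple[bool, bool, bool]


def observed_states(fields: List[Dict[str, bool]]) -> List[Tuple[State, int]]:
    """Count each (in_model, has_metadata, trusted) combination that occurs."""
    counts = Counter(
        (f["in_model"], f["has_metadata"], f["trusted"]) for f in fields
    )
    # Order by the number of True answers so less mature states come first.
    return sorted(counts.items(), key=lambda item: (sum(item[0]), item[0]))
```

Each distinct state in the result is a candidate level; the states that never occur do not need a level of their own.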

The third blog entry flips the focus from the lowest maturity level to the highest, and then attempts to bridge the gap by using consistent terminology:

In essence, I want you to take what you did on Day 1 and write down the complete opposite. This should help you identify the highest maturity level. So if your lowest level is "Unmodeled, Unreviewed, No metadata" then the highest optimum level would be "In Datamodel, Reviewed and Governed by the Data Governance Council, Metadata verified and up to date". What this does is keep your maturity model framed around the same items. If you talk about your data model in your lowest level, you should talk about it in every other level, including the highest.
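The advice to keep every level framed around the same items can be checked mechanically. In this sketch each level is a mapping from an item (such as "data model" or "metadata") to its description at that level; the structure, the function name, and the "middle" level are assumptions for illustration, while the lowest and highest descriptions follow the quoted example.

```python
from typing import Dict, List


def inconsistent_levels(levels: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Return, per level, the items that differ from those of the first level."""
    if not levels:
        return {}
    first = next(iter(levels))
    expected = set(levels[first])
    return {
        name: sorted(expected ^ set(items))
        for name, items in levels.items()
        if set(items) != expected
    }


levels = {
    "lowest": {
        "data model": "Unmodeled",
        "review": "Unreviewed",
        "metadata": "No metadata",
    },
    # Hypothetical intermediate level that forgot to mention review.
    "middle": {
        "data model": "In data model",
        "metadata": "Metadata present but unverified",
    },
    "highest": {
        "data model": "In data model",
        "review": "Reviewed and governed by the Data Governance Council",
        "metadata": "Metadata verified and up to date",
    },
}
print(inconsistent_levels(levels))
```

Here the "middle" level is flagged because it says nothing about review, which is the kind of gap the consistent-terminology rule is meant to catch.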

In the fourth blog entry, working templates for a maturity model are provided, and both the fourth and fifth entries walk you through customizing the maturity model for your organization.

First things first, get the following out: Your data governance maturity model template (filled in) and the in-scope data for your program. What you are going to do is take a sample of your data and make sure that you can easily find the position of that data on the maturity model. If I were doing this again, I’d randomly pull out about 40 fields and go one-by-one through them. I’d look at the field, check the model, check if there is metadata, etc., and see if it falls into a level. You need to make sure that all of the data fields fit somewhere on the maturity model… if it is questionable you may not have defined your levels clearly enough. If it falls right between two levels, you may need to define a new level to account for the difference, or incorporate the characteristics into one of your existing levels.
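The sampling check described above can be sketched as follows: draw roughly 40 fields at random, test each against the level definitions, and list the fields that fit no level or more than one. The level rules in `LEVELS`, the field keys, and the function names are hypothetical; in practice each rule would encode your own filled-in template.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Field = Dict[str, object]
Rule = Callable[[Field], bool]

# Hypothetical level definitions keyed by level name.
LEVELS: Dict[str, Rule] = {
    "1 unmodeled": lambda f: not f["in_model"] and not f["has_metadata"],
    "2 modeled only": lambda f: bool(f["in_model"]) and not f["has_metadata"],
    "3 unverified metadata": lambda f: bool(
        f["in_model"] and f["has_metadata"] and not f["trusted"]
    ),
    "4 governed": lambda f: bool(
        f["in_model"] and f["has_metadata"] and f["trusted"]
    ),
}


def sample_fields(fields: List[Field], k: int = 40,
                  seed: Optional[int] = None) -> List[Field]:
    """Randomly pick up to k fields without replacement."""
    rng = random.Random(seed)
    return rng.sample(fields, min(k, len(fields)))


def check_coverage(sample: List[Field],
                   levels: Dict[str, Rule]) -> Tuple[List[str], List[str]]:
    """Return names of fields matching no level and of fields matching several."""
    unplaced: List[str] = []
    ambiguous: List[str] = []
    for field in sample:
        matches = [name for name, rule in levels.items() if rule(field)]
        if not matches:
            unplaced.append(str(field["name"]))
        elif len(matches) > 1:
            ambiguous.append(str(field["name"]))
    return unplaced, ambiguous
```

A non-empty `unplaced` list suggests a missing level or an unclear definition, and a non-empty `ambiguous` list suggests that two levels overlap and need sharper wording.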
