Data Governance Framework
Data have been collected in various forms over long periods of time, largely residing in on-premises or cloud-based repositories. Over the same period, data governance has been adopted only informally, resulting in disparate processes and policies for the use and management of data.
Coordinated understanding and management of these repositories are required to equip the university to protect its data effectively and to harness information for data-driven decision-making.
It is important that Western be deliberate in its approach and attune its data resources to the vision, mission, principles, success pillars, and programs associated with its institutional data.
Paying attention to our institutional data ecosystem through this framework allows us to be conscientious about why and how we engage with data, including attendant processes that relate to quality and reliability.
The pyramid above, based on Seiner's (2014) conceptual framework, represents the approach Western has adopted towards data governance. Within this context, it is recognized that virtually all of the data governance roles found within this model are already being performed throughout the university, even if only on an informal basis. This framework is designed to formalize and strengthen the accountabilities and responsibilities within each level.
Conceptual Frame Hierarchy
At the highest level of the data taxonomy we have five Data Estates, defined as ACADEMIC, RESEARCH, PEOPLE, SPACE, and SERVICES. This layer of the hierarchy is known as the CONCEPT and exists at the top of specific data columns. However, it is important to note that many of the relationships in play at Western follow multiple paths, where data components connect with several columns simultaneously. This complexity is anticipated, and our model supports the multiplicity of accountabilities and responsibilities throughout.
Below the Data Estates are Data Domains, which function as the broad SUBJECT areas of our data ecosystem, with items such as STUDENT, EMPLOYEE, etc. This list is much longer than the list of Data Estates and will be published in our data asset catalog and continually amended.
At the final level of the conceptual stage are Functional Data Domains, with items such as UNDERGRADUATE, GRADUATE, FACULTY, STAFF, etc. This list will be even longer than the list of Data Domains and will also be published in our data asset catalog and continually amended.
An example of the conceptual lineage can be found below:
"PEOPLE > STUDENT > UNDERGRADUATE"
Actual Frame Hierarchy
Understanding data conceptually is important, but actual data exist within systems. At this point, we turn our attention to the Data Application and, at the lowest level of the taxonomy, Data Items. While a Data Application can, and typically does, refer to a software package or a spreadsheet, in its broadest sense the term refers to wherever data reside. The Data Item sits at the field or elemental cell level of a data solution.
An example of the extended data lineage can be found below:
"PEOPLE > STUDENT > UNDERGRADUATE > 'SOFTWARE' > STUDENTID"
Why is this Important?
The point of a robust data governance system is to understand how institutional data are being used within the Western operational context and to furnish the ecosystem with the information required to make decisions about how we wish to manage and leverage these important assets. At each level in the taxonomy (Data Estate, Data Domain, Functional Data Domain, Data Application, and Data Item), there are roles within the institution that are accountable or responsible in specific ways. Data lineage assists us in mapping accountabilities and responsibilities to data assets so that appropriate and clear decisions can be made in service of Western's broader strategies.
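As an illustration only, the mapping of roles onto a lineage path could be sketched in Python as follows. The level names, the five Data Estates, and the example path come from this framework; the parsing approach and every role label are hypothetical placeholders, since the framework text does not name specific roles here.

```python
# Level names as listed in the framework, ordered from top to bottom.
LEVELS = (
    "Data Estate",
    "Data Domain",
    "Functional Data Domain",
    "Data Application",
    "Data Item",
)

# The five Data Estates named in the framework.
DATA_ESTATES = {"ACADEMIC", "RESEARCH", "PEOPLE", "SPACE", "SERVICES"}

# ASSUMPTION: every role label below is a hypothetical placeholder,
# not a role defined by the framework.
ACCOUNTABILITY = {
    ("Data Estate", "PEOPLE"): "estate-level role (hypothetical)",
    ("Data Domain", "STUDENT"): "domain-level role (hypothetical)",
    ("Functional Data Domain", "UNDERGRADUATE"): "functional-level role (hypothetical)",
    ("Data Application", "SOFTWARE"): "application-level role (hypothetical)",
    ("Data Item", "STUDENTID"): "item-level role (hypothetical)",
}


def parse_lineage(path):
    """Split an 'A > B > C' lineage string into (level, name) pairs."""
    parts = [part.strip().strip("'\"") for part in path.split(">")]
    if any(part == "" for part in parts) or len(parts) > len(LEVELS):
        raise ValueError(f"expected 1 to {len(LEVELS)} levels, got: {path!r}")
    if parts[0] not in DATA_ESTATES:
        raise ValueError(f"unknown Data Estate: {parts[0]!r}")
    return list(zip(LEVELS, parts))


def roles_along(path):
    """Return (level, name, role) per step; role is None where none is recorded."""
    return [
        (level, name, ACCOUNTABILITY.get((level, name)))
        for level, name in parse_lineage(path)
    ]


steps = roles_along("PEOPLE > STUDENT > UNDERGRADUATE > 'SOFTWARE' > STUDENTID")
for level, name, role in steps:
    print(f"{level}: {name} -> {role}")

# Steps with no recorded role point to accountability gaps worth reviewing.
unassigned = [(level, name) for level, name, role in steps if role is None]
```

Keeping the roles in a register keyed by level and name, rather than on the path itself, is one way to accommodate the multiple-path relationships the framework anticipates: the same Data Item can appear on several paths while its recorded role stays in one place.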
Functional Data Domain to Data Application Mapping
It can be a challenge to see the link between the conceptual hierarchy of the data model and the actual one. The actual hierarchy reflects the lived experience of data and how data applications represent information within their respective systems. The conceptual frame may not have sufficient awareness of which fields are being used in the applications and how they apply to functional data domains. One way to marry these components is through an institutional data glossary, which provides high-level definitions that are then mapped to how related fields are used within applications.
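A minimal sketch of such a glossary, written in Python under stated assumptions, might look like the following. The term, its definition, the application names (APP_A, APP_B), and the field name stu_id are all hypothetical and chosen only for illustration; UNDERGRADUATE and STUDENTID are taken from the lineage example earlier in this framework.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    term: str                     # high-level business term
    definition: str               # institution-wide definition
    functional_data_domain: str   # conceptual anchor for the term
    # (application, field) pairs where the term is physically represented
    application_fields: list = field(default_factory=list)


# ASSUMPTION: illustrative content only; not an actual glossary entry.
GLOSSARY = [
    GlossaryEntry(
        term="Student Identifier",
        definition="Illustrative definition only.",
        functional_data_domain="UNDERGRADUATE",
        application_fields=[("APP_A", "STUDENTID"), ("APP_B", "stu_id")],
    ),
]


def entry_for_field(application, field_name):
    """Find the glossary entry that a given application field maps to."""
    for entry in GLOSSARY:
        if (application, field_name) in entry.application_fields:
            return entry
    return None


def fields_for_term(term):
    """List every (application, field) pair recorded for a glossary term."""
    for entry in GLOSSARY:
        if entry.term == term:
            return list(entry.application_fields)
    return []


match = entry_for_field("APP_B", "stu_id")
if match is not None:
    print(f"{match.term} ({match.functional_data_domain}): {match.definition}")
```

The two lookups run in opposite directions: from an application field up to its conceptual definition, and from a definition down to every field that carries it. That is the linkage between the conceptual and actual hierarchies that the glossary is meant to provide.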