9.8.7 Procedures for Administration of the Enterprise Data Repository

I. Introduction

Data is one of the most valuable resources at Illinois State University. Students, faculty, and staff are sophisticated information consumers, relying more than ever on university data to meet their information needs. Successful administration of data is imperative to support the knowledge-based needs of the University. The following procedure establishes the framework for maintaining data so that it is accurate, complete, and secure.

II. Guiding Principles

The students, faculty, and staff of the University are best served when enterprise data is shared and used appropriately. The information assets of the University are diminished when data is lost, misused, misinterpreted, or has unnecessary restrictions on its access. To assist in the appropriate use of the Enterprise Data Repository (EDR), this procedure is designed to achieve a mix of three guiding principles - confidentiality, integrity, and availability.

Confidentiality

Data contained within the EDR should be available to meet the legitimate needs of members of the University community, including access to sensitive information. It is understood, however, that some data may be subject to legal and ethical considerations which define and regulate its responsible use. The University's Master Data Access Plan should contain provisions to ensure user sensitivity to appropriate use of information - especially information considered "confidential." Controls must be in place to minimize the risk of unauthorized disclosure of data from the EDR.

Integrity

The University community should trust the integrity of data contained within the EDR. Therefore, data should be collected and maintained to guarantee its consistency, reliability, timeliness and accuracy and to avoid duplication and disparity across databases. Appropriate security measures should be provided that will protect data within the EDR from compromise or unauthorized access, modification, destruction or disclosure. Authorized users share responsibility and are accountable for their use and access of the EDR, requiring on-going education on the part of those who use and care for it.

Availability

The intent of this procedure is to allow for the ease of use and access to data according to the authorized and legitimate needs of members of the University community. Legitimate access is defined as access that provides information necessary for one to carry out one's assigned duties or to fulfill a role or function. Data custodians should ensure that data within their care is highly available and that the terms of service level agreements are met.

III. Data Capture and Storage

Critical to the success of the EDR is the manner in which the data is captured and stored. Data Stewards are responsible for specifying what data elements will be captured and for identifying the system-of-record for each data element. In keeping with Master Data Management principles, Data Stewards will identify the authoritative source for each data element. Any derivative system using the same data elements should always receive its data from the authoritative source or from an operational data store (ODS) or enterprise data warehouse (EDW) that directly receives the data from the authoritative source. The same data element should not be independently maintained in multiple data repositories. Delegation and decentralization of data collection and maintenance responsibility is encouraged in order to assure that data are efficiently updated at or near the data source or creation point. Furthermore, data-handling steps that do not add value should be eliminated.

Data custodians are responsible for ensuring that source systems have an appropriate amount of storage and that controls are in place to secure the data at the storage and server levels (including local file storage, database management system, application, and operating system levels). The security put in place at the storage and server level should be based upon the highest level of data classification housed within the data repository. For example, if one data element housed on the server is classified at the highest data classification level, the security for the server and storage must be set to the highest level.
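The "highest classification wins" rule above can be illustrated with a minimal Python sketch. The classification names (PUBLIC, INTERNAL, RESTRICTED), the element names, and the function `required_security_level` are hypothetical and are not taken from this procedure; the University's actual classification levels are defined elsewhere.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical classification levels, ordered from least to most restrictive."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


def required_security_level(element_classifications):
    """Return the classification that server and storage security must satisfy.

    The repository inherits the highest classification found among
    the data elements it houses.
    """
    levels = list(element_classifications.values())
    if not levels:
        raise ValueError("repository has no classified data elements")
    return max(levels)


# Illustrative repository contents (made-up element names).
repository = {
    "course_title": Classification.PUBLIC,
    "office_phone": Classification.INTERNAL,
    "social_security_number": Classification.RESTRICTED,
}
print(required_security_level(repository).name)  # RESTRICTED
```

A single RESTRICTED element is enough to raise the requirement for the whole repository, which mirrors the example given in the paragraph above.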

Data Stewards are responsible for developing data archiving requirements and strategies for each data element. The Data Stewards will pass these requirements to Administrative Technologies who will determine the proper archiving mechanism and storage location for the archived data. The capture of historical data into an enterprise data warehouse does not relieve the Data Steward of the responsibility for maintaining archives of detailed transactional data in accordance with University record retention requirements.

IV. Data Integrity, Validation, and Correction

Data Stewards are responsible for assuring that applications that capture and update Enterprise data incorporate appropriate edit and validation checks to protect the integrity of the data. Users of the data are responsible for helping to correct data problems by supplying as much detailed information as possible about the nature of the problem to the Data Steward. Once notified by a data user, the Data Steward will correct the data in the system of record, find the cause of the erroneous data, and notify users who have received or accessed the erroneous data.

V. Data Extracts and Reporting

Data Stewards are responsible for specifying business rules regarding the manipulation, modification, or reporting of data elements contained within the EDR. Data Stewards are also responsible for establishing standard data transformations to create pertinent summary or derived data. Summary or derived data are considered part of the EDR and are subject to the same data management standards. Data Stewards are responsible for specifying the proper dissemination of all EDR data and will take measures to correct unauthorized use of EDR data.

All sets of data extracted or reported from the EDR should include the time and date they were extracted from the source operational system(s), so the currency of disseminated data can be clearly communicated. Data Stewards will work with users to define useful and meaningful schedules for creating standard data extracts. These standard extracts of the data ("data snapshots") are considered part of the EDR and therefore subject to the same data management standards.
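One possible way to carry the extraction time with an extract is to write it into the file itself. The sketch below assumes a CSV extract with two leading comment lines; the function name `write_extract`, the header layout, the source system name, and the sample row are all hypothetical illustrations, not a format required by this procedure.

```python
import csv
import io
from datetime import datetime, timezone


def write_extract(rows, fieldnames, source_system, extracted_at=None):
    """Return CSV text whose first two lines record the source system
    and the date and time of extraction."""
    if extracted_at is None:
        extracted_at = datetime.now(timezone.utc)
    buffer = io.StringIO()
    buffer.write(f"# source_system: {source_system}\n")
    buffer.write(f"# extracted_at: {extracted_at.isoformat()}\n")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data and a fixed timestamp so the output is reproducible.
snapshot = write_extract(
    rows=[{"term": "Fall", "enrolled": 120}],
    fieldnames=["term", "enrolled"],
    source_system="student_records",
    extracted_at=datetime(2013, 7, 1, 8, 30, tzinfo=timezone.utc),
)
print(snapshot)
```

Recording the time in UTC with an explicit offset avoids ambiguity when extracts are shared across systems; a local-time convention would work equally well if it is applied consistently.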

VI. Data Views

A data view is a logical collection of data elements, assembled and presented according to a prescribed set of rules. Unlike a data extract that captures data at a fixed point-in-time and often includes moving the data to a secondary physical storage location, a data view is a logical subset of stored data. A data view typically assembles the most current or pertinent data from the primary storage location at the time of access. Data Stewards are responsible for defining standard data views of enterprise data within the EDR. These data views are considered part of the EDW and therefore subject to the same data management standards.
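The difference between an extract and a view can be seen in a small sketch using Python's built-in sqlite3 module. The table, column, and view names are hypothetical; the point is only that the extract is a physical copy fixed at creation time, while the view is evaluated against the primary data each time it is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student_id INTEGER, term TEXT, status TEXT)")
conn.execute("INSERT INTO enrollment VALUES (1, 'Fall', 'active')")

# Data extract: a physical copy of the rows as they exist right now.
conn.execute(
    "CREATE TABLE enrollment_snapshot AS "
    "SELECT * FROM enrollment WHERE status = 'active'"
)

# Data view: a stored query, assembled from the primary table at access time.
conn.execute(
    "CREATE VIEW active_enrollment AS "
    "SELECT student_id, term FROM enrollment WHERE status = 'active'"
)

# A new record arrives in the primary table after both were created.
conn.execute("INSERT INTO enrollment VALUES (2, 'Fall', 'active')")

extract_count = conn.execute("SELECT COUNT(*) FROM enrollment_snapshot").fetchone()[0]
view_count = conn.execute("SELECT COUNT(*) FROM active_enrollment").fetchone()[0]
print(extract_count, view_count)  # 1 2
```

The snapshot still holds one row because it was copied before the second insert, whereas the view reflects both rows.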

VII. Data Warehousing

The Information Architecture Team from Business Intelligence and Technology Solutions (BITS) and the Data Stewards are jointly responsible for establishing an information repository known as the Enterprise Data Warehouse (EDW). The EDW stores sharable historic data from operational systems-of-record, as well as transactional data derived from the operational data and deemed to be useful management information. It supports user queries to track and respond to business trends and to facilitate forecasting and planning efforts. The EDW will often contain summarized data derived from transaction detail and may not contain all the supporting transaction detail stored in the operational system-of-record or in the data archives. The EDW design is based on a Data Integration Model (DIM), which is a logical construct that describes entities that comprise EDR data. The Data Integration Model clarifies the linkages among data collected or maintained by the various organizational units of the university. The Data Stewards work with the Information Architecture Team from BITS to develop and maintain the Data Integration Model. The Information Architecture Team from BITS provides expertise and software tools for data modeling, as well as the infrastructure of the EDW.

VIII. Metadata

To ensure that users have a common understanding of the definitions of data contained within the EDR, the Business Intelligence and Technology Solutions department is responsible for the custodianship of an enterprise business metadata repository. The Metadata Repository contains both technical and business descriptions and definitions about EDR and EDW data. It assists the data user with understanding the source, meaning and proper use of the warehoused data. The Data Steward works with the Information Architecture Team from BITS to develop and maintain the Metadata Repository.

The business metadata repository will house descriptions including, among other things, the source system of the data, what the data means, the relationship of the data to other data elements, and any derivative systems that use the data. Data Stewards are responsible for maintaining the content of the business metadata repository, including specifying derivative systems housing data.

Documenting the EDW data is the responsibility of the Data Stewards. Data documentation and definition guidelines are established by the Director, Business Intelligence and Technology Solutions to include the following minimum elements:

  • Name and Alias Names
  • Business Description
  • Data Steward Identification
  • Usage and Relationships
  • Source and Procedure for Data Capture
  • Frequency of Update
  • Official System-of-Record Location and Format
  • Data Classification
  • Descriptions of the restriction and the access procedures for data classified in the highest data classification
  • Description of Validation Criteria and/or Edit Checks
  • Description, Meaning, and Location of Allowable Codes
  • Archiving Requirements and Procedures
  • Data Storage Location of Extracts
  • Quality/Reliability Rating

Documentation for derived EDW data should also include the algorithms or decision rules for the derivation.

Documentation of data views should include reference to the data elements which comprise the view and description of the rules by which the view is constructed.

Overview documentation for logical segments of the EDW (databases, files, groups of files) should also be provided to include information about data structure and update-cycles necessary for the accurate interpretation of the data.
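As a rough illustration of how the minimum documentation elements listed above might be recorded and checked for completeness, the following Python sketch models one metadata entry as a dataclass. The class name, field names, and sample values are hypothetical. Treating aliases, access restrictions, and the derivation rule as optional is an assumption of this sketch: access restrictions apply only to data in the highest classification, and the derivation rule applies only to derived data.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class DataElementDoc:
    """Hypothetical metadata record for one EDW data element."""
    name: str
    aliases: List[str] = field(default_factory=list)
    business_description: str = ""
    data_steward: str = ""
    usage_and_relationships: str = ""
    source_and_capture_procedure: str = ""
    update_frequency: str = ""
    system_of_record: str = ""
    data_classification: str = ""
    access_restrictions: str = ""  # only for the highest classification
    validation_criteria: str = ""
    allowable_codes: str = ""
    archiving_requirements: str = ""
    extract_storage_location: str = ""
    quality_rating: str = ""
    derivation_rule: Optional[str] = None  # only for derived data

    def missing_elements(self):
        """Return the names of required elements that are still empty."""
        optional = {"aliases", "access_restrictions", "derivation_rule"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]


# Made-up, partially completed entry.
doc = DataElementDoc(
    name="enrollment_status",
    business_description="Current enrollment standing of a student",
    data_steward="Registrar",
)
print(doc.missing_elements())
```

A check of this kind could be run before an entry is published to the metadata repository, so that incomplete documentation is flagged to the responsible Data Steward.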

IX. System Administration

University enterprise data may be stored on a variety of computing hardware platforms, provided such platforms are fully integrated components of a managed EDR. Whenever university enterprise data are stored on any component of a university information system, that system component must have a designated system administrator whose responsibilities include:

  • physical site security
  • administration of security and authorization systems
  • backup, recovery, and system restart procedures
  • data archiving
  • capacity planning
  • performance monitoring

Last Review: July 2013