University of Massachusetts
Data, and the electronic and manual systems through which they are processed, have evolved into critical facets of the University structure. The University relies heavily on the data in these systems: to perform routine University business; to supply students and external institutions with student-related information; to comply with legal and contractual requirements; and as a basis for management decision making. Although increased automation of the University's administrative operations and research projects provides substantial efficiencies, it also exposes those operations and research projects to severe disruption if the electronic data systems are not available on a continuous basis.
Additionally, data processing and business applications are no longer restricted to mainframe computer environments. The use of distributed platforms (including mid-range computers, client/server technology, and local and wide area networks) for mission-critical functions not only expands the scope of business continuity planning but also makes it more important. This increased importance arises from the fact that non-operational areas are finding themselves responsible for systems that are critical to, or that strongly impact, the functioning and reputation of the University.
These Guidelines are issued pursuant to the Board of Trustees' Policy Statement on Business Continuity Planning (Doc. T99-060, adopted August 4, 1999) and:
- define what critical and impacting data systems are and how such systems will be identified and ranked,
- outline responsibilities related to business continuity planning and implementation,
- provide guidelines for the development, testing, maintenance and implementation of data system specific business resumption plans (BRPs), and
- provide methods for monitoring and enforcing these Guidelines.
Campus procedures relating to business continuity shall apply to all data systems (manual and electronic) designated as critical or impacting data systems of the University of Massachusetts.
The President, or his/her designee, shall ensure that each campus develops, implements and tests BRPs for its critical and impacting computer systems.
The President, together with the Chancellors or their designees, shall appoint a senior-level individual (herein referred to as the risk manager) at each Campus and at Central Administration. The risk manager is responsible for risk management and for ensuring that the appropriate application administrators evaluate the exposure of critical and impacting data systems to computer and other disruptions. Where the consequences of such disruptions would be significant, the application administrator shall develop, document and test a BRP.
The Information Technology Council (ITC) will determine which University data systems are considered critical or impacting. Additionally, the ITC will rank these critical and impacting systems to indicate which systems are most crucial to the University's functioning. This ranking will be used to determine which data systems need to be recovered and in what order in the event of a disruption. The ITC should consider several factors, including but not limited to, the calendar cycle, processing hardware, system software, applications programs and essential human resources when determining whether systems are critical or impacting.
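As an illustration only, the way a ranking drives the recovery order after a disruption can be sketched as follows. The system names, the `rank` field, and the convention that rank 1 is the most crucial are all hypothetical assumptions; these Guidelines do not prescribe a ranking format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSystem:
    name: str               # hypothetical system name
    rank: int               # assumed convention: 1 = most crucial
    disrupted: bool = False


def recovery_order(systems):
    """Return the names of disrupted systems, most crucial first."""
    affected = [s for s in systems if s.disrupted]
    return [s.name for s in sorted(affected, key=lambda s: s.rank)]


# Hypothetical inventory; the real list and ranks come from the ITC.
systems = [
    DataSystem("library catalog", rank=3, disrupted=True),
    DataSystem("payroll", rank=1, disrupted=True),
    DataSystem("student records", rank=2, disrupted=False),
]
order = recovery_order(systems)
```

Here `order` lists payroll ahead of the library catalog, and the unaffected student records system is left out.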
The risk manager shall:
- ensure that training is available for application administrators in the areas of risk assessment, analysis and management, and BRP development.
- ensure that annual risk analyses and BRP tests are performed by application administrators for all critical or impacting data systems.
- review the results of risk analyses and BRP tests performed by application administrators to ensure that the proper and appropriate controls have been implemented.
- work, as needed, with University Audit, application administrators and data center personnel to institute appropriate controls for critical and impacting data systems.
- maintain records of these risk analyses, instituted controls, and BRP test results for one year and make these records available to University Audit.
The application administrator shall:
- perform, at least annually, risk analyses to determine the level of exposure of data systems under their responsibility to corruption, damage or other disruption. Special attention should be given to those systems designated as critical or impacting.
- institute controls (e.g., proper backup plans, formalized restart procedures, installation of an uninterruptible power supply, etc.) which will minimize the probability that a disruption will occur and ensure quick business resumption when a disruption does occur.
- report the results of the risk analyses and instituted controls (noted in the first two items above) to the appropriate Campus/Common Administrative Services risk manager.
- maintain detailed records of risk analyses they perform and documentation of instituted controls for one year. These records/documentation should be made available to University Audit.
- develop, document, update and test business resumption plans (BRPs) for those systems designated as critical or impacting. This includes obtaining permission from software vendors to use licensed software products for testing and recovery operations.
- perform, at least annually, BRP tests and keep the BRP(s) current. Ongoing changes in systems, software, applications, communications and operations will require frequent updates to the plan. Results of BRP testing should be reported to the Campus/Common Administrative Services risk manager.
Application administrators shall perform risk analyses (See Appendix 1 for a sample Risk Assessment Checklist) and review controls over the data system(s) under their control as part of any implementation of a new critical or impacting data system, and annually for all existing critical or impacting data systems. This will ensure that appropriate and up-to-date controls are in place. University Audit or external audit firms are good resources for application administrators interested in obtaining additional or more technology-specific audit/assessment checklists.
Based on the risk analyses they perform, application administrators shall institute controls (e.g., proper backup plans, formalized restart procedures, installation of an uninterruptible power supply, etc.) which will minimize the probability that a disruption will occur and ensure quick business resumption when a disruption does occur. The costs of implementing these controls should be weighed against the probability of a disruption and the loss which would result if the disruption occurred (this weighing is referred to as risk management).
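The weighing of control costs against loss and probability described above can be illustrated with a minimal sketch. It assumes a simple expected-loss model (loss multiplied by the reduction in annual disruption probability); the model, the field names, and all figures are hypothetical and are not prescribed by these Guidelines.

```python
from dataclasses import dataclass


@dataclass
class Control:
    name: str
    annual_cost: float          # cost of implementing and operating the control
    loss_if_disrupted: float    # estimated loss from a single disruption
    probability_without: float  # assumed annual probability without the control
    probability_with: float     # assumed annual probability with the control

    def expected_loss_avoided(self) -> float:
        """Expected annual loss the control prevents under this simple model."""
        reduction = self.probability_without - self.probability_with
        return self.loss_if_disrupted * reduction

    def is_justified(self) -> bool:
        """True when the avoided expected loss exceeds the cost of the control."""
        return self.expected_loss_avoided() > self.annual_cost


# Hypothetical figures for an uninterruptible power supply.
ups = Control(
    name="uninterruptible power supply",
    annual_cost=2_000.0,
    loss_if_disrupted=50_000.0,
    probability_without=0.10,
    probability_with=0.02,
)
avoided = ups.expected_loss_avoided()
justified = ups.is_justified()
```

With these assumed figures the control avoids roughly 4,000 in expected annual loss against a cost of 2,000, so it would be considered justified. An actual analysis would also account for losses that are hard to quantify, such as damage to the University's reputation.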
Business Continuity Planning is the process of identifying critical data systems and business functions, analyzing the risks of disruption to the data systems and business functions, determining the probability of a disruption occurring and then developing business resumption plans (BRPs) to enable those systems and functions to be resumed in the event of a disruption.
The goal of an effective business resumption plan and recovery process is to facilitate and expedite the resumption of business after a disruption of critical or impacting data systems and operations has occurred. Disruptions may be minor or may include instances where normal University functions and services cannot be performed and may not be performed for an extended period of time (see Appendix 2 for an Example Of Potential Risks That May Result in Data System Disruptions). Business continuity planning (see Appendix 3 for Sample Business Continuity Planning Steps and Issues) minimizes the impact of disruption while maximizing resources available to resume normal operations. The principal objectives are to:
- Minimize disruptions of service to the University community and any external entity relying on University data systems and the information stored in them.
- Provide a road map of predetermined actions that will reduce decision-making during recovery operations. Good planning will reduce the number and magnitude of decisions that must be made during the period when exposure to error is at a peak.
- Ensure the timely resumption of critical and impacting systems and enable the resumption of normal business/service at the earliest possible time in the most cost-effective manner.
- Limit the impact of the disruption on the University mission and reputation, and limit any financial losses.
Once critical and impacting University applications have been identified and ranked, BRPs for these applications shall be developed. Copies of the BRPs shall be accessible from any off-site location. Key personnel should know the exact location of the BRPs and be familiar with how to access this information. BRPs should be maintained electronically in relational database software, when possible.
BRPs for critical and impacting systems shall contain the following (See Appendix 4 for Sample Outline of a Business Resumption Plan):
- Clarification of what constitutes a disruption (what level/extent of disruption) for which the specific BRP needs to be implemented.
- Maximum acceptable downtimes that can be incurred (i.e., how long the unit/University can function before the data system must be available). Business functions and/or services which must be restored within 2 to 4 hours require significantly different recovery actions than those which can be delayed a number of hours, days or weeks (See Appendix 5 for Sample Data System Criticality Levels).
- Who determines whether the incident is classified as a business disruption, what level of disruption has occurred and to what degree the plan needs to be implemented. When a disruption occurs, the level and extent of the disruption must be immediately determined and appropriate steps taken to safeguard lives and prevent further destruction or escalation of the problem.
- Which staff are involved in the business resumption effort (part of the resumption team - See Appendix 6 for sample BRP Recovery Teams and Responsibility Assignments) and at what disruption level are they involved.
- What are the resumption team member responsibilities (See Appendix 6 for Sample BRP Recovery Teams and Responsibility Assignments) and how will the non-availability of certain key team members be addressed. Step-by-step, definitive procedures (See Appendix 7 for Sample BRP Recovery Steps Document) for each team member shall be developed. Plans for cross training on duties should also be formulated. Pre-planned processes and trained personnel will significantly reduce the cost and time necessary to achieve full recovery and resume normal business operations.
- Contact names, phone lists and initiation procedures that are updated quarterly or as needed. An emergency call list for key personnel (See Appendix 8 for Sample Key Personnel Emergency Call List) shall be developed. Additionally, procedures shall be developed with other administrative functions that may be affected, such as Human Resources, Public Safety, Public Relations and data center personnel.
- The location of the BRP coordination site, if needed.
- What information about the disruption shall be made public and how this information will be disseminated.
- An inventory (See Appendix 9 for Sample Asset Inventory) of all critical resources necessary to resume processing including, but not limited to:
software (systems and applications),
communication requirements (front-end processors, lines, modems, etc.)
physical site requirements for an alternate facility, including air conditioning, power, raised floor, cabling, communications, total square footage, personnel and office space needs, etc.
hardware and peripherals (e.g., PCs, printers, etc.)
Data files (note format - MAC, DOS, Filemaker Pro, etc.)
Security - this should include any modifications to physical, data, and network security needed to allow the resumption team members to implement the BRP.
Office equipment (e.g., telephones, copiers, typewriters, fax machines, etc.)
Storage for supplies, forms, etc.
Funding and acquisitions - funding needed to implement the plan and the source of funding. This should also include a provision for incidental costs so that small needs do not hamper the resumption effort.
Transportation logistics (trucking, packing services, etc.) for personnel, supplies, input/output delivery between critical system users and the recovery site, and between the recovery site and the back-up facility.
These lists shall contain quality and quantity requirements (e.g., version 3.0 or higher of software X, 15 PCs with Windows 95 software, 100 copies of form X or 8 1/2" x 11" paper, etc.).
- Data back-up schedules and off-site storage procedures. Keep current schedules and back-ups for all critical/impacting systems at off-site storage locations. Special back-up and restore procedures should enable loading only the most critical items.
- Contracted or agreed-upon alternate facilities/operating sites, if appropriate (see Appendix 10 for Sample Alternate Site/Processing Contract Issues). These may be hot or cold sites, service bureaus or shared sites (i.e., reciprocal agreements) depending on the degree of disruption and need. Copies of any reciprocal agreements, or service bureau or hot/cold site contracts, should be kept at an off-site location.
- Information regarding the type and level of hardware and software vendor support required, available and contracted. This should include any necessary purchases or leases needed in the event of a disruption (e.g., office, communications and/or computer equipment, etc.), estimated costs of specific support, payment arrangements, and vendor response times. Information should be obtained through meetings, requests for information, acquisition terms and conditions, joint vendor meetings, etc. as part of the business continuity planning efforts.
- Hardware and software (system and application) restore procedures.
- The off-site location of data (whether paper, tapes, cassettes, disks, etc.), duplicate copies of documentation (BRP, system/application manuals, contracts, procedure manuals, etc.), supplies, and forms.
- A schedule for BRP testing (See Appendix 11 for Sample BRP Testing Issues). BRPs shall be tested at least annually using various testing approaches (e.g., structured walk-through; checklists; simulations; parallel testing; and full-interruption testing). Tests should be carefully planned to minimize disruption to normal operations and should address partial and full disruptions of various types. Each test scenario should be carefully developed so that all facets of the BRP are fully tested. Planning and conducting test exercises should be the joint responsibility of the data center, application administrators and the user(s). Some areas to test include, but are not limited to:
. Data backup
. Documentation backup
. Facilities backup
. Resumption team training
. Critical applications (first singly, then in groups)
. Response during different processing periods and shifts
. Alternate processing procedures
- Procedures for documenting formal plan tests and test results, following up these tests, and implementing corrective actions/recommendations arising from these tests. After each test exercise, results should be thoroughly reviewed for flaws, omissions, and overlaps in the business resumption procedures. Test results should be made available to the risk manager and University Audit.
University and external audit shall review campus procedures and compliance with these Business Continuity Planning Guidelines.