Abstract
The availability of an organisation’s IT infrastructure is of vital importance
for supporting business activities. IT outages are a cause of competitive liability,
chipping away at a company's financial performance and reputation. To achieve
the maximum possible IT availability within the available budget, organisations
need to carry out a set of analysis activities to prioritise efforts and take decisions
based on the business needs. This set of analysis activities is called IT availability
planning.
Most (large) organisations address IT availability planning from one or more
of the three main angles: information risk management, business continuity and
service level management. Information risk management consists of identifying,
analysing, evaluating and mitigating the risks that can affect the information processed
by an organisation and the information-processing (IT) systems. Business
continuity consists of creating a logistic plan, called business continuity plan,
which contains the procedures and all the useful information needed to recover
an organisation’s critical processes after a major disruption. Service level management
mainly consists of organising, documenting and ensuring a certain quality
level (e.g. the availability level) for the services offered by IT systems to the business
units of an organisation.
There exist several standard documents that provide the guidelines to set up the
processes of risk, business continuity and service level management. However, to
be as generally applicable as possible, these standards do not include implementation
details. Consequently, to do IT availability planning, each organisation needs
to develop the concrete techniques that suit its needs. To be of practical use, these
techniques must be accurate enough to deal with the increasing complexity of IT
infrastructures, but remain feasible within the budget available to organisations.
As we argue in this dissertation, basic approaches currently adopted by organisations
are feasible but often lack accuracy.
In this thesis we propose a graph-based framework for modelling the availability
dependencies of the components of an IT infrastructure and we develop techniques
based on this framework to support availability planning. In more detail,
we present:
1. the Time Dependency model, which is meant to support IT managers in the
selection of a cost-optimal set of countermeasures to mitigate availability-related
IT risks;
2. the Qualitative Time Dependency model, which is meant to be used to systematically
assess availability-related IT risks in combination with existing
risk assessment methods;
3. the Time Dependency and Recovery model, which provides a tool for IT
managers to set or validate the recovery time objectives on the components
of an IT architecture, which are then used to create the IT-related part of a
business continuity plan;
4. A2THOS, which is meant to verify whether availability SLAs regulating the
provisioning of IT services between business units of the same organisation
can be respected when the implementation of these services is partially
outsourced to external companies, and to support the choice of outsourcing
offers accordingly.
We run case studies with the data of a primary insurance company and a large
multinational company to test the proposed techniques. The results indicate that
organisations such as insurance or manufacturing companies, which use IT to
support their business, can benefit from the optimisation of the availability of their
IT infrastructure: it is possible to develop techniques that support IT availability
planning while guaranteeing feasibility within budget. The framework we propose
shows that the structure of the IT architecture can be practically employed with
such techniques to increase their accuracy over current practice.
Original language | English |
---|---|
Qualification | Doctor of Philosophy |
Awarding Institution | |
Supervisors/Advisors | |
Award date | 20 Jan 2011 |
Place of Publication | Enschede |
Publisher | |
Print ISBNs | 978-90-365-3102-3 |
DOIs | |
Publication status | Published - 20 Jan 2011 |
Keywords
- IR-75680
- EWI-19425
- CR-C.2.0
- SCS-Cybersecurity
- METIS-277510
- Availability
- Information Risk Management
- Business Continuity
- Service Level Management