The availability of an organisation’s IT infrastructure is of vital importance for supporting business activities. IT outages are a cause of competitive liability, chipping away at a company financial performance and reputation. To achieve the maximum possible IT availability within the available budget, organisations need to carry out a set of analysis activities to prioritise efforts and take decisions based on the business needs. This set of analysis activities is called IT availability planning. Most (large) organisations address IT availability planning from one or more of the three main angles: information risk management, business continuity and service level management. Information risk management consists of identifying, analysing, evaluating and mitigating the risks that can affect the information processed by an organisation and the information-processing (IT) systems. Business continuity consists of creating a logistic plan, called business continuity plan, which contains the procedures and all the useful information needed to recover an organisations’ critical processes after major disruption. Service level management mainly consists of organising, documenting and ensuring a certain quality level (e.g. the availability level) for the services offered by IT systems to the business units of an organisation. There exist several standard documents that provide the guidelines to set up the processes of risk, business continuity and service level management. However, to be as generally applicable as possible, these standards do not include implementation details. Consequently, to do IT availability planning each organisation needs to develop the concrete techniques that suit its needs. To be of practical use, these techniques must be accurate enough to deal with the increasing complexity of IT infrastructures, but remain feasible within the budget available to organisations. As we argue in this dissertation, basic approaches currently adopted by organisations are feasible but often lack of accuracy. In this thesis we propose a graph-based framework for modelling the availability dependencies of the components of an IT infrastructure and we develop techniques based on this framework to support availability planning. In more detail we present: 1. the Time Dependency model, which is meant to support IT managers in the selection of a cost-optimal set of countermeasures to mitigate availability-related IT risks; 2. the Qualitative Time Dependency model, which is meant to be used to systematically assess availability-related IT risks in combination with existing risk assessment methods; 3. the Time Dependency and Recovery model, which provides a tool for IT managers to set or validate the recovery time objectives on the components of an IT architecture, which are then used to create the IT-related part of a business continuity plan; 4. A2THOS, to verify if availability SLAs, regulating the provisioning of IT services between business units of the same organisation, can be respected when the implementation of these services is partially outsourced to external companies, and to choose outsourcing offers accordingly. We run case studies with the data of a primary insurance company and a large multinational company to test the proposed techniques. The results indicate that organisations such as insurance or manufacturing companies, which use IT to support their business can benefit from the optimisation of the availability of their IT infrastructure: it is possible to develop techniques that support IT availability planning while guaranteeing feasibility within budget. The framework we propose shows that the structure of the IT architecture can be practically employed with such techniques to increase their accuracy over current practice.
|Award date||20 Jan 2011|
|Place of Publication||Enschede|
|Publication status||Published - 20 Jan 2011|
- Availability Information Risk Management Business Continuity Service Level Management