Here at UKFast we pride ourselves on offering the best solutions in the UK, and something we are asked about more and more is Disaster Recovery, a subset of Business Continuity.
With business-critical IT systems increasingly prevalent and the economy moving to 24×7 operation, continuity of service and the protection of an organisation’s data and IT infrastructure during a disruptive event have become an increasingly visible business priority in recent years.
Assessing DR Priorities
To understand how far we should protect our IT against disasters, we first need to cover a few concepts related to the business processes that rely on it. These processes support many different facets of a business: the website that carries our organisation’s sales or marketing presence, the email system that enables communication between our teams, the SQL server that holds our business intelligence; the list goes on. Each of these IT systems ties in to a business process, and hence we can attach a value to each. This value determines how acceptable the loss of the service is, and from it we can calculate a cost per unit of time that the service or business process is unavailable. Equally, the value must be used to calculate how much irrecoverable data loss from a system is acceptable.
Calculating these figures can be difficult, and the issues are often addressed in reverse: companies want a Disaster Recovery solution without truly knowing the impact (cost) of losing their services, systems or business processes. When this occurs, bear in mind that whilst the best Disaster Recovery solution will minimise data loss and have you back up and running in very little time (if you are offline at all), the associated costs can be astronomical.
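As a rough illustration of the cost per unit of time described above, the sketch below spreads the annual value attached to a service evenly across the year and multiplies it by the length of an outage. The function names, the even-spread assumption and the figures are all hypothetical; a real assessment would weight peak trading hours, reputational damage and so on.

```python
HOURS_PER_YEAR = 24 * 365  # assumes a 24x7 service and a non-leap year


def cost_per_hour(annual_value: float) -> float:
    """Hourly cost of unavailability, assuming value accrues evenly all year."""
    return annual_value / HOURS_PER_YEAR


def outage_cost(annual_value: float, hours_down: float) -> float:
    """Estimated cost of an outage lasting `hours_down` hours."""
    if hours_down < 0:
        raise ValueError("hours_down must not be negative")
    return cost_per_hour(annual_value) * hours_down


# Hypothetical example: a sales website valued at 876,000 per year.
print(cost_per_hour(876_000))   # 100.0
print(outage_cost(876_000, 5))  # 500.0
```

Comparing a figure like this with the price of a proposed DR solution is one way to check whether the solution is proportionate to the loss it guards against.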
To help us understand this better, two key concepts are introduced which underpin the cost model for Disaster Recovery.
Recovery Point Objective (RPO)
- The amount of data loss, expressed as a period of time, that is acceptable in the event of a disaster. For example, take an RPO of 5 hours: a backup is taken at 1pm and the next is scheduled for 6pm, so a disaster that affects the original data at any point up to 5:59pm will lose at most 5 hours of data.
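The RPO example can be checked with a few lines of date arithmetic. This is a minimal sketch: the function names and the calendar date are illustrative assumptions, and only the 1pm backup, the 5:59pm disaster and the 5-hour RPO come from the example above.

```python
from datetime import datetime, timedelta


def data_lost(last_backup: datetime, disaster: datetime) -> timedelta:
    """Span of data lost: everything written since the last good backup."""
    if disaster < last_backup:
        raise ValueError("disaster cannot precede the last backup")
    return disaster - last_backup


def within_rpo(last_backup: datetime, disaster: datetime, rpo: timedelta) -> bool:
    """True if the data lost stays inside the Recovery Point Objective."""
    return data_lost(last_backup, disaster) <= rpo


# Arbitrary date; the times are those used in the RPO example.
last_backup = datetime(2024, 1, 1, 13, 0)  # backup taken at 1pm
disaster = datetime(2024, 1, 1, 17, 59)    # disaster at 5:59pm
rpo = timedelta(hours=5)

print(data_lost(last_backup, disaster))        # 4:59:00
print(within_rpo(last_backup, disaster, rpo))  # True
```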
Recovery Time Objective (RTO)
- The amount of time, set as an objective (not a mandate), within which attempts are made to restore service to the business following a disaster. It is defined by the business after detailed analysis of how long it can survive without a service, system or business process. For example, take an RTO of 5 hours: if a disaster occurs at 10am, services should be restored by 3pm.
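The RTO example works the same way: the target restore time is simply the time of the disaster plus the RTO. Again this is only a sketch, with hypothetical function names and an arbitrary calendar date.

```python
from datetime import datetime, timedelta


def restore_deadline(disaster: datetime, rto: timedelta) -> datetime:
    """Latest time by which service should be restored to meet the RTO."""
    return disaster + rto


def met_rto(disaster: datetime, restored: datetime, rto: timedelta) -> bool:
    """True if service came back on or before the RTO deadline."""
    return restored <= restore_deadline(disaster, rto)


# Arbitrary date; the 10am disaster and 5-hour RTO are from the example.
disaster = datetime(2024, 1, 1, 10, 0)
rto = timedelta(hours=5)

print(restore_deadline(disaster, rto).strftime("%H:%M"))        # 15:00
print(met_rto(disaster, datetime(2024, 1, 1, 14, 30), rto))     # True
```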
Short RPO and RTO
As the diagram above shows, a solution that requires a short RPO and RTO carries high costs. To illustrate why, consider the sales website. Because the site is critical to the operation of the business (i.e. it takes orders and payments for products to be shipped from a warehouse), we can assume it needs a short RPO and RTO.
To enable this, we might spread the IT providing the service across two or more geographically separate datacenters, mitigating the failure of one entire datacenter. This is a costly option because it requires at least twice the investment in IT hardware and setup; of course, losing an entire datacenter is also a very unlikely occurrence.
Short RPO, Long RTO
A solution that requires a short RPO but can tolerate a long RTO costs less. An example may be a website whose data is constantly updated with important changes but is only referred to at certain times of the day. If the changes can be applied once the solution is back up and running, the business can afford a longer RTO and hence a lower cost.
Long RPO, Short RTO
Like the previous (Short RPO, Long RTO) scenario, a solution that can tolerate a long RPO but requires a short RTO costs less. An example may be a website where the data collected (if any) is not critical to the operation of the business, but where it is crucial that the site is continually ‘up’.
Long RPO and RTO
A solution that can afford a long RPO and RTO has low costs. For websites that are deemed non-critical, or where the budget cannot stretch, this approach to DR is suitable, especially given UKFast’s 100% uptime guarantee and hardware replacement guarantee.
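The four RPO/RTO combinations can be summarised in a small lookup. In this sketch the 4-hour boundary between ‘short’ and ‘long’ is purely an assumption for illustration, as is the function name; each business would set its own boundary, and the labels only reflect the relative costs described above.

```python
from datetime import timedelta

# Hypothetical boundary between a "short" and a "long" objective.
SHORT_THRESHOLD = timedelta(hours=4)


def relative_cost(rpo: timedelta, rto: timedelta,
                  threshold: timedelta = SHORT_THRESHOLD) -> str:
    """Map an RPO/RTO pair to the relative cost of its quadrant."""
    short_rpo = rpo <= threshold
    short_rto = rto <= threshold
    if short_rpo and short_rto:
        return "high"    # Short RPO and RTO
    if short_rpo or short_rto:
        return "lower"   # one objective short, the other long
    return "low"         # Long RPO and RTO


print(relative_cost(timedelta(minutes=15), timedelta(hours=1)))  # high
print(relative_cost(timedelta(minutes=15), timedelta(days=1)))   # lower
print(relative_cost(timedelta(days=1), timedelta(days=2)))       # low
```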
There are many methods for implementing Disaster Recovery solutions for businesses, however large or small. Basics such as dual power supplies or resilient disks in servers help to mitigate hardware failure. More advanced solutions involve multiple-datacenter configurations, virtualisation and clustering to provide progressively better RPO/RTO combinations. The key is not to forget the potential for disaster and its potential effect on the business, but to assess risk, prioritise resources and build as recoverable a solution as a given budget will allow.
It is important to note that the IT budget will be one of the primary influences on the choice of RPO and RTO for any business.
The UKFast team is on hand to help with any enquiries about how Disaster Recovery solutions can support your business IT.