How well-equipped are you to continue operating in the event of an emergency? Answer these questions to find out.
Disaster recovery – the ability to get your critical IT systems back up and running in the event of an emergency – is like insurance. In an ideal world, you would never need it. But in reality, things do go wrong. And when they do, you need to know you have got insurance you can rely on.
Despite being a critical part of any business continuity plan, disaster recovery (DR) does not always get the same attention as other areas. So to make sure your organisation has its finger on the pulse, we have put together some big questions for you to think about, which will help you assess how well-prepared your organisation is to respond to an emergency.
1. Have you identified your mission-critical applications and IT services?
Think about all the applications that enable your organisation to operate. To continue functioning following an emergency, what is the minimal capability you would require?
Some applications will be obvious, but think carefully about any that are less apparent, and about whether the primary estate relies on supporting systems in order to work.
For example, your email marketing platform may not sound like a mission-critical application. However, if customer communication is a key part of your business continuity plan, and the email system is your chosen delivery tool, it too needs to be considered mission-critical.
Equally, if a mission-critical application relies on data from another system, both need to be available.
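One way to make these hidden dependencies visible is to write them down and walk the list. The sketch below is illustrative only: the system names and the dependency map are hypothetical, and in practice the map would come from your own application inventory. It assumes you can state, for each system, which other systems it needs.

```python
from collections import deque

def required_systems(critical, depends_on):
    """Return every system that must be available for the critical ones to work."""
    required = set(critical)
    queue = deque(critical)
    while queue:
        system = queue.popleft()
        # Follow each dependency once; the 'required' set also guards against cycles.
        for dependency in depends_on.get(system, ()):
            if dependency not in required:
                required.add(dependency)
                queue.append(dependency)
    return required

# Hypothetical inventory: each system maps to the systems it relies on.
depends_on = {
    "order_processing": ["customer_database"],
    "email_marketing": ["customer_database", "smtp_relay"],
    "customer_database": ["storage_array"],
}
critical = ["order_processing", "email_marketing"]

print(sorted(required_systems(critical, depends_on)))
# ['customer_database', 'email_marketing', 'order_processing', 'smtp_relay', 'storage_array']
```

In this made-up example, two applications named as mission-critical pull in three supporting systems that also need to be covered by the DR plan.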
2. How long could your organisation operate if you lost access to your mission-critical applications?
The answer to this question will define your maximum acceptable recovery time, and hence what kind of DR capability you require and how you resource it. Remember, the maximum acceptable recovery period may vary depending on the day or time.
For some organisations, even half an hour without their critical systems in the middle of the night could have severe, potentially life-threatening consequences. For others, the response may not need to be as quick.
Most importantly, whatever disaster recovery capability you have, you need absolute confidence that it will get your mission-critical systems working again within the timeframe you have set. If not, your organisation may cease to be viable. How confident are you that your current capability can deliver?
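The point that the maximum acceptable recovery period can vary by day or time can be expressed as a simple lookup, which you can then compare against the recovery time you actually measured in a test. This is a rough sketch under stated assumptions: the business-hours window and both time limits are hypothetical placeholders, not recommendations, and your own figures may be stricter at night than during the day.

```python
from datetime import datetime, timedelta

def max_acceptable_recovery(outage_start):
    """Return the maximum acceptable recovery time for an outage starting at this moment."""
    # Hypothetical policy: weekdays 08:00-18:00 are treated as business hours.
    business_hours = outage_start.weekday() < 5 and 8 <= outage_start.hour < 18
    # Hypothetical limits; replace them with the figures your organisation has agreed.
    return timedelta(minutes=30) if business_hours else timedelta(hours=4)

def meets_target(outage_start, measured_recovery):
    """Check a measured recovery time (for example, from a DR test) against the limit."""
    return measured_recovery <= max_acceptable_recovery(outage_start)

# A one-hour recovery measured in a test, checked against two outage start times.
one_hour = timedelta(hours=1)
print(meets_target(datetime(2024, 1, 8, 10, 0), one_hour))  # Monday morning
print(meets_target(datetime(2024, 1, 6, 10, 0), one_hour))  # Saturday morning
```

Under these placeholder limits, the same one-hour recovery would miss the target on a weekday morning but meet it at the weekend, which is why the limit needs to be defined for every period rather than as a single number.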
3. Is your disaster recovery approach for these mission-critical services unnecessarily expensive or risky?
There are different ways to operate disaster recovery. The traditional challenge for IT teams has been balancing the cost, risk and effectiveness of the capability.
Low-cost DR approaches, such as taking backups of your mission-critical systems, which you can then attempt to restore, are inherently risky. What if the backup tapes do not work in your DR infrastructure? What if the lorry transporting your tapes or replacement servers breaks down or gets held up in traffic?
Alternative approaches that cut the risk and accelerate recovery time are typically expensive. Running multiple sites, for example, either in active-passive or active-active configuration, can be hugely expensive, particularly if the second site sees relatively little use.
Do you feel your current approach strikes the right balance between cost, risk and effectiveness?
4. Is your disaster recovery capability well-maintained and adequately resourced?
A lot gets asked of IT teams: troubleshooting, delivering new services, keeping the main production systems running. These ‘live’ requirements typically take precedence over making sure the disaster recovery capability is in full working order.
For example, is the backup checked regularly to ensure the data it contains is up to date, secure and not corrupted?
If you need to be able to recover outside of normal working hours, have you got the teams and working arrangements in place to do this?
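Part of the backup check described above can be automated. The following is a minimal sketch, assuming the backup is a single file and that a SHA-256 checksum was recorded when the backup was taken; the function names, the file path and the age limit are all hypothetical. It covers freshness and corruption only, and does not verify that the backup is secure or that it can actually be restored.

```python
import hashlib
import os
import time

def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as backup_file:
        for chunk in iter(lambda: backup_file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def check_backup(path, expected_sha256, max_age_hours):
    """Return a list of problems found with the backup; an empty list means none were found."""
    if not os.path.exists(path):
        return ["backup file is missing"]
    problems = []
    # Freshness: compare the file's last-modified time with the allowed age.
    age_hours = (time.time() - os.path.getmtime(path)) / 3600
    if age_hours > max_age_hours:
        problems.append("backup is older than the allowed age")
    # Integrity: the checksum must match the one recorded at backup time.
    if sha256_of(path) != expected_sha256:
        problems.append("checksum mismatch: backup may be corrupted")
    return problems
```

A scheduled job could call `check_backup` for each backup and alert the team whenever the returned list is not empty, so that problems surface before the backup is needed rather than during an emergency.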
5. When did you last test your disaster recovery capability?
Having DR capability is one thing, but knowing for certain that you can invoke it and get those critical systems running again is another. And like other elements relating to DR, testing the setup can often drop down the list of priorities when other, more immediate issues arise.
And yet testing is absolutely essential. The easiest approach is through paper-based exercises. These are simpler and cheaper than real-life tests but will not pick up certain shortcomings.
This is why you should also run real-life tests. These can be partial, such as your IT team making sure they can restore backups onto new servers. But remember that partial tests will leave questions unanswered: can you then make these newly restored services available to your users, for example?
At the other end of the scale, a full, company-wide real-life test will give you a much more realistic view of how well your organisation will respond to an emergency. This type of test is complex and can be disruptive to business-as-usual, but there are ways to mitigate this. We advise all our customers to do a full test at least once a year.
6. Have you considered a managed, cloud-based DR service?
A smart way to tackle the issues we have raised is to operate your DR capability as a managed service in the cloud, known as Disaster Recovery as a Service, or DRaaS.
DRaaS can be a cost-effective way to get the assurances you need around service availability, recovery time and backup integrity. It also gives you a team on hand 24×7 to monitor the environment, support your organisation in testing, and help it get back on its feet in the event of an emergency.
This managed cloud approach eliminates the need to run an expensive and often under-used second estate because you will only pay for the cloud servers when you actually invoke DR. It removes the risk of relying on tapes or infrastructure that must be delivered from other locations because you can connect to the cloud from anywhere. And with the managed services partner contractually obliged to make sure the DR environment is in working order, and make it available within the agreed timeframes, it is one less thing for stretched in-house IT teams to have to worry about.
In short, DRaaS is the ultimate insurance policy and will help everyone associated with business continuity sleep better at night.