Is your organization prepared to survive a disaster that affects the availability of, and access to, critical information?
The characteristics of today's IT infrastructures have made recovering information after an incident or disaster increasingly complex.
However, the greatest risk lies in the fact that the loss of information often implies a service interruption, the consequences of which can be devastating for the affected business.
Our intention with this post is to make clear the factors involved in carrying out an information recovery plan. That way, you can check which aspects are and are not covered in your organization, and what to do to be prepared for a disaster with unpredictable consequences.
Phases to develop a DRP (Disaster Recovery Plan)
One of the best starting points for designing a Disaster Recovery Plan is to carry out a complete inventory of all the digital assets of the organization. In this way, you will be able to properly assess the complexity and risks present in the IT environment.
First, you need to identify the physical sites where IT services operate, as well as the information systems that each facility hosts.
List all servers, storage devices, applications, switches and other network devices, access points, data, etc.
It is also very useful to create a map in which you can see the physical location of each of these assets, as well as the network in which they are located.
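As a rough illustration of such an inventory, the sketch below records each asset with its type, site and network, and then groups asset names by site. The asset names, sites and network ranges are invented for the example and are not taken from any real environment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    kind: str     # e.g. "server", "storage", "switch", "application"
    site: str     # physical location of the asset
    network: str  # network the asset is attached to


def assets_by_site(assets):
    """Group asset names by the physical site that hosts them."""
    grouped = defaultdict(list)
    for asset in assets:
        grouped[asset.site].append(asset.name)
    return dict(grouped)


# Hypothetical inventory, for illustration only.
inventory = [
    Asset("db-01", "server", "Main office", "10.0.1.0/24"),
    Asset("nas-01", "storage", "Main office", "10.0.1.0/24"),
    Asset("sw-07", "switch", "Branch office", "10.0.2.0/24"),
]

site_map = assets_by_site(inventory)
```

Even a simple structure like this makes it easier to see which sites concentrate the most assets.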
Assess the risks
Once you have an overall view of the organization's digital assets and their specific location, it is advisable to identify all the internal or external threats that may affect them.
Keep in mind that the causes of problems with access to and preservation of information are very varied and are not limited to natural disasters.
According to a survey carried out by the consulting firm Forrester in 2013, 43% of companies suffer a disaster due to a failure in the power supply. In 31% of the companies surveyed, an IT hardware error is also one of the main causes.
The most delicate aspect of this process is the calculation of the probability that these events will take place and the impact that these would have on the business.
To do this, it helps to have the support of the main managers of each department, since they are the ones who can assess the damage they would suffer from a loss of information or from a given application becoming unavailable.
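One common way to combine probability and impact is to multiply them into an expected annual loss and rank threats by that figure. The sketch below assumes this simple model. The threat names, probabilities and costs are made-up placeholders, not figures from the survey cited above.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    annual_probability: float  # estimated chance of occurring in a year, 0.0 to 1.0
    impact_cost: float         # estimated cost to the business if it occurs

    @property
    def expected_annual_loss(self) -> float:
        # Simple risk model (an assumption): probability multiplied by impact.
        return self.annual_probability * self.impact_cost


def rank_threats(threats):
    """Return threats ordered from highest to lowest expected annual loss."""
    return sorted(threats, key=lambda t: t.expected_annual_loss, reverse=True)


# Hypothetical estimates gathered from department managers.
threats = [
    Threat("flood", 0.01, 500_000),
    Threat("power failure", 0.20, 50_000),
    Threat("hardware failure", 0.10, 80_000),
]

ranked = rank_threats(threats)
```

The ranking is only as good as the estimates behind it, which is why the input of department managers matters.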
Determine the criticality of applications and data
It is also necessary to classify information systems according to how critical they are for the continuity of the company's activity.
Of course, it is not a matter of applying a different technique or criterion to each application and database. Our advice is to group them based on their importance, how often they change, and corporate information retention policies. Again, the opinion of the departmental managers will be essential.
Define recovery goals
Each organization and area will have different objectives. A legacy system tends to host content that changes less often, so its recovery goals may be less demanding. A bank's database, however, cannot afford to be down for even a minute.
To define optimal recovery goals, each department will need to answer a series of questions, such as:
What applications and databases do they use and how often?
How long can they remain operational without having access to such systems?
What implications would the definitive loss of this data have?
Is there any type of requirement that prevents certain databases from being relocated to a geographical location other than the current one?
Is there any type of requirement regarding the levels of security and data encryption?
Determining the RTO (Recovery Time Objective)
One of the most practical ways to determine the RTO is to consider the loss of income that your company would suffer over a specific period. Each organization must choose its own scale: while in some cases 24 hours is a valid time unit, in others each hour of downtime means millions in losses.
Determining the RTO will help you choose more efficiently the features and services that you should implement in your company's backup system. A simple tape-based recovery system is not the same as a host-based replication solution.
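The income-based reasoning above can be sketched as a small calculation: given an estimate of revenue lost per hour of downtime and the maximum loss the business is willing to accept, derive a candidate RTO. The function names and the figures are illustrative assumptions. A real estimate would also weigh penalties, reputation and other indirect costs.

```python
def downtime_cost(revenue_per_hour: float, hours_down: float) -> float:
    """Estimated income lost during an outage of the given length."""
    return revenue_per_hour * hours_down


def candidate_rto_hours(revenue_per_hour: float, max_acceptable_loss: float) -> float:
    """Longest outage, in hours, that keeps the loss within the accepted limit."""
    if revenue_per_hour <= 0:
        raise ValueError("revenue_per_hour must be positive")
    return max_acceptable_loss / revenue_per_hour


# Hypothetical figures: 5,000 lost per hour, at most 20,000 of acceptable loss.
rto = candidate_rto_hours(5_000, 20_000)
loss_at_rto = downtime_cost(5_000, rto)
```

With these placeholder numbers the candidate RTO is four hours. A business losing far more per hour would end up with a much tighter objective.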
Determining the RPO (Recovery Point Objective)
The RPO indicates how much information your organization can afford to lose. This level of tolerance can lead to very wide or incredibly tight time frames. Whatever your case, the chosen RPO will determine, in turn, the frequency with which the backups should be done.
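To illustrate how the RPO drives backup frequency, the sketch below assumes the simplest possible rule: the interval between backups must not exceed the RPO. It deliberately ignores how long each backup takes to complete, so treat the result as a lower bound rather than a schedule.

```python
import math
from datetime import timedelta


def min_backups_per_day(rpo: timedelta) -> int:
    """Minimum number of evenly spaced daily backups so that no more than
    one RPO's worth of data can be lost (backup duration is ignored)."""
    if rpo <= timedelta(0):
        raise ValueError("RPO must be a positive duration")
    return math.ceil(timedelta(days=1) / rpo)


# Example tolerances, chosen for illustration only.
relaxed = min_backups_per_day(timedelta(hours=24))
moderate = min_backups_per_day(timedelta(hours=5))
tight = min_backups_per_day(timedelta(minutes=15))
```

A 24-hour RPO is satisfied by a single nightly backup, whereas a 15-minute RPO already requires 96 backups a day, which in practice points towards continuous replication rather than scheduled jobs.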
Choose the appropriate techniques and tools
Now that you have a complete and mapped inventory of information systems, as well as a rating of their importance and respective recovery objectives, it is time to choose the tools necessary to guarantee those objectives.
A balance needs to be struck between protection needs and available financial resources.
Thus, data whose loss would not greatly affect business operations can be protected with nightly backups using traditional file-based methods. That approach is unacceptable for high-priority information, which needs a higher level of protection and assurance.
If you complement this with offsite backup systems, those facilities must be located in a different geographical area from the organization itself, to minimize risk in the event of natural or local disasters.
On the other hand, the automation of recovery processes should be an essential requirement in any of the solutions chosen, since IT managers cannot be counted on to be available at all times.
Assign roles and responsibilities
In addition to the development of the Disaster Recovery Plan per se, the roles involved and the responsibilities that each one assumes must be defined.
Involve all stakeholders
The creation of a disaster recovery system should not be limited to the data center and the IT department. All stakeholders have to contribute their point of view during planning, and most importantly, they must agree with the SLAs and priorities established by the IT team.
Document and communicate the DRP
The entire information recovery strategy must be documented, to ensure that the protocols that have been defined are preserved, and to facilitate internal communication. The drafting of such a strategy should be left to the people in charge of executing it.
Put the DRP to the test
Every DRP should be tested at least annually. Unfortunately, according to the report "Disaster Recovery as a Service (DRaaS) Attitudes & Adoption", published in 2016, 22% of the companies surveyed do not carry out any type of test, or test less often than once a year.
The documentation mentioned above must specify the procedure to be executed, as well as the frequency of the tests.
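As a small aid for keeping to the documented test frequency, the sketch below flags a plan whose last test is older than the allowed interval. The function name and the 365-day default are assumptions chosen to match the annual minimum mentioned above.

```python
from datetime import date, timedelta


def drp_test_overdue(last_test: date, today: date, max_interval_days: int = 365) -> bool:
    """Return True when more than max_interval_days have passed since the last DRP test."""
    return today - last_test > timedelta(days=max_interval_days)


# Hypothetical dates, for illustration only.
overdue = drp_test_overdue(date(2023, 1, 10), date(2024, 6, 1))
on_schedule = drp_test_overdue(date(2024, 1, 10), date(2024, 6, 1))
```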
These tests help determine whether the established procedures are adequate, identify areas that require some kind of change, and, of course, train employees. If any anomaly is detected, the DRP will have to be updated.
The Disaster Recovery Plan may also require modifications even when it works properly. This happens when substantial changes in the organization itself end up affecting its RTO and RPO.
It may be that the IT infrastructure is migrated to different hardware or a different operating system, that a new division is created or acquired within the company, or that certain employees with high responsibility leave the organization.
Preparing a Disaster Recovery Plan does require considerable dedication and significant resources. However, it is a critical factor in ensuring the survival of a business.
Providing organizations with this important resource is today an unavoidable consideration for every compliance team when managing or contracting a backup service.