Background – How bad is your Nightmare Scenario?
In the modern world, losing access to data and systems can cut the value of a company in half in a matter of moments. Much of a company’s value is dependent on its technology systems and data, which makes protecting that infrastructure a task of paramount importance. As a result, simply asking “Are the backups working?” is no longer sufficient to effectively manage your IT risk.
Why? Because if systems go down, tiny changes in configuration can cost or save you millions of dollars.
To begin, let’s define the key terms: Backup, Disaster Recovery and High Availability.
Definition: “Backup”
Backups are simple: they are copies of your corporate systems and data that can be used to bring a failed system back online. However, backups do not necessarily include the infrastructure to restore to. You may have a copy of the data or systems, but no infrastructure to run those systems or process that data. If you have your systems backed up to disk, tape, or the cloud, it may still be very time consuming to put those backups to use when you need them.
Definition: “Disaster Recovery” (DR)
DR refers to a more advanced form of system copy that includes the processing capability needed to run it. If you have a disaster, you should be able to bring your systems back online using your DR platform.
Definition: “High Availability” (HA)
High Availability, or “HA”, typically refers to local system redundancy. An HA platform has at least two of everything so that a secondary system can take the place of a primary system if the primary fails. HA systems are much harder to bring down, as they can typically take the failure of any single component in stride.
Putting it Together
Backups are great for restoring files after a mistake or corruption. DR is used to bring your company back online if there is a fire, flood, or other disaster. HA covers for a downed server or a downed network device.
So what does this mean from a business context? There are two additional concepts:
- Recovery Point Objective (RPO)
- Recovery Time Objective (RTO)
Recovery Point Objective (RPO)
RPO refers to the point in time to which your systems will be rolled back when they are brought back online. An RPO of 24 hours typically identifies a system that is backed up or replicated every 24 hours. As a result, if you lose a system, you should have a copy of that system’s data that is at most 24 hours old.
Recovery Time Objective (RTO)
RTO refers to how long it will take your IT department to bring the systems back online – how long until your users are working again.
For example, an RTO of 24 hours combined with an RPO of 24 hours indicates that a system failure on Wednesday at noon will likely get you back working on Thursday at noon using Tuesday’s data.
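The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `recovery_timeline` and the calendar date are made up for the example, and the result is the worst case implied by the stated RPO and RTO, not a measurement of any real system.

```python
from datetime import datetime, timedelta

def recovery_timeline(failure_time, rpo, rto):
    """Hypothetical helper: worst-case recovery timeline for one failure.

    Returns a tuple of (oldest data you may be restored to,
    latest time users should be working again).
    """
    oldest_restore_point = failure_time - rpo  # data may be this old
    back_online_by = failure_time + rto        # users working again by now
    return oldest_restore_point, back_online_by

# Assumed date: 3 January 2024 happens to be a Wednesday.
failure = datetime(2024, 1, 3, 12, 0)

# RPO = 24 hours, RTO = 24 hours (the example above).
data_from, online_by = recovery_timeline(
    failure, timedelta(hours=24), timedelta(hours=24)
)

# RPO = 24 hours, RTO = 7 days (daily backups that take a week to restore).
slow_data_from, slow_online_by = recovery_timeline(
    failure, timedelta(hours=24), timedelta(days=7)
)

print("Restored data as old as:", data_from)
print("Back online by:", online_by)
print("With a 7-day RTO, back online by:", slow_online_by)
```

With these inputs, the first case restores data from Tuesday at noon and has users working again on Thursday at noon. The second case restores the same Tuesday data, but users are not working again until the following Wednesday.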
“If we combine the concepts of backup, DR, and HA with RPO and RTO we can start to manage our expectations and consequent risk.” – CVM
Skeletons in the Closet
When CVM examines a new infrastructure, we are often the bearers of bad news. Too many corporate leaders have not been made aware that their daily system backups may take a week to put to use (RPO = 24 hours, RTO = 7 days). The data may be in the cloud, on disk, or on tape, but if the systems are not in place to process this data, the situation may be dire.
Do you have questions about backups and disaster recovery? Drop us a line, and we can schedule a complimentary phone call to answer your questions and help you ensure your bases are covered.