Understanding Data Replication and Application Recovery

white paper
Several factors influence the choice of replication technology between the primary and
remote sites. The main consideration is the recovery point objective. Other important
factors are the distance between sites, management complexity, data consistency and the
cost of the solution. Sanovi DRM software interfaces with popular technologies to
monitor and automate application recovery using various replication methods. This paper
discusses how the choice of replication method affects application recovery.
Introduction
A Disaster Recovery (DR) solution is the coming together of process and technology.
The solution must be designed to meet two key recovery metrics, Recovery Point
and Recovery Time. The choice of data replication is an important ingredient of
the DR solution. It is primarily influenced by the Recovery Point required of the
application. We take typical application Recovery Point Objectives and discuss
the replication technologies that are suitable to meet these objectives.
What is Recovery Point Objective (RPO)?
The Recovery Point for an application is the amount of data that the business can
afford to lose when an outage occurs. As an example, if the Recovery Point of
a core banking application is zero seconds, then the DR solution must be designed
such that no data is lost if the primary application becomes unavailable.
Any data that is recorded on the primary system must be available on at least
one remote system almost instantaneously in order to meet an RPO of zero seconds.
Recovery Point is Zero
This means that no data may be lost. Any data that is recorded on the primary
system must be available almost instantaneously on one or more remote systems.
Synchronous block-based replication is the preferred method to fulfil this
requirement. Popular synchronous replication solutions guarantee that if data is
recorded on the primary disk, it will also be available on the remote system disk.
The production application may have to wait for the acknowledgment that the write
on the remote system has completed before it can move forward.
This is an important consideration in the design of a DR solution since it impacts
application performance and scalability. It also requires that network
bandwidth be guaranteed to meet peak traffic loads. Synchronous replication
technologies are available in storage-array-based and host-based forms.
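The performance impact of waiting for the remote acknowledgment can be illustrated with a rough latency model. The sketch below is illustrative only: the function names and the sample timings are assumptions, and it assumes the local and remote writes are issued in parallel, which may not hold for every replication product.

```python
def sync_write_latency_ms(local_write_ms: float,
                          round_trip_ms: float,
                          remote_write_ms: float) -> float:
    """Latency seen by the application under synchronous replication.

    Assumption: the local write and the remote write start together, so the
    application waits for whichever path finishes last.
    """
    remote_path_ms = round_trip_ms + remote_write_ms
    return max(local_write_ms, remote_path_ms)


def async_write_latency_ms(local_write_ms: float) -> float:
    """Latency seen by the application when it does not wait for the remote site."""
    return local_write_ms


if __name__ == "__main__":
    # Hypothetical timings: 0.5 ms local write, 2 ms network round trip,
    # 0.5 ms write at the remote site.
    print("synchronous :", sync_write_latency_ms(0.5, 2.0, 0.5), "ms")
    print("asynchronous:", async_write_latency_ms(0.5), "ms")
```

In this model the network round trip dominates every synchronous write, which is why the distance between sites matters when the Recovery Point is zero.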
Recovery Point is Near Zero
Applications that can afford to lose only up to a few minutes of data need
near-zero DR solutions. The replication technology recommended to meet this
requirement is asynchronous block-based replication. As the name implies, in
asynchronous replication data is written to the primary system disk while, in
parallel, it is sent over to the remote system. The primary production
application does not wait for acknowledgement that the data has been written to the
remote system. For asynchronous replication, network bandwidth can be
designed to meet average network traffic. Asynchronous replication is available
on storage arrays, host-based software and network-based replication devices.
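The difference between sizing a link for peak traffic and for average traffic can be sketched as follows. This is a simplified model under stated assumptions: the write rates are hypothetical samples in MB/s taken at fixed intervals, and the function names are invented for illustration.

```python
def link_requirements(write_rates_mb_s):
    """Return (peak, average) write rate from a list of samples in MB/s."""
    peak = max(write_rates_mb_s)
    average = sum(write_rates_mb_s) / len(write_rates_mb_s)
    return peak, average


def max_backlog_mb(write_rates_mb_s, link_mb_s, interval_s=1.0):
    """Largest amount of data (MB) waiting to be sent to the remote site.

    Each sample adds (write rate - link rate) * interval to the backlog;
    the backlog can never fall below zero.
    """
    backlog = 0.0
    worst = 0.0
    for rate in write_rates_mb_s:
        backlog = max(0.0, backlog + (rate - link_mb_s) * interval_s)
        worst = max(worst, backlog)
    return worst


if __name__ == "__main__":
    samples = [10, 10, 40, 10, 10]  # hypothetical one-second samples
    peak, average = link_requirements(samples)
    print("peak:", peak, "MB/s, average:", average, "MB/s")
    print("backlog on average-sized link:", max_backlog_mb(samples, average), "MB")
    print("backlog on peak-sized link:", max_backlog_mb(samples, peak), "MB")
```

The backlog on the average-sized link is data that has not yet reached the remote site, so it represents what could be lost in an outage. A link sized below the average rate would let the backlog grow without bound.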
Recovery Point is greater than 30 minutes
Applications that can tolerate data loss of more than 30 minutes have more options
when selecting the replication technique. While block-based replication
can be used, file-based replication technologies often suffice. Off-the-shelf
databases like Oracle, MSSQL and Sybase lend themselves well to file-based
replication. Database log files can be dumped on the primary server on a periodic
basis; these log files are then replicated to the remote site for reconstruction of
the database there. There are other considerations that must be
thought through before choosing file-based replication. The application must
lend itself to working with files cleanly; replicating open files is not advisable.
The number and size of the files to be replicated, and the CPU usage that
host-based file replication requires, need to be accounted for.
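The three Recovery Point ranges discussed so far can be summarised as a simple lookup. This is a minimal sketch, not a sizing tool: the function name is hypothetical, and treating every Recovery Point between zero and 30 minutes as a candidate for asynchronous block-based replication is an assumption, since the paper only describes "up to a few minutes" and "30 minutes or more".

```python
from datetime import timedelta


def recommend_replication(rpo: timedelta) -> str:
    """Map a Recovery Point Objective to the replication method discussed above."""
    if rpo < timedelta(0):
        raise ValueError("RPO cannot be negative")
    if rpo == timedelta(0):
        return "synchronous block-based replication"
    if rpo < timedelta(minutes=30):
        # Assumption: the paper does not cover the range between
        # "a few minutes" and 30 minutes explicitly.
        return "asynchronous block-based replication"
    return "file-based replication (block-based replication is also an option)"


if __name__ == "__main__":
    for rpo in (timedelta(0), timedelta(minutes=5), timedelta(hours=1)):
        print(rpo, "->", recommend_replication(rpo))
```

Other factors named in this paper, such as distance, management complexity and cost, would still need to be weighed before settling on a method.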
Management Complexity
An important consideration when choosing a replication technology is its ease
of management: how easy is it to start, stop and monitor replication? Block
replicators are by their nature more involved and require training and expertise to
design the right solution, set it up and monitor its health and performance.
Most block replication software supports a command line interface; additional
software may be required to monitor and manage it with a GUI. Recovery
of applications using block replication usually requires several steps to ensure
data is consistent before it is made available to the application. Block-based
replication reports the data lag between primary and remote in blocks or tracks. For
this to be meaningful to the application, it must be translated to an application
recovery point, which is usually not an easy or obvious computation.
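As a rough illustration of that translation, the lag reported in blocks can be converted to an approximate time if a steady write rate is assumed. The function name and the figures are hypothetical, and the steady-rate assumption is exactly what makes the real computation hard: write rates vary, so this is only a first-order estimate.

```python
def estimate_recovery_point_seconds(lag_blocks: int,
                                    block_size_bytes: int,
                                    write_rate_bytes_per_s: float) -> float:
    """Approximate how many seconds of writes the replication lag represents.

    Assumption: the application writes at a constant rate, so
    time = pending bytes / write rate.
    """
    if write_rate_bytes_per_s <= 0:
        raise ValueError("write rate must be positive")
    pending_bytes = lag_blocks * block_size_bytes
    return pending_bytes / write_rate_bytes_per_s


if __name__ == "__main__":
    # Hypothetical figures: 20480 blocks of 512 bytes pending (10 MiB),
    # with the application writing 1 MiB per second.
    seconds = estimate_recovery_point_seconds(20480, 512, 1024 * 1024)
    print("approximate recovery point:", seconds, "seconds")
```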
File replication must be faithful to file attributes and address security
requirements. An important overhead of file replication is disk space management.
The amount of available space on the primary and remote systems must be
actively monitored, or replication can fail when a disk runs out of space.
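A basic free-space check of the kind described here can be written with the Python standard library. This is a minimal sketch: the function name and the 10 percent reserve are assumptions, and a real deployment would run the check on both the primary and remote systems and raise an alert rather than just return a value.

```python
import shutil


def has_enough_space(path: str, required_bytes: int,
                     reserve_fraction: float = 0.10) -> bool:
    """Return True if `required_bytes` fit on the volume holding `path`
    while still leaving `reserve_fraction` of the volume free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    reserve_bytes = int(usage.total * reserve_fraction)
    return usage.free - required_bytes >= reserve_bytes


if __name__ == "__main__":
    # Check whether a hypothetical 500 MB batch of files fits on the
    # volume that holds the current directory.
    print(has_enough_space(".", 500 * 1024 * 1024))
```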
Data Consistency - Application view
Having a consistent copy of data is a very important requirement that must be
understood and designed for when using a replication technology. We can
understand this better with an example. The contents of a database are typically
made up of several data files. The data files reside on multiple disk volumes.
When using block replication, the DR solution must be designed to
ensure that all of the disk volumes are replicated and are consistent with respect
to each other before they are presented to the application. If file-based replication is
in use, the DR solution must ensure that all the files that make up the database
contents are available and are consistent with respect to each other before
being presented to the application. Getting to a consistent state when using file
replication is usually easier to implement. With block-based replication, a
point-in-time copy of the volume and a check to ensure that the data is consistent may be
required before presenting the data to the application for recovery.
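For file-based replication, one way to check that all files are available at the remote site is to compare them against a manifest of checksums taken on the primary. The sketch below is an assumption about how such a check could look, not a description of any product: the function names are hypothetical, and it only verifies that the remote files match the manifest. It does not prove that the files were mutually consistent when the manifest was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: Path) -> dict[str, str]:
    """Record the checksum of every regular file in `directory` (primary side)."""
    return {p.name: sha256_of(p)
            for p in sorted(directory.iterdir()) if p.is_file()}


def find_inconsistencies(directory: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or differ from the manifest (remote side)."""
    problems = []
    for name, expected in manifest.items():
        candidate = directory / name
        if not candidate.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(candidate) != expected:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

An empty list from `find_inconsistencies` means every file named in the manifest is present and unchanged, which is the precondition for presenting the data to the application.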
Applications with Built-in Replication Support
Application vendors now provide replication of data that is
built into the application. An example is Oracle, which has a feature called
Data Guard for replication of its database. Using built-in replication has the
advantage that data consistency is built in, hence the recovery procedure is usually
simpler. The disadvantage of this approach is that it applies only to a specific
application; an enterprise that operates a heterogeneous environment will not be
able to standardize on one replication technology.
Sanovi DRM File replication (PFR)
Sanovi’s product comes packaged with file replication software. File-based
replication is recommended for applications whose Recovery Point is thirty
minutes or greater. It is also very effective when application environment and
configuration files have to be replicated. The advantage of file replication
is that it is simple to operate and manage. PFR is fully integrated with Sanovi’s
monitoring and management suite and has built-in alerts on disk quotas on the
primary and remote systems. It supports include/exclude file lists to control the
file types that get replicated. Sanovi’s PFR does not replicate open files.
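The idea of an include/exclude list can be sketched with shell-style patterns. This is a generic illustration and not PFR's actual syntax or behaviour: the function name, the use of glob patterns and the rule that an exclude match takes precedence are all assumptions.

```python
from fnmatch import fnmatchcase


def should_replicate(filename: str, include: list[str], exclude: list[str]) -> bool:
    """Decide whether a file is replicated.

    Assumption: a file is skipped if it matches any exclude pattern;
    otherwise it is replicated only if it matches an include pattern.
    """
    if any(fnmatchcase(filename, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(filename, pattern) for pattern in include)


if __name__ == "__main__":
    include = ["*.log", "*.conf"]   # hypothetical patterns
    exclude = ["*.tmp", "debug_*"]
    for name in ("redo_001.log", "app.conf", "scratch.tmp", "debug_trace.log"):
        print(name, "->", should_replicate(name, include, exclude))
```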
Summary
Replication is an important ingredient of a DR solution. Recovery Point
influences the choice of the replication technology. Monitoring replication
status, starting and stopping replication, and creating consistent data copies
must all lend themselves to easy automation. Application recovery can be made
more predictable and easier to design and deploy with the right choice of
replication technology.
About Sanovi
Sanovi helps organizations across the globe proactively manage Disaster Recovery (DR) environments
and assure business managers that applications can be recovered in compliance with service level
agreements. Sanovi DRM is a comprehensive family of enterprise-class DR management software solutions for
validating, monitoring, testing and automating recovery. For more information visit www.sanovi.com