Gathering CDP Info for My Tome on DR

Here is a round-up I have put together around the current state of CDP.  I would welcome any additions or corrections.

Continuous Data Protection (CDP)

Simple definition:  replication of change data on a routine basis to alternative (backup) media.  Purists claim that CDP differs from traditional backup in that restoral of data to a particular system or application state is more granular (i.e., you can restore a database to a selected point in time).

Some categories:

1/  Asynchronous (Copy After Write)

  • Backup:  write data, then copy it on a routine basis to alternative media
  1. Full Backups:  copy all data to alternative media on a routine basis
  2. Incremental Backups:  copy only changed data to alternative media, then post to full backup copy
  • Snapshots:  capture file system state on a routine basis, then copy to alternative media
  • Mirror split imaging:  Establish a media mirror.  At predetermined intervals, quiesce application, flush storage cache to mirror media (to obtain a complete data set with “crash consistency”), time stamp the mirror media indicating time of write, break mirror pair, set time stamped media aside, apply fresh media to serve as mirror, restart application.  Repeat at regular intervals.
  • “Application Focused” or “Application Aware” Journaling:  Completed file writes or database transactions echoed to backup media.
  • Block/Byte Change Data Replication:  Local mirrored media assessed for changes at the block or byte level on an on-going basis and at some predefined interval changed blocks or bytes are copied to alternative media.  Given an intact full copy or mirror of primary data, the data set can be dialed forward or backwards to a specific point in time to re-establish a full data set.  Prior to its acquisition by Symantec, Revivio delivered this as Time Addressable Storage (TAS) – a cheaper alternative to Point in Time Mirror Split Imaging, since less expensive disk can be used to hold the change data than the disk used to capture data originally.   
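The "dial forward or backwards" idea behind block-level change data replication can be illustrated with a small sketch. This is a toy model under stated assumptions, not Revivio's or any vendor's implementation: the `ChangeJournal` class, its method names, and the in-memory dictionaries standing in for disks are all hypothetical.

```python
from bisect import bisect_right


class ChangeJournal:
    """Toy time-addressable store: a baseline image plus a log of block writes."""

    def __init__(self, baseline):
        self.baseline = dict(baseline)  # block number -> contents of the full copy
        self.log = []                   # (timestamp, block number, data), in time order

    def record_write(self, timestamp, block, data):
        # Assumes writes are journaled in the order they occurred.
        if self.log and timestamp < self.log[-1][0]:
            raise ValueError("writes must be journaled in time order")
        self.log.append((timestamp, block, data))

    def image_at(self, point_in_time):
        """Rebuild the full data set as it stood at point_in_time."""
        image = dict(self.baseline)
        timestamps = [t for t, _, _ in self.log]
        cutoff = bisect_right(timestamps, point_in_time)
        # Replay every journaled write up to and including the chosen moment.
        for _, block, data in self.log[:cutoff]:
            image[block] = data
        return image


journal = ChangeJournal(baseline={0: b"AAAA", 1: b"BBBB"})
journal.record_write(10.0, 0, b"good")
journal.record_write(20.0, 1, b"good")
journal.record_write(30.0, 0, b"BAD!")   # the corrupting write

restored = journal.image_at(29.9)        # dial back to just before the corruption
```

Note that the sketch depends on an intact baseline, which is the same dependency the text describes: without the full copy, the change log alone cannot re-establish a data set.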

2/  Synchronous (Copy On Write)

  • WAN-based Mirroring – same data written to two targets in real time. 
    Depending on the distance between write targets, remote writes may be acknowledged as complete prior to actually writing the mirror data to the remote target (“spoofing”) so as not to “hold the channel high” or to degrade application I/O performance. 
  • WAN-based Mirroring with CDP
    Used in conjunction with CDP at the byte or block level, WAN-based mirroring can be said to optimize WAN bandwidth by moving only changed data from the production site to the recovery site.  Depending on the implementation, however, the price for such a solution can be prohibitive due to hardware vendor lock-ins (i.e., only vendor X’s equipment can be used with the software necessary for CDP operations and for remote mirroring).
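The difference between strict mirroring and "spoofed" acknowledgement comes down to when the application is told the write is complete. The sketch below is a simplified model with hypothetical names (`MirroredVolume`, `drain`, `exposed_blocks`); real products handle ordering, retries and link failures that are ignored here.

```python
from collections import deque


class MirroredVolume:
    """Toy model of a local volume mirrored to a remote target over a WAN."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = {}            # block number -> data at the production site
        self.remote = {}           # block number -> data at the recovery site
        self.in_flight = deque()   # writes acknowledged locally but not yet remote

    def write(self, block, data):
        """Apply a write and report when the application was acknowledged."""
        self.local[block] = data
        if self.synchronous:
            # Strict mirroring: the remote copy is committed before the ack.
            self.remote[block] = data
            return "acknowledged after remote commit"
        # "Spoofed" mirroring: ack now, ship the write across the WAN later.
        self.in_flight.append((block, data))
        return "acknowledged before remote commit"

    def drain(self):
        """Deliver queued writes to the remote target, in order."""
        delivered = 0
        while self.in_flight:
            block, data = self.in_flight.popleft()
            self.remote[block] = data
            delivered += 1
        return delivered

    def exposed_blocks(self):
        """Blocks whose latest write would be lost if the primary failed now."""
        return {block for block, _ in self.in_flight}


strict = MirroredVolume(synchronous=True)
strict.write(7, b"ledger")

spoofed = MirroredVolume(synchronous=False)
spoofed.write(7, b"ledger")
exposed_before_drain = spoofed.exposed_blocks()
spoofed.drain()
exposed_after_drain = spoofed.exposed_blocks()
```

The spoofed volume spares the application the WAN round trip, at the price of a window in which acknowledged writes exist only at the production site.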

Summary of Leading Products

1/  CDP at the Operating System/File System Level

Microsoft System Center Data Protection Manager (DPM)

DPM is a CDP solution leveraging Redmond’s Shadow Copy technology (Volume Snapshot Service or VSS).  Through the integration between the Volume Shadow Copy Service, hardware or software VSS providers, application level writers and backup applications, VSS enables integral backups that are point in time and application level consistent without the backup tool having knowledge about the internals of each application…The end result is similar to a versioning file system, allowing any file to be retrieved as it existed at the time any of the snapshots was made. Unlike a true versioning file system, however, users cannot trigger the creation of new versions of an individual file, only the entire volume. As a side-effect, whereas the owner of a file can create new versions in a versioning file system, only a system administrator or a backup operator can create new snapshots (or control when new snapshots are taken), because this requires control of the entire volume rather than an individual file. Also, many versioning file systems (such as the one in VMS) implicitly save a version of files each time they are changed; systems using a snapshotting approach like Windows only capture the state periodically.  All volumes must be NTFS file system formatted.

Once a snapshot is created, it must be replicated to a remote device to have efficacy as a business continuity solution.  Commands such as COPY or third party data mover software are used to provide WAN-based CDP.

Sun Microsystems’ ZFS

ZFS is capable of creating an unlimited number of snapshots of ZFS volumes using the OpenSolaris Service Management Facility.  The same scripting can be used with Windows DPM and NTFS.  The snapshots can, in turn, be configured as a CDP process with considerable control over granularity and frequency of snapshotting (which surmounts some limitations of Windows DPM alone).

COPY or third party data mover software is required for WAN-based replication.
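As a rough illustration of the scripted, periodic snapshotting described above, the sketch below builds the `zfs snapshot` and `zfs destroy` command lines a scheduler might issue. The dataset name `tank/data`, the `cdp-` naming prefix and the helper functions are assumptions made for the example; the commands are only constructed here, not executed, and how the schedule is registered with the Service Management Facility is left out.

```python
from datetime import datetime, timezone

SNAPSHOT_PREFIX = "cdp-"  # hypothetical naming convention, not a ZFS requirement


def snapshot_name(dataset, when):
    """Name a snapshot so that lexical order matches chronological order."""
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return f"{dataset}@{SNAPSHOT_PREFIX}{stamp}"


def snapshot_command(dataset, when):
    """Command line that captures the dataset's state at this moment."""
    return ["zfs", "snapshot", snapshot_name(dataset, when)]


def prune_commands(existing, keep):
    """Commands that destroy all but the newest `keep` snapshots."""
    ordered = sorted(existing)
    doomed = ordered[:-keep] if keep > 0 else ordered
    return [["zfs", "destroy", name] for name in doomed]


when = datetime(2009, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
take = snapshot_command("tank/data", when)

existing = [
    "tank/data@cdp-20090901T100000Z",
    "tank/data@cdp-20090901T110000Z",
    "tank/data@cdp-20090901T120000Z",
]
prune = prune_commands(existing, keep=2)
```

Running the scheduler more often narrows the gap between recovery points; the retention count bounds how far back those recovery points reach.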

2/  Traditional Backup Software Suites Now Offer CDP Functionality

Symantec and Atempo are addressed in this article from InfoStor:

“The key trend in the CDP space over the last year or so has been a movement away from standalone products to CDP as part of traditional backup/recovery suites and/or integrated into appliances that can be configured with other functionality, such as snapshots, replication, etc.”

“’The pure play CDP products have pretty much disappeared, and CDP functionality has been integrated into broader solutions, such as appliances from vendors such as FalconStor, InMage, DataCore, etc. that combine CDP with technologies such as snapshots and replication,’ says Eric Burgener, an analyst with the Taneja Group research and consulting firm. ‘And backup vendors, such as Symantec and Atempo and others, have integrated CDP for data capture into their backup products.’”

“Symantec, for example, offers two CDP products: the Windows-only Continuous Protection Server for its Backup Exec software, and Veritas NetBackup RealTime Protection for NetBackup software. RealTime is block-level CDP, based in part on technology that Symantec got via its Revivio (see Time Addressable Storage above) acquisition, that can be used as a standalone product but is typically used in integration with NetBackup and its application-specific agents.”

Revivio TAS technology essentially performs block level change data copy to a standalone disk target and provides the means to recover a dataset to a specific point in time (as associated with an index or catalog of change block writes), relying in the case of databases on the database software itself to perform crash recovery (i.e., to reconcile incomplete transactions and to create a viable database for use, marking or discarding those transactions that did not complete.)  Replicating the TAS environment requires the copying of both the TAS index/catalog and the primary dataset itself to an alternate location.

CA XOsoft

From the CA Website:  “CA XOSOFT REWIND CDP TECHNOLOGY, which is built into the data replication engine to provide true CDP to augment backups and snapshots. It does this by filling in the protection “gap,” avoiding data corruption or accidental deletion between backups and snapshots. Acting in a similar manner to the rewind function on a VCR, this technology allows for granular data corruption recovery by giving you the ability to return back to seconds before the disruption event occurred. This option helps you avoid losing all data modified or created since the last backup or snapshot.”

CDP data is written to a user-designated disk target.  It can be replicated over distance using XOsoft’s replication engine. 

IBM Tivoli

Another well-known backup and storage management software suite, IBM Tivoli also provides CDP functionality.  On April 21, 2008 IBM acquired a company, FilesX, that specializes in continuous data protection and nearly instant data and application recovery software for enterprises and remote/branch offices. FilesX Xpress Restore software enables global enterprises to have full protection and easy recovery of data for mission-critical Windows applications, in the data center and in remote branch offices, such as Microsoft® Exchange, SQL, Lotus Notes™, Oracle and SAP. The FilesX technology is now integrated and branded in the IBM Tivoli Storage Manager (TSM) family of products.

NSI Software’s Double-Take

Of Double-Take from a review in Windows Server magazine:  “Double-Take provides real-time byte-level data replication and support for point-in-time recovery (where you can restore not only the most recent replicated data but data from a specific historical point). These capabilities let Double-Take offer significant data-recovery flexibility to IT departments at SMBs and larger enterprises. Double-Take runs on various Windows OSs. Its disk-space requirements depend on the amount of data you want to replicate.”

“Using Double-Take’s wizard-driven installation, I installed the software on two servers and set up a data-replication configuration in about 30 minutes. (Double-Take doesn’t require the server pairs involved in the replication to be matched.) More complex tasks, such as configuring data protection and failover for applications such as Exchange or Microsoft SQL Server will take a bit longer, although Double-Take provides a wizard for setting up the CDP software for Exchange (the product supports Exchange Server 2003 and Exchange 2000 Server) and a detailed application note containing guidelines for setting up protection for SQL Server.”

“You manage Double-Take through a Microsoft Management Console (MMC) snap-in that gives you a consistent UI for all Double-Take tasks. For my testing, I set up a simple mirror across two servers of a 1GB data set that I configured to change fairly quickly, with roughly 10 percent of the data being modified every 10 minutes. Double-Take had no problem keeping up with that high a rate of data change on the servers connected over my LAN. To simulate a WAN connection, I told Double-Take that the two servers were connected over a T1 connection and that the software could use no more than 50 percent of the connection speed. I then configured my test to change roughly 1 percent of the data set every 10 minutes. I monitored the replication process from the management console and saw that Double-Take had no problems moving the data and throttling its bandwidth usage as I’d requested. Double-Take’s WAN data-replication feature also provides data compression.”
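The reviewer's choice of a 1 percent change rate for the simulated WAN can be checked with some back-of-the-envelope arithmetic. The sketch below assumes a nominal T1 line rate of 1.544 Mbit/s, reads "1GB" as 10^9 bytes, and ignores compression and protocol overhead; none of these figures come from the review itself.

```python
T1_BITS_PER_SECOND = 1_544_000   # nominal T1 line rate (assumption)
USABLE_FRACTION = 0.5            # the 50 percent cap set in the review
DATA_SET_BYTES = 10**9           # "1GB" read as 10^9 bytes (assumption)
WINDOW_SECONDS = 10 * 60         # changes arrive every 10 minutes


def seconds_to_replicate(changed_bytes, link_bps, usable_fraction):
    """Time to push changed_bytes through the permitted share of the link."""
    return changed_bytes * 8 / (link_bps * usable_fraction)


# 1 percent of the data set, as used in the simulated WAN test.
wan_case = seconds_to_replicate(DATA_SET_BYTES // 100,
                                T1_BITS_PER_SECOND, USABLE_FRACTION)

# 10 percent of the data set, the rate used on the LAN, pushed over the same link.
lan_rate_over_wan = seconds_to_replicate(DATA_SET_BYTES // 10,
                                         T1_BITS_PER_SECOND, USABLE_FRACTION)

print(f"1% change:  {wan_case:.0f} s of a {WINDOW_SECONDS} s window")
print(f"10% change: {lan_rate_over_wan:.0f} s of a {WINDOW_SECONDS} s window")
```

Under these assumptions the 1 percent case needs roughly 104 seconds of each 600-second window, while the LAN test's 10 percent rate would need more than 1,000 seconds and would fall steadily behind on half a T1.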

NeverFail Group’s NeverFail

In 2005, NeverFail Group added CDP to its NeverFail business continuity software set:

“Neverfail’s Data Rollback Module leverages Microsoft’s Volume Shadow Copy Service (VSS) technology to ensure the highest level of compatibility across the Microsoft environment. By taking shadow copies of a secondary passive server, Neverfail avoids the typical “freezing” of an application that normally occurs with VSS on a typical production server, thereby permitting more frequent shadowing than in a single server infrastructure. More frequent shadowing means greater granularity; up to 512 snapshots can be stored at any one time in an ultra-efficient format that only keeps track of differences between snapshots.”
Snapshots are replicated over a WAN using NeverFail’s replication services.

Some hardware vendors have apparently chosen to pursue the software-only avenue for delivering their CDP/Replication solutions, including HP and Hitachi Data Systems. 

HP

HP released a product in April 2006 called Continuous Information Capture Solutions (CICS) which delivered CDP functionality to capture changes in Oracle apps and Microsoft Exchange to a ‘Recovery Appliance’ (and the old data to the secondary storage device).  CICS was RecoveryOne, a CDP product from Mendocino Software, rebranded.  In early 2008, Mendocino went belly up, leaving HP looking for a new approach.

HP responded by re-branding its OmniBack Backup Software and adding functionality to leverage the file change log that Microsoft Windows provides (W2003 and newer) to create incremental backups.  Wrote one pundit:  “The more often the incremental backup is run, the less changed data there is to copy.  Theoretically this could be done once per minute.  It’s not continuous, but it is pretty close.  And, in Data Protector 6.1 at least, there is nothing extra to buy.  It is a check box in the backup specification.”

HDS

Hitachi Data Systems’ premier array platform, the USP, is a hybrid disk array back-end and virtualization controller front end. This seems to have guided the company toward a strategy of preferring hardware agnostic approaches for things like CDP and replication.  In September 2009, the company made noise about a new relationship with InMage Software, stating in a press release that HDS “will co-brand and resell the InMage Appshot technology.”

InMage solutions, according to the release, “combine continuous data protection, remote replication and application failover in a single, easy to deploy and manage platform that provides comprehensive recovery capabilities for midmarket companies. InMage supports rapid, reliable recovery for key enterprise applications including Microsoft Exchange, SQL and SharePoint as well as Oracle, MySQL, BlackBerry Server, SAP, and any Windows, Linux or UNIX file system, among others.”

InMage uses a hardware component (the Scout) and agents to instrument infrastructure for CDP and replication operations per the illustration below:

[Illustration not reproduced: img_technolory_04]

 
According to the vendor’s description of operation:  “InMage Scout collects data locally across a LAN, and then replicates that data to one or more targets that can be either local or remote. A distinguishing feature of our architecture is that it minimizes impacts on the production servers while still offering significant functionality. It can do this because most of the functionality has been off loaded to the CX - the data taps on the production servers are basically just collecting writes as they occur and sending them to the CX. This model is much more scalable and requires significantly less overhead than host-based replication products, does not impose the vendor lock-in or cost of array-based replication, and does not require a SAN or deployment in appliance pairs like most appliance options. It is a truly unique architecture in the industry that deploys rapidly, is rich in functionality, and is very flexible: we call it our ‘hybrid recovery technology’.”

No announcements have been made by HDS as to how the product will be integrated with existing HDS products.

3/  Storage Virtualization Controllers offer CDP too!

As mentioned in the InfoStor piece cited previously, storage virtualization software vendors such as Symantec, DataCore Software and FalconStor Software all offer CDP functionality.  Some hardware-based storage virtualization products also support CDP.

However the functionality is hosted, the general procedure is the same.  The controller establishes virtual disks which may be local or remote to the production environment.  Data can be written to two virtual disk targets at once in the local environment (simple mirroring), with the second virtual disk and its contents replicated across a WAN to a remote target disk.  FalconStor and DataCore have added sophisticated buffering and analysis features that can facilitate the replication of only change data from local to remote targets.  EMC Invista with Kashya technology provides comparable functionality, but it is a kluge.

FalconStor 

FalconStor CDP, built upon the FalconStor IPStor® storage virtualization platform, enhances the traditional data protection paradigm (once-a-day backups) with ongoing, real-time, disk-based backup, for fast, granular recovery to any point in time.

From the vendor’s description, “FalconStor CDP eliminates dependency on the backup window by providing continuous disk-based backup of file servers, databases, email systems, etc.” “Through FalconStor DiskSafe™ or any industry-standard host-based volume manager, FalconStor CDP continuously captures and stores block-level changes to the primary data, enabling recovery to any point in time. Data can then be securely replicated offsite for enhanced DR protection.”

The vendor argues that snapshots plus continuous journaling deliver 100% transactional integrity.

FalconStor TimeMark® snapshot technology and application-aware Snapshot Agents “provide transactionally-consistent, space-efficient delta snapshots at regularly scheduled times.” The snapshots can be mounted by storage administrators to validate the content; perform file, email, and database recovery; and accelerate backup via the FalconStor HyperTrac™ Backup Accelerator.

DataCore Software

DataCore virtualizes storage via a software layer and with functionality from a once standalone add-on product Traveller™ CPR (now integrated into its flagship SANsymphony kit).   From the vendor’s description:

“Traveller™ CPR is a continuous data protection, recovery and timeshifting platform. It combines true continuous data protection (CDP) with the power, simplicity and flexibility of virtualization. With “dial back the clock” simplicity, Traveller restores data to any prior point in time selected, without impact to production and without the overhead of software agents or other host support. The restored data volumes (known as “MakeTime volumes”) can then be assigned to servers directly from the Traveller interface with a simple mouse click, and are immediately ready for use by applications in production or offline.”

More information is coming on the status of Traveller from DataCore.

EMC Invista

EMC bought startup Kashya in 2006 to leverage its CDP technology in conjunction with EMC’s Invista storage virtualization controller platforms. Prior to the acquisition, EMC’s CDP solutions were mainly on-hardware technologies – used in connection with its Symmetrix array products, a combination of TimeFinder software (for volume cloning and point in time mirror splitting), and Symmetrix Remote Data Facility (SRDF) software for inter-array replication in synchronous or asynchronous modes (see below).

From Network Computing (February 2007):

“RecoverPoint, which EMC acquired when it bought Kashya, supports continuous snapshots, typically seconds apart, as well as full CDP, albeit limited to SAN devices… Last October, the vendor unveiled an enterprise version of the RecoverPoint software, capable of sharing data across its Clariion devices and SAN products from other vendors. Today, EMC took the wraps off an SMB version of the software, RecoverPoint SE, which only works across Clariion devices, and, at $10,000 per array, is significantly cheaper than the $83,000 enterprise version.”

“EMC execs refused to say when they are likely to extend this level of functionality to their Celerra NAS devices, although rival NetApp is also dragging its feet in this area.”

IBM SAN Volume Controller (SVC)

In 2007, IBM enhanced its SVC storage virtualization product with support for enhanced FlashCopy services, including Incremental FlashCopy, described as a “point in time” CDP approach.

Through its partnerships with third party storage virtualization software developers, IBM SVC storage can also obtain CDP coverage for its virtual disks using other CDP products (e.g., FalconStor, DataCore, Symantec and others).

4/  On-Hardware Implementations

NetApp

Over the years, NetApp has tended to prefer a snapshot-oriented approach to a more granular point in time CDP oriented approach.  From an InfoStor article:

“NetApp argues that snapshots are sufficient for most companies’ recovery requirements. The argument is simple: ‘Snapshots can meet most companies’ RTO and RPO requirements,’ says David Chapa, director of backup/recovery solutions marketing. ‘CDP provides finer granularity, but how granular does a company need to get? It just depends on the importance of the data and how granular you need the recovery point to be.’”

CTO Emeritus Dave Hitz wrote in his blog on November 4, 2005:

“Roughly speaking, the strict technology camp consists of startups that are developing the write-logging technology (e.g. Mendocino and Revivio), while the broad business camp consists of more established companies who believe they have other technologies that address these business issues (e.g. Microsoft and NetApp)”

“I’ve been living on snapshot-based storage since fall of 1992, when James and I first switched our home directories to a NetApp filer. My experience is that hourly snapshots have always been sufficient to save my bacon. Occasionally I have wanted to see a file as it was at 2:37 p.m, instead of going back to 2:00 p.m, but that’s been rare. Usually hourly snapshots were plenty good, and way cheaper in both capacity and performance than logging every write.”

Despite its preference for snapshots, the company has tried to build a CDP solution to accommodate customers desiring such technology.  NetApp got its first CDP story with the acquisition of Alacritus in April 2005. Alacritus provided Virtual Tape Library firmware needed to create NetApp’s nearline/SATA backup product, which was then combined with partner software products from Symantec and IBM Tivoli to deliver CDP functionality. 

Despite acquiring CDP start-up Topio in 2006 and briefly fielding a product based on its technology (SnapMirror for Open Systems), the company once again touts snapshot replication as its key CDP approach.  A couple of slides from a recent event extol the virtues of the architecture of NetApp Snapshots:

 

[Slides not reproduced: zumap2009_netapp1, zumap2009_netapp2]

 

NetApp claims that its Snapshot based replication is also superior to competitors from the standpoint of capacity usage efficiency and reduced impact on production workload burden, per these slides from 2009:

[Slides not reproduced: capture5, capture6]

EMC 

While EMC’s small and midsize enterprise CDP and replication solutions are mostly software based (Replicator and Kashya-derived technology for its Invista storage virtualization controller), the company has long leveraged on-array technologies such as TimeFinder for volume cloning and Point in Time Mirror Split Imaging and SRDF for replication.  These technologies have been part of EMC’s two-hop and multi-hop mirroring strategies for over a decade. 

TimeFinder is a software product designed for Symmetrix class products that makes clones, snapshots and granular Point In Time split mirror images using Symmetrix disk.  Interestingly, the latest product spec sheets on this technology minimize the PIT Splitting functionality of the product, emphasizing instead snapshots and “checkpoints.”  This may reflect the widespread realization that PIT mirrors are rarely crash consistent, given that most database corruption events are not discovered until 24 to 48 hours after they occur, which diminishes the value of PIT copies.

SRDF, on the other hand, is the grand old man of replication facilities.  It sports synchronous and asynchronous operational modes and connects together only Symmetrix boxes for replication of business continuance volumes (BCVs) across LANs and WANs.

IBM

My information on on-array IBM CDP/Replication is a little sparse.  There is little I can find about it, other than their Tivoli/FilesX product for backup and their acquisition of Softek’s Transparent Data Migration Facility (TDMF), which provides much the same functionality as EMC SRDF, but with support for non-IBM gear in the replication/migration process.

Again, if you have corrections or additions to this list, or if you are a vendor of CDP and replication products and want yours added, please comment.

3 Responses to “Gathering CDP Info for My Tome on DR”

  1. Ernst Lopes Cardozo Says:

    Jon,

    I am, again, baffled by the storage industry’s ability to mix up their terminology. Continuous Data Protection, like the words say, would indicate that data is _always_ protected, not once a day, once an hour or once a second. It makes data protection part of doing a transaction. If the protection fails, the transaction fails. OK, call me a purist, but I like to be able to look up words in a generic dictionary rather than have to consult a vendor-specific glossary. (Continuous: marked by uninterrupted extension in space, time, or sequence). I’m not saying that anybody needs CDP, just that the term should mean something different from Backup, Snapshots or (A)Synchronous replication. CDP is not replication, because CDP would allow me to undo a bad transaction or a data corruption by the application, malware or the OS, whereas the replication would faithfully corrupt the replica.

    So “on a routine basis” -> “on a continuous basis”.

    It seems reasonable to distinguish three forms of CDP, each offering protection against a different set of threats: local CDP keeps the history on the same volume. If something happens to the volume, your data is still gone. Replicated CDP would store the history on a different volume/storage box, but on the same premises. Remote CDP would ship the history to a remote site, so that remote backup + history can restore your data up to the transaction where things went wrong. It is a transaction log of the storage layer, rather than the file system or database.

    If CDP just means “regular data protection”, what have we gained this century?

  2. Administrator Says:

    I agree with you, Ernst. CDP should mean something different, but the marketects are blending it with snapshots (“good enough”). I suspect this is because the CDP products introduced a few years back didn’t resonate with consumers and didn’t sell well into the headwinds created by the marketing pushback of vendors of traditional backup and on-array replication.

    The same holds true for so many technologies that were bandied about in the past decade.

  3. az990tony Says:

    Hi Jon,
    Another bleg for more info?

    The FilesX product has been renamed IBM Tivoli Storage Manager FastBack, including FastBack for Microsoft Exchange, and FastBack Bare Machine Recovery:
    http://www-01.ibm.com/software/tivoli/products/storage-mgr-fastback/

    SAN Volume Controller offers Vdisk Mirroring, Metro Mirror and Global Mirror, comparable to EMC SRDF options:
    http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg247574.html

    IBM DS8000 series offers Metro Mirror, Global Mirror, and three-site Metro/Global Mirror:
    ftp://ftp.software.ibm.com/common/ssi/pm/sp/n/tss00241usen/TSS00241USEN.PDF
    http://www.redbooks.ibm.com/abstracts/sg246788.html
    http://www.redbooks.ibm.com/abstracts/sg246787.html

    Of course, our DS4000, DS5000, and N series have copy services:
    http://www.redbooks.ibm.com/abstracts/sg247591.html

    Let me know if you need anything else!

    Tony Pearson (IBM)
