Why Are You Copying Your Data? Part 3 – Unification

In the last two blog posts of this series, I looked at the copies we make to protect data and the copies we make to improve the business. Since we are making these copies of the same data for different purposes, it is worth considering how a single product could make them without a lot of redundant copying and storage. Each time we copy the production data we impact the production system, so minimizing that impact should result in a business benefit. The challenge is that the different reasons for copying data have very different requirements, so a single product for all of these needs will have to be flexible and feature rich.

Disclosure: This post is part of my work with Cohesity.

Requirements

| Purpose | Data to restore | Restore characteristics | Special characteristics |
|---|---|---|---|
| Backup & Recovery | Recent | Granular; fast | On-site storage; frequent restore |
| Disaster Recovery | Recent | Workflow; full VM and application | Off-site storage; rare recovery, mostly for testing |
| Compliance | Many past copies | Very granular | Immutable copy; almost never recovered |
| Reporting | Latest | Immediate; application aware | Automated; daily or weekly restore |
| Test and Development | Latest | Immediate; data aware; multi-VM | Restore controlled by external workflow tools; multiple restores per day |
| Migration | Latest | Test workflow; failover workflow | Only used for duration of migration process |

Desirable Features

To consolidate all of the copying functions into a single platform we will need a few features:

  • Indexing with data and application awareness
  • Local storage for fast restores
  • Replication to another data centre or cloud for DR and migration
  • Replication to public cloud storage for compliance
  • High performance storage for DR, reporting, and test/dev use
  • Low cost storage for compliance and archive
  • Public cloud integration for recovery
  • An API for integration
  • Pre-built integration with common IT and application process automation

I have left out the basics that we need before even considering the platform: reliable and scalable storage, and integration with our hypervisor.

Implementation

Such a diverse set of requirements leads to an interesting platform design that looks a lot like a cost-efficient modern storage array, at least for the deployment into our data centre:

  • Tiered Storage

For reporting and test/dev we need performance. A relatively small amount of solid-state storage delivers that performance, while a lot of hard disk capacity keeps costs down.

  • Deduplication

Deduplication keeps both physical capacity and replication bandwidth under control. Long-term compliance storage could get very large without deduplication. A side benefit of deduplication is that only the metadata for each compliance point needs to be protected from modification; the deduplicated data is protected by definition, since a block that many copies share cannot be changed in place.
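To make the deduplication point concrete, here is a minimal sketch, assuming fixed-size chunks addressed by their SHA-256 digest. The `DedupStore` class and its method names are made up for illustration and are not any product's API. It shows two things: identical chunks are stored once, and each copy is just a small list of digests (the metadata), so a changed chunk no longer matches the digest that the metadata records.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # digest -> chunk bytes, stored once
        self.recipes = {}  # copy name -> tuple of digests (the metadata)

    def put(self, name, data):
        """Store a copy; only chunks not seen before consume capacity."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        self.recipes[name] = tuple(recipe)

    def get(self, name):
        """Rebuild a copy from its recipe."""
        return b"".join(self.chunks[d] for d in self.recipes[name])

    def physical_bytes(self):
        """Capacity actually used by unique chunks."""
        return sum(len(chunk) for chunk in self.chunks.values())

    def verify(self, name):
        """Check every chunk still matches the digest in the recipe."""
        return all(
            hashlib.sha256(self.chunks[d]).hexdigest() == d
            for d in self.recipes[name]
        )
```

Storing `b"AAAABBBBCCCC"` and then `b"AAAABBBBDDDD"` with a chunk size of 4 keeps four unique chunks rather than six, and `verify` fails for any copy whose chunk has been altered behind the recipe's back. A real platform would use much larger, possibly variable-size chunks and would protect the recipes themselves from modification.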

  • Scale-Out architecture

Even with deduplication we expect capacity to grow over time. A scale-out architecture allows this growth to occur incrementally.

No matter where we deploy the platform, we want integration, simplicity, and efficiency.

  • Public cloud support

We want flexible options to use public cloud: as a destination for compliance data copies, as a location to run reporting and dev/test workloads, and as a source of business data that needs to be copied from cloud-native applications.

  • APIs everywhere

To integrate with existing reporting and dev/test tools, the data copying platform needs pre-built integrations with common platforms, with APIs as the fallback for anything those integrations do not cover. This might mean integration with Jenkins for CI/CD using copies of live data.
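As a rough illustration of what a CI/CD job might do when it falls back to an API, the sketch below builds (but does not send) a request asking the platform for a fresh copy of production data. Everything specific here is an assumption: the `/clones` endpoint, the JSON field names, and the bearer token are hypothetical and do not describe any real product's API.

```python
import json
import urllib.request


def build_clone_request(base_url, source, target, token):
    """Build a POST request for a hypothetical 'create clone' endpoint."""
    payload = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/clones",  # hypothetical endpoint
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

A pipeline step could pass the returned object to `urllib.request.urlopen` before the test stage runs, then issue a matching delete call afterwards so each build starts from a clean copy.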

  • Logical copying

When we integrate all the purposes for copying data, we get a set of requirements that lead us to a virtualized copy platform. Each writable copy is a logical copy of the data rather than a full copy; changes are stored in snapshots or a deduplication system.
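The logical copy idea can be sketched in a few lines, assuming a simple block-level copy-on-write scheme. The `Snapshot` and `LogicalCopy` names are hypothetical. Each writable copy holds only the blocks it has changed and reads everything else from the shared, read-only snapshot.

```python
class Snapshot:
    """Read-only point-in-time image: block number -> bytes."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks[block_no]


class LogicalCopy:
    """Writable copy that stores only its own changes (copy-on-write)."""

    def __init__(self, base):
        self.base = base
        self.delta = {}  # block number -> bytes written by this copy

    def read(self, block_no):
        # Prefer this copy's own version, else fall through to the snapshot.
        if block_no in self.delta:
            return self.delta[block_no]
        return self.base.read(block_no)

    def write(self, block_no, data):
        # The snapshot is never touched; the change lives in the delta.
        self.delta[block_no] = data

    def changed_blocks(self):
        return len(self.delta)
```

Ten developers can each get their own `LogicalCopy` of the same snapshot, and the extra capacity used is only the blocks each of them actually changes.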

Could a single product satisfy all of your requirements to copy production data? Would that product deliver additional business benefits that an older backup application cannot provide?

© 2019, Alastair. All rights reserved.

About Alastair

I am a professional geek, working in IT Infrastructure. Mostly I help to communicate and educate around the use of current technology and the direction of future technologies.