In my last post about Cohesity, I showed you how to set up replication between Cohesity clusters so that you could use an off-site Cohesity cluster for DR. Today I will walk through how that recovery might actually happen. You can watch the video of the process here on YouTube. We think of DR planning as protection against major events: floods, fires, tornadoes, and the like. The reality is that most DR activities are more mundane. Real disasters are infrequent, and a DR plan is mostly insurance that we pray we never need to use. Often the DR environment is also the test and development environment, and the usual recovery is bringing up an isolated copy of production. Using the DR environment for testing delivers additional value from what would otherwise be expensive idle equipment, and each test validates that parts of our DR plan will work if a disaster does occur.
In my series of posts about copying data, I talked about Disaster Recovery (DR) as a reason to copy data between sites, particularly in a form that allows rapid recovery of a large workload. Today I will walk through the process of replicating a set of VMs from one Cohesity cluster to another. If you would prefer to see the process as a video, take a look at my YouTube video here. You can also refer to the Cohesity site for more information about Disaster Recovery and Replication. In another post and video, I will show you the recovery of those VMs.
In the last two blog posts of this series, I looked at ways we copy data for protection and ways that are about improving the business. Since we are making copies of the same data for different purposes, it is worth considering how a single product might make these copies without a lot of redundant copying and storage. Each time we make a copy of the production data we impact the production system, so minimizing that impact delivers a business benefit. The challenge is that the different reasons for copying data have very different requirements, so a single product for all of these needs will have to be flexible and feature-rich.
This month, Cohesity announced a marketplace for applications that can run directly on the Cohesity cluster. This is an excellent development of their Analytics Workbench, which allowed custom-written reporting applications to run on the cluster. The marketplace, part of the Helios management platform, now enables software vendors to package their applications and offer deployment onto your Cohesity cluster. The initial offerings I have seen on the marketplace include Splunk for analytics and Imanis Data, an interesting cloud-native backup vendor. What sort of applications would be useful on the Cohesity platform, and what would be a poor choice?
Cohesity is a data management platform, so a good application to run on the cluster will be very data-focused. These applications create insight from the data copies already on the Cohesity platform. Another characteristic of the platform is that it uses low core-count Intel CPUs, so applications must get their work done with a moderate amount of CPU time. The applications will also be mostly asynchronous; you must be prepared to wait for an answer. I don’t mean that the UI will be unresponsive, but that the Cohesity platform suits analytics and intelligence functions more than real-time operations. Plenty of applications will not suit deployment on the Cohesity platform; it is not a general-purpose compute platform. This is not a place to run business applications such as your databases, CRM, or ERP system; those belong on your primary storage with high-performance physical servers and virtual machines. The initial applications available from the Helios store are focused on reporting and analytics. I will be interested to see what other applications turn up in the store. I am also interested in how Cohesity customers might develop their own applications for this new platform. I understand that there are Docker containers, Kubernetes, and resource controls hidden under the covers, so it should not be too hard to add customer-developed applications.
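Since Docker containers and Kubernetes sit under the covers, a customer-developed app would presumably start life as an ordinary container image. Here is a minimal sketch of packaging a small, data-focused reporting app; everything in it is hypothetical (the base image, the `report_app.py` script, and the dependency list are placeholders), and Cohesity's actual marketplace packaging requirements are not something this post documents.

```dockerfile
# Hypothetical packaging of a small, data-focused reporting app.
# All names and paths are placeholders; the real marketplace
# packaging format may differ.
FROM python:3.9-slim

WORKDIR /app

# Copy in the (hypothetical) reporting script and install its dependencies.
COPY report_app.py .
RUN pip install --no-cache-dir requests

# Run as an unprivileged user; on a shared data platform the cluster's
# own resource controls would constrain the container's CPU and memory.
RUN useradd --create-home reporter
USER reporter

CMD ["python", "report_app.py"]
```

An image like this would be handed to the platform's scheduler (Kubernetes, per the post) rather than run by hand, with the resource controls mentioned above keeping it from competing with the cluster's data management work.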
Do you use Software as a Service? Does your SaaS provider offer a full suite of data protection and compliance archiving? What if you choose to exit one SaaS platform and move to another? How will you fulfill your data governance requirements? Will you have to keep paying for the old platform just to retain access to your older archives? I suspect compliance and archiving will make SaaS platforms a new kind of Hotel California, where you must keep paying for the old platform even after you move to a new one. The only way I can see to avoid this is to integrate your SaaS data protection with your other data protection activities. Theresa Miller has some thoughts on the same issue in her post about the real-world need for data protection with Office 365.
One of the first things I noticed about the Cohesity hardware is that it looks a lot like hyperconverged infrastructure or scale-out software-defined storage: multiple nodes in an enclosure, each with a mix of SSD and hard disk plus a reasonable amount of compute power. The nodes are clustered together to provide a distributed storage platform. Cohesity doesn’t aim to replace your SAN, but its storage is fast enough that you can run VMs directly from it for fast service recovery, without waiting for data to copy in a restore. Once the VMs are running and service is restored, the VMs should be migrated back to your production datastores, which Cohesity does automatically. I made a short video showing this Cohesity Instant-Mass-Restore functionality in operation in my little lab. There is also a report from ESG that looks at the difference between bulk-copy restores and Instant-Mass-Restore.
I wrote about Cohesity Helios back in October, and this week I finally started using Helios to manage my virtual cluster. Helios is a SaaS offering for managing a collection of Cohesity clusters from a central location. For now I only have a single cluster to manage, and adding it to Helios is a simple process. I posted a video of the process, showing my first time using Helios and how simple it was to get started. I talk about IT simplification a lot; this is definitely easy to operate.
There are plenty of reasons to copy your production data. In my last blog post I talked about the reasons that are protection against things going wrong. Today I want to talk about the more positive reasons to copy data: ways that data copies can make your business more productive and profitable. All of the data copies we made in the last post were insurance; we are winning if we never need to access those copies. The positive reasons for copying data are all about making the data accessible immediately and getting value out of that immediate access. Insurance copies of data are about durability and metadata searchability; production copies are about performance and are often short-lived. There is value in having a platform for managing these valuable data copies, but it will need some sophisticated capabilities to deliver business value.
Enterprise IT organizations like to have multiple copies of every piece of data, but every copy we store has a cost. It is vital to know why you are making a copy of your data and to choose the right place and product to store that copy. Traditionally we made copies of data because bad things could happen; I will focus on that in this post. There are a few different categories of ways that things can go wrong, with varying requirements for the data copies. The good things that can happen when you make copies of data will be the subject of another post, and the considerations for using a single platform for all of your data copying may end up as another blog post too.
Now that I am back in front of classrooms teaching AWS courses, it is time for the Notes from the Class blog posts to return. The nature of AWS means that every class I teach has questions that I cannot immediately answer; these posts let me share the questions and answers with students.