Is it just me who gets annoyed when category definitions are arbitrary and fail to match real business needs? One example is Gartner’s All-Flash Array (AFA) storage analysis. Any product that can be either AFA or hybrid is excluded, so vendors create unique product IDs for what is really just an all-flash configuration of a hybrid array. Gartner’s definition of AFA gets in the way of customers looking for a set of benefits. I have come to realize that I have made the same mistake about HyperConverged Infrastructure (HCI) as a category. The realization arrived as I took part in Tech Field Day 16, particularly this presentation by Adam Carter. Naturally, my standard TFD disclaimer applies. HCI is not really about putting clustered storage inside a bunch of hypervisor hosts; it is far more about the simplicity of operating an environment designed purely to run VMs. A range of vendors have products that make it easy to deploy and manage a virtualization platform, which is what HCI is really about. To me, the big surprise is that VMware does not have a general-purpose deployment tool, even for a basic vSphere cluster.
Nutanix and VMware argue about which of VSAN and Nutanix is the leader, but both are HCI products that put storage inside hypervisor nodes. Both have a storage cluster that is also a hypervisor cluster. While both have VM-centric management, I think that Nutanix wins on ease of deployment. Dell’s VxRail is pretty simple to deploy too, but VSAN deployment is not always that easy. These HCI products also have a scale-out model, where the cluster can be easily expanded in small units as demand grows. Most of the other HCI products I have seen follow the same model of clustered local storage; SimpliVity, Maxta, and Pivot3 all work this way. We did see some products along the way that were “HCI washed”, where existing products were mashed together to match the tick boxes for an HCI product. These products didn’t deliver the benefits of simplified deployment and management and seem to have fallen by the wayside.
The latest HCI product that doesn’t follow the conventional model is the one from NetApp. The SolidFire scale-out AFA is combined with some hypervisor hosts that do not have local storage. In this model, the storage cluster is on separate hardware from the hypervisor cluster. Both clusters are made up of x86 server sleds in the ‘4 servers in 2U’ enclosure that we have become accustomed to seeing. Some nodes have lots of flash storage and run the SolidFire Element OS, while other nodes have only a boot drive and run vSphere. So why is this a HyperConverged Infrastructure? There is a simple wizard-based deployment process, and all management is VM-centric. Does it matter how those benefits are delivered if customers receive the benefits? This HCI also scales out as demand scales. Unlike conventional HCI, storage and compute scale out independently by adding different node types to the enclosures. There is no need to buy extra hypervisor licenses to accommodate more storage and no need to buy additional storage to get more compute capacity. The other nice thing is that physical resources are dedicated to storage, making it easier to ensure storage performance and service levels.
Of course, there are trade-offs in every solution. This NetApp HCI has a five-node minimum size; most HCI solutions have a three-node minimum. The NetApp solution is not for smaller businesses or ROBO deployment; this is HCI for enterprise data centers. Another trade-off is that all IO must go over the network to the storage cluster; there is no data locality to allow some IO to happen inside the hypervisor host. A quick back-of-the-envelope calculation suggests that the worst-case result could be 40% more storage network load for the hypervisor hosts. I would be interested to know whether this increase would cause any issues for real deployments.
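The arithmetic behind that estimate isn't shown here, so the following is only a rough Python sketch of how such a comparison might be set up. It is not the calculation that produced the 40% figure and is not expected to reproduce it. The model, the function names, and the default values (two data copies, perfect read locality, symmetric replica traffic between hosts, replication staying inside the separate storage cluster) are all assumptions made for illustration.

```python
def hci_host_network_load(read_fraction, replicas=2, read_locality=1.0):
    """Network traffic per unit of VM IO on a conventional HCI host.

    Assumed model: reads are served locally with probability
    `read_locality`; every write sends (replicas - 1) copies to peer
    hosts, and in a symmetric cluster the host receives the same amount
    of replica traffic from its peers.
    """
    write_fraction = 1.0 - read_fraction
    remote_reads = read_fraction * (1.0 - read_locality)
    replica_traffic = 2 * (replicas - 1) * write_fraction
    return remote_reads + replica_traffic


def disaggregated_host_network_load(read_fraction):
    """Network traffic per unit of VM IO with a separate storage cluster.

    Assumed model: every read and write crosses the network once, and
    replication happens between storage nodes, not on the hypervisor host.
    """
    write_fraction = 1.0 - read_fraction
    return read_fraction + write_fraction


def extra_load_percent(read_fraction, replicas=2, read_locality=1.0):
    """Percentage of extra host network load versus conventional HCI."""
    hci = hci_host_network_load(read_fraction, replicas, read_locality)
    if hci == 0:
        raise ValueError("HCI host has no network load in this model")
    disaggregated = disaggregated_host_network_load(read_fraction)
    return (disaggregated / hci - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical workload mixes, from read-heavy to balanced.
    for reads in (0.9, 0.7, 0.5):
        print(f"{reads:.0%} reads: {extra_load_percent(reads):+.0f}% host network load")
```

Under these assumptions the answer swings widely with the read/write mix, which is why it would be good to see numbers from real deployments.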
While we think about gaining the benefits of HCI without combining our storage and hypervisor, maybe advances in management can deliver the same benefits with a more conventional architecture. We saw VM-based storage management with Tintri, so a VM-centric view doesn’t need HCI. We also see VM-centric, policy-based management with VMware’s VVols being adopted in other storage arrays, such as those from Pure Storage. Either of these approaches allows very simple ongoing management of VMs, and many include integrated data protection as part of the policy set. What does seem to be missing is simple hypervisor deployment. VMware seems to have stopped simplifying ESXi deployment and configuration, leaving customers to rely on external tools and scripting to simplify deployment. I would love to have a deployment appliance from VMware that allowed me to set standards for ESXi deployment and have new servers built to those standards automatically.
HyperConverged Infrastructure is about having a simple platform that is dedicated to running VMs. Deployment and expansion of the platform should be simple and ideally wizard-driven. Ongoing management should be policy-based and VM-centric. Extra points are earned by simplifying data protection and any other operational activities, such as updating the platform.
© 2018, Alastair. All rights reserved.