Keeping the HSM Dream Alive

Way back in the 1990s I was involved in managing large numbers of Windows file servers that acted as the central repository of business data. These file servers grew and grew over time as more and more files were stored. Many organizations now have years and years of files stored on file servers and high-performance NAS appliances. Over time the knowledge of the value of these files is diluted, but the fear that something important may get lost never fades. IT teams are left as the holders of this business data and must treat every file as if some manager or regulator may demand access at any moment. Back in the ’90s, there was also a dream of Hierarchical Storage Management (HSM), which allowed data to move to lower-cost storage when it was not frequently accessed, freeing space on the expensive, fast storage for more frequently accessed data. At the time, there was no built-in support for data mobility in operating systems, so each HSM product had its own custom file-system driver to redirect access to migrated files.

These reminiscences were triggered by a quick look at the Cohesity Data Migration feature in the latest platform builds as part of my work with Cohesity. I will take a more detailed look at Data Migration after VMworld.

Back in the 1990s, those file servers were physical machines with local storage; now they are VMs with a SAN or HCI for storage, probably an all-flash platform. The cost per gigabyte to store infrequently accessed files on these VMs probably exceeds the value of the files, so migration to a lower-cost tier could bring a significant financial benefit. Modern operating systems support symbolic links to network shares, so there is no need for custom file-system drivers. The storage platform for the new tier should deliver capacity efficiency with low-cost storage such as large hard drives, allow simple scaling as data volumes increase, deliver highly available and durable network storage, and use modern CPU capabilities to offload deduplication. The 1990s vision of HSM can now be delivered in a useful way, enabling data migration from expensive high-performance storage to more cost-effective storage when access is less frequent.
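To make the symbolic-link idea concrete, here is a minimal sketch of the general technique: move files that have not been accessed recently to a cheaper location and leave a symbolic link behind. This is my own illustration, not how Cohesity or any other product implements migration. The function name `migrate_cold_files` and its parameters are hypothetical, and it assumes last-access times are reliable on the source file system, that the cold location sits outside the hot tree, and that the account running it is allowed to create symbolic links (on Windows that usually needs extra privileges).

```python
import os
import shutil
import time
from pathlib import Path


def migrate_cold_files(hot_root, cold_root, max_idle_days, now=None):
    """Move files idle for more than max_idle_days and leave symlinks behind.

    Hypothetical helper for illustration only. Returns the list of
    original paths that were replaced by symbolic links.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds in a day
    hot_root, cold_root = Path(hot_root), Path(cold_root)
    migrated = []

    # sorted() takes a snapshot of the tree before anything is moved.
    for path in sorted(hot_root.rglob("*")):
        # Skip directories and files that were already migrated.
        if path.is_symlink() or not path.is_file():
            continue
        # Assumption: st_atime reflects real access (no noatime mount).
        if path.stat().st_atime >= cutoff:
            continue

        # Mirror the directory layout on the lower-cost tier.
        target = cold_root / path.relative_to(hot_root)
        target.parent.mkdir(parents=True, exist_ok=True)

        shutil.move(str(path), str(target))
        path.symlink_to(target)  # users keep opening the original path
        migrated.append(path)

    return migrated
```

Calling something like `migrate_cold_files("/srv/share", "/mnt/cold-tier", 365)` (both paths are placeholders) would tier off anything untouched for a year. A real tool would also need to handle open files, permissions, and failures partway through a move, which this sketch ignores.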

The most significant new risk is that after tiering to external storage, your data is spread across multiple platforms that are tightly linked. Your full file share is no longer contained on one server; it now requires both the file server and the cost-effective external platform. Keep in mind that this cost-effective storage is also a file server, so maybe the file server VM should be replaced with this storage? Where the file server is just a repository for shared files, the cost-effective storage platform may provide sufficient performance on its own, especially if there is an option to use some flash to accelerate the hard-disk-based platform. If the file server is also an application server, for example running a document management application, then you still need the operating system for the application, and tiering files to an external platform can reduce the use of expensive SAN and flash.

© 2019, Alastair. All rights reserved.

About Alastair

I am a professional geek, working in IT Infrastructure. Mostly I help to communicate and educate around the use of current technology and the direction of future technologies.