Hello from the CTO Advisor

This year (2024) is shaping up to be a very exciting time. I have joined a few of my friends at the Futurum Group. You may recognize that name as where Keith Townsend took the CTO Advisor and Stephen Foskett took Tech Field Day. It is also the home of the Evaluator Group and a few other smaller organizations that together are much greater than the sum of their parts. But back to me: I am joining Keith as part of the CTO Advisor and working with the Signal65 Labs team. You will likely see me interviewing industry executives at conferences for the CTO Advisor and doing hands-on projects with interesting products and technologies.

I will continue to look at products through a lens of the complexity of real-world deployment: multiple products and vendors that need to work together, shortages of resources, challenges with people and internal politics. I look forward to having more science from the Signal65 Labs team; they have great experience in product evaluation and performance testing.

At this stage, neither vBrownBag nor Build Day Live is involved; they remain independent and community-focused. I think there will be some great opportunities for community technical expertise to be a foundation for educating others about these technical topics. As we have done previously, vendors pay for the content creation, which allows you to consume and learn for free.

Containers as Application Services?

I maintain that the real reason for AWS’s success is that it is a developer enablement platform. Many AWS services exist to enable developers to write code specific to the business problem they are trying to solve. Do you need a message queue? There is a service for that. Do you need an event broker? Guess what? There is a service for that too. By using these pre-built services, which are consumed through an API, it is easy to spend time writing the code specific to your business. Without the services, you spend more time writing code that implements technologies rather than business processes.
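To make "consumed through an API" concrete, here is a minimal sketch using boto3 against SQS; the queue name and payload are illustrative assumptions, not anything from the post. The point is that the business code only deals with the business message, not with running a broker.

```python
# Minimal sketch: a message queue as a service, consumed entirely through an API.
# Queue name and message contents are hypothetical placeholders.
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

# Create (or reuse) a queue and get its URL -- no broker to install or patch.
queue_url = sqs.create_queue(QueueName="order-events")["QueueUrl"]

# The producer only writes the business payload.
sqs.send_message(QueueUrl=queue_url,
                 MessageBody=json.dumps({"order_id": 42, "status": "shipped"}))

# A consumer elsewhere polls the same queue through the same API.
response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1,
                               WaitTimeSeconds=5)
for msg in response.get("Messages", []):
    print(json.loads(msg["Body"]))
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```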

Thousands of Containers Without Kubernetes?

Marketing departments and analysts often like buzzwords and want to hear every vendor use the popular ones, whether they represent value or not. Kubernetes is a fantastic container orchestrator with its origins in Google and a footprint in every public and almost every private cloud in the world. But like any tool, Kubernetes is not the solution to all your problems, not even every possible container orchestration problem. Kubernetes is designed to orchestrate groups of containers within a single physical location. Each physical location needs its own Kubernetes cluster, often with three nodes as a control plane and more nodes for the workload. There are tools such as K3s that consolidate all the roles onto a single host, but a local Kubernetes cluster is still required to manage containers. Requiring a cluster per site isn't a big issue at cloud scale, with many containers running at a relatively small number of locations. Managing a dozen clusters that each run thousands of containers is where Kubernetes excels.
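As a contrast, here is a minimal sketch (not any vendor's actual tooling) of pushing a container update straight to a few edge hosts over the Docker Engine API, with no per-site Kubernetes cluster; the host addresses and image name are hypothetical assumptions.

```python
# Sketch: replace a running container on many edge hosts without Kubernetes.
# Hypothetical Docker endpoints (plain TCP for brevity; real deployments would use TLS).
import docker

EDGE_HOSTS = ["tcp://site-001:2375", "tcp://site-002:2375"]   # hypothetical endpoints
IMAGE = "registry.example.com/pos-app:1.4.2"                  # hypothetical image

for host in EDGE_HOSTS:
    client = docker.DockerClient(base_url=host)   # talk to the remote Docker engine
    client.images.pull(IMAGE)                     # fetch the new application image
    try:
        old = client.containers.get("pos-app")    # stop and remove the old container, if any
        old.stop()
        old.remove()
    except docker.errors.NotFound:
        pass
    client.containers.run(IMAGE, name="pos-app", detach=True,
                          restart_policy={"Name": "always"})
```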

Mako Networks, I Haven’t Heard That Name In Years

I recognized the name Mako Networks as an innovative local company in New Zealand, although it had been a while since I heard the name. It turns out the New Zealand company is now a Chicago company, but the development team remains in New Zealand. Mako Networks is one of the five companies presenting at Edge Field Day this week. The original Mako product was a centrally managed Internet and VPN router, and that is still an element of the product. Current Mako Networks products seem to be designed to accommodate the complexity of working in large organizations with multiple teams and external providers and very strict requirements around security and segregation of duties. Join me for the live stream of the Mako Networks session as well as all of the Edge Field Day sessions.

The Scale Computing Edge

This week I will be at Edge Field Day in San Francisco; I'm looking forward to being back in person with my old and new Tech Field Day friends. Speaking of things that are both old and new, this will be my first proper chance to talk to Scale Computing about their complete transformation from an HCI provider for small businesses into a provider of edge computing for massive companies. The transition has been amazing. Small businesses with small budgets and generalist IT staff loved Scale Computing's platform because it was cost-effective and mostly self-managing. Unsurprisingly, many edge use cases love a cost-effective platform that is largely self-managing, since they have no IT staff at the edge locations. The transformation has really been around how you scale the deployment and management of huge numbers of these cost-effective clusters. The last time I had an open discussion with Scale Computing (pre-COVID, so it has certainly been a while), they were getting the automated mass-deployment tooling together, so I'm looking forward to hearing and seeing how it works now. I'm also hoping to hear about more application-centric tools, as a lot of edge-compute deployments are driven by specific applications and the need to deliver those same applications to hundreds or thousands of locations. Join me for the live stream of the Scale Computing session as well as all of the Edge Field Day sessions.

What is your edge?

Words mean things, so defining what I mean by some words is important. I talked with Charles Uneze about definitions of edge computing and feel that it is important to have a framework for discussing edge and its older cousin, the Internet of Things (IoT).  Our discussion was triggered by the upcoming Edge Field Day event.

Other Edges

I do want to limit this discussion to edge compute because edge means other things to other audiences. For example, edge networks have been a part of network design for many years, differentiating the edge where user devices connect from the core where centralized computing occurs. There is also quite a lot of overlap between my far-edge definition and the Remote Office/Branch Office (ROBO) category we have had for a while. The major difference with the edge is that IT staff almost never visit edge locations, whereas ROBO locations might get annual or quarterly visits from IT staff for updates and upgrades.

I tend to think of edge compute as handling applications that do not suit centralization because the network characteristics would impact the application if it were centralized. For example, network latency might mean an autonomous car took too long to recognize a hazard and crashed. Even worse, what if there was a network black spot and the car could not talk to the cloud service? There are also a lot of use cases where the cost of data transfer is high, so local processing can reduce the amount of data transferred and, therefore, the overall cost. This distributed processing means that edge deployments usually comprise many nearly identical deployments. Often the data generated or processed at edge locations will be persisted in a cloud or on-premises location to get more long-term value; the processing at the edge location is there to make immediate decisions and to reduce the amount of data sent back to the central location. Even within edge compute, there is a lot of variation. I like the idea of dividing it into near-edge and far-edge compute, both of which pair well with IoT.
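As a rough illustration of that data-reduction idea, here is a small sketch where the edge location makes the immediate decision locally and sends only a summary upstream; the sensor readings, threshold, and central endpoint are all hypothetical.

```python
# Sketch: edge-side processing makes the local decision and ships only a summary,
# rather than streaming every raw reading to a central location.
import statistics
import requests

def read_vibration_samples():
    # Stand-in for a minute of raw readings from a local sensor.
    return [0.12, 0.15, 0.11, 0.74, 0.13]

samples = read_vibration_samples()

if max(samples) > 0.5:
    print("stop the machine")     # immediate decision, no round-trip to the cloud

summary = {"mean": statistics.mean(samples), "max": max(samples), "count": len(samples)}
# One small summary per interval instead of every sample (hypothetical endpoint).
requests.post("https://central.example.com/ingest", json=summary, timeout=5)
```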

Near-Edge

Near-edge is a data center with server racks, environmental control, physical security, and power protection. It isn't your data center; it belongs to a service provider and holds racks of servers belonging to the provider's other clients. The service provider might be a telco or a hosting provider. Often the only IT staff who visit near-edge locations are the provider's staff; they perform any rack-and-stack operations for you, and your IT staff may never enter these data centers. A near-edge location provides a general-purpose computing platform, often VMs and, increasingly, Kubernetes in VMs. Capacity planning for near-edge locations is usually trend-based, with additional servers added when there is insufficient capacity.

Far-Edge

Far-edge is not a data center; it can be hot and dusty, and it might be in a public place such as a photo printing kiosk, a retail store counter, a factory floor, or a delivery truck. The computers at far-edge locations are smaller and more rugged to suit their location, with a hardware configuration dictated by the applications required at the location. A far-edge location might run a combination of VMs with installed software and containers. However, the resource requirements of Kubernetes might outweigh its value, so simple container deployment is more common at the far edge. The hardware at far-edge locations is expensive to upgrade or replace, so application updates or new applications often must fit inside the existing hardware configuration. If there are staff at the far-edge location, they are your staff, but they are not IT staff; they might be delivery drivers or oil rig hands. When you have hundreds or thousands of far-edge locations, your IT staff should never need to visit them.

IoT

An Internet of Things (IoT) device usually runs a single application, generating or receiving data without human intervention. An air quality sensor, a temperature sensor, and a digital sign are all IoT devices. Often several of these IoT devices will generate data, which is then consolidated and analyzed at an edge location. Because the device runs only a single application, the hardware configuration is driven by that application. An IoT device might be a microcontroller or an embedded PC with relatively limited compute power, but it also requires little power and cooling.

Edge Population

Edge is highly distributed, with many sites and many devices. For example, an enterprise organization might have ten major on-premises data centers or be in a dozen public cloud locations. The same organization may have a hundred near-edge locations and several thousand far-edge locations with tens of thousands of IoT devices across these locations.

Simply calling something edge with no qualification isn’t very helpful. To understand solutions for edge requirements, it is important to recognize that edge isn’t a single thing, and more description and detail will be required. Hopefully, these definitions of near-edge, far-edge, and IoT will help frame a better discussion.

VVols, Oh VVols, wherefore art thou VVols?

Some technologies take over the (IT infrastructure) world, some sink without a trace, and others find a niche where they fit a requirement. I suspect VMware's VVols (Virtual Volumes) fall into the last category. VVols was released as a feature of vSphere 6.0 in March 2015 and updated to version 2.0 with vSphere 6.5 in late 2016. The primary functionality of VVols is to give a storage array visibility into, and control of, the storage presented to individual VMs, rather than the usual VMFS arrangement where the array only sees a datastore and cannot tell which VMs are using it. The result is less storage management in the ESXi hypervisor and more in your storage array. The assumption is that the storage array, or the storage team, is good at managing storage capabilities and can provide a better service to the VMs than the vSphere hypervisor's native capabilities: for example, storage array-based snapshots rather than vSphere snapshots, or storage replication at the VM level rather than the datastore level. An interesting element is the ability to control performance per disk, per VM, from the array, rather than layering per-datastore performance management from the array on top of per-disk performance management from the hypervisor. I originally wrote that VVols only applies to block storage (iSCSI, Fibre Channel, and NVMe-oF) and not to NFS-based storage, where the NFS server already sees the individual VM disks because they are just files on the share. As Ben pointed out, NFS-based arrays know about the individual VM disks but can still benefit from VVols to offload advanced storage features to the NFS array. Thanks, Ben, for the correction.

The Pure Storage presentation at Tech Field Day Extra at VMworld US 2022 inspired my thinking about VVols. It had been quite a while since I heard much talk about VVols, so I wondered whether it was still a thing. A quick Google search shows that storage vendors are still talking about VVols, and vSphere 7 brought some updates, so there must be customers using and benefiting from VVols; it hasn't sunk without a trace. But why didn't VVols take over the world? I suspect it is a combination of easily used features in vSphere and plentiful performance from all-flash arrays. After all, the infrastructure only needs to be good enough not to limit the applications that it hosts. If your applications don't demand more performance and capabilities than vSphere, VMFS, and flash together can deliver, then the simplest solution will provide the best benefit. The place where VVols has value is where VMFS limits the application. It might be that vSphere snapshots cause VM stuns that affect application performance; these stuns are part of vSphere's snapshot behavior and can be a big issue when snapshots happen frequently during working hours. It might be a highly critical application that requires very low and stable disk latency, such as real-time commodity trading. These are use cases where the application requirements demand specific storage capabilities, usually business-critical applications at the core of the business.

Do you need VVols in your vSphere environment? Ask yourself a simple question: does VMFS simplify or complicate your storage design? If VMFS simplifies it, you probably spend little time thinking about storage for your VMs and have at least half a dozen VMs on every datastore. If VMFS complicates your storage design, you probably have datastores dedicated to specific VMs or applications and spend significant time tuning the datastore and LUN configurations. If VMFS simplifies your storage, keep using VMFS and don't spend too much time on storage. If VMFS complicates it, then look closely at VVols; it will probably be easier to build complicated storage configurations with VVols than with VMFS.
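If you want a quick way to eyeball that VM-to-datastore ratio, here is a minimal pyVmomi sketch; the vCenter address and credentials are placeholders, it assumes datacenters sit directly under the root folder, and it is an illustration rather than an official VMware tool.

```python
# Sketch: list each datastore's type and how many VMs sit on it.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab-style connection, skip cert checks
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    for child in content.rootFolder.childEntity:
        if not isinstance(child, vim.Datacenter):
            continue                            # only look at datacenters under the root
        for ds in child.datastore:
            # summary.type is typically 'VMFS', 'NFS', 'vsan', or 'VVOL'
            print(f"{ds.name}: type={ds.summary.type}, vms={len(ds.vm)}")
finally:
    Disconnect(si)
```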

Diversity? How about neurodiversity?

I've always used this blog to document and share what I learned. Lately, I have been learning about myself and my neurodiverse mind, so I plan to share some of this new learning. First off, I am not a psychiatrist, and I have not seen any professional or had any diagnosis; I am going to talk entirely about my own experience. If some of this resonates with you, maybe spend some time researching neurodiversity, because understanding how your brain works can unlock the super-power side of not being typical. Being neurodiverse hasn't stood in my way. I have been married to my (first) wife for nearly 30 years; we have two adult children who live independently but still communicate with us and hug us when we see them. I have a fun and successful career where I go to amazing places and do amazing things with awesome people.

Multi-Cloud Mobility – MinIO

Cloud-native applications usually use cloud-native storage, typically a combination of databases and object storage. Some of the agility of cloud-native application development comes from separating the persistence (storage) from the compute. Applications can be rapidly developed using DevOps methodologies while the valuable persistent data remains in the storage services. But what about the portability of those storage services? If you are not all-in with one cloud, you might want a persistence layer that you can use across clouds and on-premises. This is the challenge with multi-cloud: each cloud has its own standard services, and there is little interoperability between those standard services because each cloud provider wants to host all your IT.

MinIO can help you with multi-cloud object storage, providing S3-compatible storage anywhere you run a Kubernetes cluster, on-premises or on almost any public cloud platform. The S3 compatibility is not simply about the API used to access objects; MinIO has S3 features such as lifecycle policies, versioning, object lock, and replication. MinIO can replicate buckets and objects from cloud-based MinIO buckets to on-premises or other cloud locations. Object storage tends to suit asynchronous replication; most of the time, objects are written once and read many (WORM) times, although MinIO offers synchronous replication for other use cases. All your MinIO clusters are managed through a centralized console and API.
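To illustrate the S3 compatibility, here is a minimal boto3 sketch; the endpoint, credentials, bucket, and object names are hypothetical, and the same code would run unchanged against AWS S3 if you dropped the endpoint_url.

```python
# Sketch: the standard AWS S3 client pointed at a MinIO cluster instead of AWS.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://minio.example.com:9000",   # hypothetical MinIO endpoint
    aws_access_key_id="EXAMPLEKEY",                  # placeholder credentials
    aws_secret_access_key="EXAMPLESECRET",
)

s3.create_bucket(Bucket="app-data")
s3.put_object(Bucket="app-data", Key="reports/2022-09.json", Body=b'{"ok": true}')
print(s3.get_object(Bucket="app-data", Key="reports/2022-09.json")["Body"].read())
```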

MinIO does not provide object storage with the lowest cost per GB. The focus is on performance and solving data management problems for large consumers of object storage. I saw MinIO present at Tech Field Day 25 and earlier at Cloud Field Day 11. At last year’s Cloud Field Day, MinIO also talked about having an interface for RocksDB, which multiple types of database engines can use. Using the same underlying platform for both unstructured data (S3) and structured data (RocksDB) might allow a unified persistence tier to enable multi-cloud deployment of cloud-native applications.

AWS Principles: Use caching

The design principle to use caching is not simply an AWS principle; it is a common application design principle. A cache is a temporary storage location for a small amount of data that improves application performance. Sometimes the cache is distributed around the world to be close to users, in which case it might be called a content delivery network. Other times the cache is simply extra memory in your web or application servers that holds some status information about currently active users. The idea that a cache is temporary is important: it is not the persistent storage location for the data, and if the cache gets lost, the data in it can be re-created from a persistent location. The idea that the cache is not the definitive source is also important: data in the cache represents a copy of the persistent data at some point in the past. Some data can stay in the cache for a long time; the temperature recorded at 10 am yesterday will never change. The current temperature will change, so it shouldn't be kept in the cache for long; it should have a short Time To Live (TTL).
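Here is a tiny sketch of that TTL idea in plain Python, using the temperature example; the read_current_temperature helper is a hypothetical stand-in for a real data source.

```python
# Sketch: a minimal TTL cache -- serve a recent copy, refresh from the source when it expires.
import time

_cache = {}  # key -> (value, expiry time)

def read_current_temperature():
    return 21.5          # stand-in for a slow or expensive lookup

def get_cached(key, fetch, ttl_seconds):
    """Return a fresh cached copy, or fetch and re-cache once the TTL has expired."""
    entry = _cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                               # cache hit
    value = fetch()                                   # miss or expired: go to the source
    _cache[key] = (value, time.time() + ttl_seconds)
    return value

# Yesterday's 10 am reading never changes, so it could have a very long TTL;
# the current temperature should expire quickly, hence the short TTL here.
print(get_cached("temp:current", read_current_temperature, ttl_seconds=60))
```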

There are a few trigger points for considering adding a cache, usually centered around needing more application performance for transactional (rather than analytics) workloads. If increasing the performance of the database or other persistence tier seems inefficient, and you feel you are not getting consistent value for money, then caching might be a good option. This can also be a trigger for considering a different database for a subset of the data, which I mentioned in a previous principle. As an infrastructure person, I am used to providing transparent caches, where the application code is unaware of the cache. But software developers often use explicit caches, where the application code makes choices about what data to place in the cache and when to update or remove cached data. On AWS, the ElastiCache service provides RAM-based caching that developers can choose to use within their applications. Because it is an explicit cache, the application developer chooses what data to cache and whether to write to the cache on database updates or only on reads. It takes significant developer effort to get the most out of ElastiCache, but the potential performance improvement is huge.
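ElastiCache exposes a Redis (or Memcached) endpoint, so the explicit cache-aside pattern looks something like this redis-py sketch; the endpoint, key names, and database helper are illustrative assumptions.

```python
# Sketch: explicit cache-aside -- the application decides what to cache and for how long.
import json
import redis

# Hypothetical ElastiCache (Redis) endpoint.
r = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

def load_user_from_db(user_id):
    # Stand-in for a real database query.
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)               # cache hit: serve the copy
    user = load_user_from_db(user_id)           # cache miss: read from the database
    r.setex(key, 300, json.dumps(user))         # explicit write to the cache, 5-minute TTL
    return user
```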

Caching is an important tool for improving application performance everywhere from the end access device (a user's laptop or phone), through the application servers, to the persistent storage at the back. Efficient use of caching does require good design, and the more application awareness you bring to that design, the more efficiently you can use the expensive cache. Allocating excess RAM to application servers is a simple but inefficient way to provide caching, particularly for applications that you cannot get rewritten.
