I like consistent management interfaces and a single operational model across as much of my IT estate as possible. I don’t like point solutions that function or are managed differently; they add up to more problems. With this in mind, I would like to see far deeper network integration between AWS and VMware Cloud on AWS (VMC), even though I know why I won’t get this integration for a while. At Cloud Field Day 7, we had two sessions that focussed on network connectivity between AWS (AWS presentation) and VMC (VMware presentation); neither presenter said that this connectivity works the same as everything else they offer.
What do I want?
I want the networking on VMC to be equivalent to the networking in AWS Virtual Private Cloud (VPC). The essential equivalences for me are Direct Connect and Transit Gateway support. I want to be able to have a VIF on my Direct Connect that corresponds to a portgroup on my VMC. Then I can use the same routing mechanisms to connect VMC to on-premises and VPCs that I can use to connect VPCs. This Direct Connect (DX) method will help out companies whose hybrid cloud connectivity strategy was defined before Transit Gateway was released in late 2018. Speaking of Transit Gateway, I want to be able to attach a portgroup on VMC to my Transit Gateway (TGW), just like I connect a VPC to TGW. TGW support will be great for connecting vSphere-based applications to AWS VPC-based applications and will suit companies who have built their hybrid connectivity around TGW. TGW already supports Direct Connect, so I probably don’t need both for one customer.
I’m not in such a rush for VPC peering support. VPC peering is a colossal PITA when you have more than one or two VPCs, which is why Transit Gateway is so awesome. If you have sufficiently few VPCs that peering is workable, then the existing VMC to VPC connectivity will probably also be viable for you.
Why connect portgroups, not SDDCs?
I have asked to connect portgroups to VIFs and TGW, rather than whole vSphere clusters (SDDCs). I expect an SDDC to run multiple applications, quite possibly applications with different security requirements. Having separate control of routing at the portgroup level allows that security separation to flow into the AWS VPC networking subnets. I want it this way round because, at the moment, I am thinking AWS first. What if I were vSphere first? Then NSX is my primary network control, and I still want to control routing to AWS based on portgroups connecting to VPC subnets. Either way, a VPC is not equivalent to an SDDC. The SDDC might fulfill roles for multiple VPCs, so I need finer-grained control than the entire SDDC.
What is so hard?
The fundamental issue is an impedance mismatch between the AWS SDN and the VMware SDN; they do not share the common heritage that you would get with two enterprise SDNs. VMware’s NSX is fundamentally a single-tenant platform designed to run infrastructure for a single enterprise organization. The AWS SDN is older and is designed to run a multi-tenant infrastructure for multiple large organizations with no coordination between the tenants. As an example, on NSX, routing decisions can be made based on source and destination IP addresses. On AWS, the same routing infrastructure is used for multiple VPCs and tenants, so routing decisions require a VPC ID as well as the IP addresses. Bridging the gap between these two approaches to SDN is hard engineering work. The task is then overlaid with the issue that AWS and VMware are competitors as well as partners; neither wants the other to be more successful in the collaboration than themselves.
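To make the mismatch concrete, here is a minimal Python sketch of the two lookup styles. It is purely illustrative and is not how NSX or the AWS SDN is actually implemented. The route tables, VPC IDs, and next-hop names are all made up, and for brevity it matches on destination address only. The point is that once two tenants can use the same address range, the address alone no longer identifies where a packet should go.

```python
import ipaddress


def longest_prefix_match(routes, dst):
    """Return the next hop of the most specific route containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Single-tenant style: one organization, one address plan, one table.
# (Hypothetical prefixes and next-hop names.)
single_tenant_routes = {
    "10.0.0.0/16": "uplink",
    "10.0.1.0/24": "segment-web",
}

# Multi-tenant style: tenants do not coordinate, so their address ranges
# can overlap. The table has to be keyed by a tenant identifier first.
# (Hypothetical VPC IDs and next-hop names.)
multi_tenant_routes = {
    "vpc-aaa": {"10.0.1.0/24": "tenant-a-interface"},
    "vpc-bbb": {"10.0.1.0/24": "tenant-b-interface"},
}


def route_single_tenant(dst):
    # The destination address is enough to pick a next hop.
    return longest_prefix_match(single_tenant_routes, dst)


def route_multi_tenant(vpc_id, dst):
    # The same destination address resolves differently per VPC.
    return longest_prefix_match(multi_tenant_routes.get(vpc_id, {}), dst)


if __name__ == "__main__":
    print(route_single_tenant("10.0.1.5"))
    print(route_multi_tenant("vpc-aaa", "10.0.1.5"))
    print(route_multi_tenant("vpc-bbb", "10.0.1.5"))
```

In this toy model, the same destination `10.0.1.5` resolves to a different next hop depending on the VPC ID, which is the extra piece of context a single-tenant lookup never has to carry.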
I hope we get VMware Cloud on AWS networking fully integrated with the AWS VPC network as an equal to VPC. For now, we are stuck with a collection of possible solutions involving virtual interfaces, VPNs, and proprietary network bridging (HCX), none of which is integrated into a holistic hybrid cloud network.
© 2020, Alastair. All rights reserved.