AWS Principles: Design Services, not Servers

Most of the AWS design principles are about working with the unique features and limitations of the AWS platform. With on-premises enterprise infrastructure, applications can assume that the infrastructure is reliable and will handle failures without the application knowing. The result is that a single server delivering an application is an acceptable solution; features such as vMotion and vSphere HA will keep the application operational. On AWS, applications must expect the infrastructure to fail and must continue to deliver services when it does. There is no equivalent to vMotion or vSphere HA on AWS; your application architecture must ensure service availability. It is uncommon, but not unknown, for the EC2 service to fail for an entire Availability Zone (AZ) or to have network or storage issues that affect some or all of an AZ. If your application runs on a single EC2 instance, any of these outages takes it offline. The best practice is to spread your application across multiple AZs and abstract it behind a multi-AZ (regional) service.

One example of building an application on AWS is to place an Elastic Load Balancer (ELB) in front of an Auto Scaling Group (ASG) of EC2 instances. The EC2 instances are spread across multiple AZs so that if one AZ fails, there are still instances in the other AZs. The ELB is an AWS-managed service that operates across multiple AZs in a region and provides a consistent access point for the EC2 instances in those AZs. The ELB is also a loose-coupling mechanism; we will talk more about loose coupling later. Other parts of your application might use a message queue or a storage service like S3 for asynchronous loose coupling, again combining and abstracting multiple AZs.
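As a rough sketch of this pattern, the boto3 calls below create an Application Load Balancer (one type of ELB), a target group, and an ASG that spans two AZs. The VPC ID, subnet IDs, and launch template name are placeholders you would replace with your own; this is a minimal illustration, not a production setup, which would also need security groups, health-check tuning, and so on.

```python
import boto3

elbv2 = boto3.client("elbv2")
asg = boto3.client("autoscaling")

# Placeholder IDs - substitute your own VPC, subnets (one per AZ),
# and a pre-created launch template for the web servers.
VPC_ID = "vpc-0123456789abcdef0"
SUBNETS = ["subnet-aaaa1111", "subnet-bbbb2222"]  # two different AZs

# A regional (multi-AZ) load balancer as the consistent access point.
lb = elbv2.create_load_balancer(
    Name="web-alb", Subnets=SUBNETS,
    Scheme="internet-facing", Type="application",
)["LoadBalancers"][0]

# Target group that the ASG will register its instances into.
tg = elbv2.create_target_group(
    Name="web-tg", Protocol="HTTP", Port=80,
    VpcId=VPC_ID, TargetType="instance",
)["TargetGroups"][0]

elbv2.create_listener(
    LoadBalancerArn=lb["LoadBalancerArn"], Protocol="HTTP", Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)

# The ASG spans both subnets (and therefore both AZs) and uses the
# ELB health check, so failed instances are replaced automatically.
asg.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-server", "Version": "$Latest"},
    MinSize=2, MaxSize=6, DesiredCapacity=2,
    VPCZoneIdentifier=",".join(SUBNETS),
    TargetGroupARNs=[tg["TargetGroupArn"]],
    HealthCheckType="ELB", HealthCheckGracePeriod=120,
)
```

With this layout, losing one AZ leaves healthy instances behind the same load balancer endpoint, and the ASG replaces the lost capacity on its own.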

You will remember the other design principle: design for scalability. A side benefit of ASGs is that you have to build a scale-out service with multiple EC2 instances, so you can now respond to varying application load by scaling out or in as demand changes. If you built the application on a single server (EC2 instance), you would have to power the server off to scale it up or down, changing its CPU and memory allocation by resizing to a new instance type. Scaling an EC2 instance up or down causes a service outage; scaling an ASG out or in does not, as the sketch below illustrates.
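A target-tracking scaling policy is one common way to have an ASG scale out and in automatically. The sketch below, reusing the hypothetical ASG name from the earlier example, asks the ASG to keep average CPU utilisation near 50% by launching or terminating instances as needed.

```python
import boto3

asg = boto3.client("autoscaling")

# Target tracking: the ASG adds instances when average CPU rises above
# the target and removes them when it falls below - no outage involved.
asg.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # placeholder name from the earlier sketch
    PolicyName="keep-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```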

You can also build services without servers, using containers or Lambda functions, and even the API Gateway service, to make a serverless service. It isn't that there are no servers, but that those servers are not your problem to manage. Many AWS services are serverless; ELB, SQS, and S3 are all serverless services. They are all implemented on x86 servers, but as a consumer of the service you never see or manage those servers. This principle asks you to design your own services the same way: so that they can be consumed without anyone caring about individual servers.

You do always need to weigh cost against service level. Some services do not need 24×7 availability; maybe a 15-minute outage while the ASG provisions a new EC2 instance is sufficient. That usually means an ASG with only one EC2 instance, which the ASG can replace if there is a fault. The real problem comes when you cannot simply replace a server automatically because the server contains unique data or manual configuration. A manually configured server may take hours or days to replace and return to service. This strikes at the heart of why automation is a core objective on AWS, along with availability; later we will look at using managed services, especially for data storage.
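As a small illustration of the serverless pattern, the Python handler below is the kind of function that API Gateway can invoke through its Lambda proxy integration. The function name and response body are invented for the example; the point is that there is no server for you to patch, size, or replace.

```python
import json

def handler(event, context):
    """Minimal Lambda handler for an API Gateway proxy integration.

    API Gateway passes the HTTP request in as `event`; the dict
    returned here is translated back into an HTTP response.
    AWS runs the function on demand on servers you never see.
    """
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```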
