Sometimes the AWS surprises are not so much about how AWS is different, but how you design solutions differently on AWS than on-premises. One of the significant differences is that you have a near-infinite amount of resources available on AWS, while on-premises, you are always aware of a finite resource limit. On-premises your workload must fit inside those limited resources; on AWS, you can rent as much resource as your workload requires. One typical pattern on-premises is to defer reporting or bulk processing until off-peak hours, overnight when the office is empty. The office is never empty at AWS, so you might as well do that reporting or processing right away. The only time you might defer is if the spot price for the EC2 instance you want is too high.
As an example, there are plenty of problems that we solve by using a lot of compute resources to get a timely answer. On-premises we will have a limited quantity of CPU time and RAM, and these resources (servers) have a lifespan of 3-5 years, so more resources that will only be used for part of their life are expensive. On-premises it is common to consume all these limited resources for a long time to complete some complex tasks; we may have to wait hours or days for an answer. On AWS, we rent CPU time and RAM as EC2 instances and pay by the hour for what we use. On AWS, we can scale out just for the duration of the job and use maybe 50x as much resource to get an answer faster. There is no cost difference between using 5 EC2 instances for 100 hours and 250 EC2 instances for two hours, so scaling out massively is an option.
Other near-infinite resources include storage, networking, and even application services. The Simple Storage Service (S3) allows you unlimited storage capacity and only charges you for what you actually store. The VPC network and it’s supporting features such as ELB provide colossal capacity that is available on-demand, and you are billed for consumption, not capacity. Even application services such as the Simple Queue Service (SQS) offer near-unlimited messages per second in a queue and only charge you for the transactions on that queue. There are a lot of AWS services that allow you to draw from a nearly limitless pool of resources and only pay for the resources that you use.
Capacity Is Never Infinite
One caveat is that while AWS has near infinite capacity, there is always a finite amount, and, in some situations, that limited amount may not be as large as you might hope. When you start deploying unusual and new EC2 instance types, and particularly when you use them in their largest configurations, you may get Insufficient Compute Capacity Errors (ICCE, pronounced ice). Remember that each EC2 family and generation runs on its own dedicated physical servers, M5 instances only run on M5 servers, which in turn only run M5 instances. The larger the size within the family and the more instances you request, the more previously unused capacity is required. So, if you decide to deploy a cluster of six X1e.32XLarge across three availability zones, you may find that one of those AZs does not have two whole X1e hosts to dedicate to your cluster immediately. Hopefully, you have a good relationship with your local AWS team and can get this information before it causes you a problem. They may suggest that you use smaller instances and more of them, or that you will have a better result with a different region or a different EC2 family.
If you had on-demand access to a virtually infinite amount of computing resources, how would your IT and business operate differently? On AWS, resources are available, and you pay for what you use each month. To get the best out of AWS, you should deploy the resources you need as you need them, and cast off the implicit implication of purchased on-premises IT.