Demitasse

AWS Design Principles – Use Disposable Resources

Posted on January 8, 2021 by Alastair

Recycling is a good strategy for the physical world and also for your AWS resources. When you have an automated process to deploy parts of your application, you can often use that same automation to rebuild broken pieces rather than troubleshooting the failure. Naturally, there are parts of your applications that cannot simply be destroyed and replaced; there is always valuable persistent data somewhere. The design skill is to separate that data from the rest of your application components and handle the persistent and disposable parts differently. Later I will look at using managed services for that persistent data; for now, we will look at the disposable portion.

Posted in General | Comments Off

Vendor Briefing – VirtualMetric

Posted on January 5, 2021 by Alastair

The company name VirtualMetric might lead you to believe that the product is all about the performance numbers. While the product definitely has plenty of metrics, I don’t think that is the biggest differentiator. What stood out for me in the demo was how much more information was gathered and available in the highly configurable console. There was both slow-changing information, such as installed applications, patches, and operating system configuration, and faster-changing details such as running processes, network connections, and performance metrics. The main dashboard views are extensively customizable; the nicest customization is assembling a dashboard from any of the information collected and sharing that with other teams working on the same issue. There is also a mechanism to cycle through a series of dashboards on a timer, ideal for a large display in an operations center. Data collection is agentless; the VirtualMetric server pulls data across your network periodically. Static data such as installed applications is retrieved daily, log information more frequently, and select performance metrics as often as every second. Agentless data collection means that there is no requirement to deploy anything new onto your servers, but it limits data collection to what is published by the server’s operating system. A second challenge with agentless monitoring is that it has a higher network load, so different retrieval intervals for different data are essential.

I like that VirtualMetric provides a single consolidated location to find a lot of information about my infrastructure and applications. Because there is so much data, having customizable dashboards is good. As a customer, be sure to have subject matter experts author dashboards that enable less specialist staff to identify and resolve issues. I would like the product to do more with that information, such as analytics that proactively identify possible problems and recommend remediations. Ideally, I would like the product to allow me to simply accept the remediation, probably with integration into the corporate change management system. VirtualMetric has remediation actions on its roadmap, although that is over a year away. Right now, the product is read-only and will not make changes to the monitored network.

Posted in General | Comments Off

AWS Design – Automate your environment

Posted on December 7, 2020 by Alastair

In a long-distant former life, I looked after a farm of around a hundred Citrix servers. It was so long ago that they were physical servers, and we built or rebuilt servers following dozens of pages of written instructions. You can imagine that there were plenty of helpdesk calls for faults in the build of individual servers. This environment typifies the “handcrafted perfection” that Enterprise IT operations teams had to build. Even back in 2001, I created an automated build process for these servers to avoid manual builds. Manual processes work at the speed of humans, and they are full of human errors. To work faster and smarter, we need methods that are protected from human errors. Operational processes must be executed precisely the same every time. With manual processes, you must get it right every time. With automation, you only need to get it right once. Automating build and deployment is an excellent start as it will deliver consistent and reliable infrastructure on demand.

Ideally, infrastructure automation should be more like software development and use a declarative configuration service to implement the specification in a design file. The design file is the source code for the built environment and is stored in version control, just like the source code for the application. Once you can deploy an environment automatically, the environment is automatically recreatable. You might recreate an environment that broke rather than troubleshoot the problem. You might recreate rather than restore from backups. You should also create a copy for testing, for both changes to the environment and changes to the deployed software.
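To make the declarative idea concrete, here is a minimal sketch in Python. It is not how any particular AWS service works; the resource names, the dictionary layout, and the `plan_changes` function are all made up for illustration. The design file describes what should exist, and the tooling works out what to create, update, or delete to get there.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    create: dict  # resources in the design file but not yet deployed
    update: dict  # resources deployed with a different specification
    delete: list  # resources deployed but no longer in the design file


def plan_changes(desired: dict, actual: dict) -> Plan:
    """Compare the design file (desired) with what is deployed (actual)."""
    create = {name: spec for name, spec in desired.items() if name not in actual}
    update = {
        name: spec
        for name, spec in desired.items()
        if name in actual and actual[name] != spec
    }
    delete = sorted(name for name in actual if name not in desired)
    return Plan(create=create, update=update, delete=delete)


# Hypothetical design file contents, as they might be stored in version control.
desired = {
    "web": {"type": "instance", "size": "small", "count": 2},
    "queue": {"type": "queue"},
}
# Hypothetical current state of the environment.
actual = {
    "web": {"type": "instance", "size": "small", "count": 1},
    "old-cache": {"type": "cache"},
}

plan = plan_changes(desired, actual)
print(plan)
```

Running the same comparison against an empty `actual` gives a plan that creates everything, which is the "recreate rather than troubleshoot" case.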

With deployment automated, it becomes easy to build an environment for testing, then dispose of that environment when the testing is complete. I don’t mean at the end of the project, but at the end of the test. When we want to implement Continuous Integration and Continuous Delivery or Deployment (CI/CD), we need to run tests for each code change made by a developer. Without build automation, these tests simply cannot happen at the pace demanded by CI/CD methodologies. There is a strong argument to be made for keeping the infrastructure design file alongside the application source code, a single source location for both the application and the infrastructure that the application requires.
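A rough sketch of the "build, test, dispose" pattern, using a Python context manager. The `deploy`, `destroy`, and `run_tests` functions are placeholders I have invented to stand in for whatever your automation really calls; the point is only that teardown happens at the end of the test, even when the test fails.

```python
from contextlib import contextmanager

events = []  # stands in for real deployment logs


def deploy(name):
    """Placeholder for an automated environment build."""
    events.append(f"deploy {name}")
    return {"name": name}


def destroy(env):
    """Placeholder for an automated environment teardown."""
    events.append(f"destroy {env['name']}")


@contextmanager
def ephemeral_environment(name):
    env = deploy(name)
    try:
        yield env
    finally:
        # Always dispose of the environment, whether the tests pass or fail.
        destroy(env)


def run_tests(env):
    """Placeholder for the test suite run against the environment."""
    events.append(f"test {env['name']}")
    return True


with ephemeral_environment("ci-build-123") as env:
    passed = run_tests(env)

print(events)
```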

Infrastructure automation aims to increase the consistency and velocity of operations. Once the infrastructure lifecycle is automated, you unlock the ability to automate application lifecycles. With the ability to innovate in applications safely, business agility is easier and more reliable, providing greater value for overall IT spend.

Posted in General | Comments Off

Build Day TV – AWS Networking Fundamentals

Posted on October 24, 2020 by Alastair

If you are just starting out with AWS, you might find the networking a little different from what you are used to on-premises. Take a look at this video series we recently ran on Build Day Live; it was the first series of Build Day TV episodes. Most of the episodes are in two parts: a theoretical video and a hands-on demonstration.

AWS VPC networking Basics Series

Build Day TV publishes video episodes regularly, usually as a series of episodes on a single topic. The second series was our coverage of the Oracle Cloud VMware Service (OCVS). The latest series is about the VMware SD-WAN solution, formerly known as Velocloud.

Posted in General | Comments Off

AWS Design – Enable Scalability

Posted on September 22, 2020 by Alastair

One of the defining capabilities of public cloud is elasticity, the ability to use more or less resource over time to meet the load requirements of your application. When your application is quiet, you should consume and pay for fewer resources than when your application is busy. Not all AWS services have scalability built in; many services require that you manage your own scalability. Managed services like Lambda and Fargate manage capacity for you, delivering the resources that are required for your workload. More lightly managed services, such as EC2 and RDS, leave scalability up to you, although they may provide tooling like autoscaling that you can use.
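As a rough illustration of what "managing your own scalability" involves, here is a simplified proportional scaling rule in Python. This is my own sketch, not the algorithm AWS autoscaling uses; the function name, the CPU target, and the minimum and maximum values are assumptions chosen for the example.

```python
import math


def desired_capacity(current, metric, target, minimum, maximum):
    """Scale the instance count in proportion to how far the metric is from target.

    current: instances running now
    metric:  observed average load per instance (for example, CPU percent)
    target:  the load per instance we would like to hold
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    wanted = math.ceil(current * metric / target)
    # Never go below the floor or above the ceiling we are willing to pay for.
    return max(minimum, min(maximum, wanted))


# Busy: 4 instances at 90% CPU against a 60% target, so scale out.
print(desired_capacity(4, 90, 60, minimum=2, maximum=10))
# Quiet: 4 instances at 15% CPU, so scale in to the minimum.
print(desired_capacity(4, 15, 60, minimum=2, maximum=10))
```

The minimum keeps some capacity available when the application is quiet, and the maximum caps the bill when it is busy.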

Posted in General | Comments Off

Ten Design Principles on AWS

Posted on September 8, 2020 by Alastair

Having previously looked at some surprises I discovered as I learned about AWS, I’m going to take a look at some of the basic architectural design principles on AWS. As in the last series, there will be blog posts for each principle that go into some basic details. Here are the ten principles:

Enable scalability: What happens if demand increases? Or doesn’t increase? What if demand goes up and down over time?
Automate your environment: Computers are good at doing things the same every time; humans are not.
Use disposable resources: “Everything fails, all the time” (Werner Vogels). Replace broken things with brand new things, rather than spend a lot of time fixing them.
Loosely couple your components: When one element of your application changes or has an issue, the rest of the application should still work.
Design services, not servers: An EC2 instance should not be a single point of failure. Use several instances and a load balancer or a queue.
Choose the right database solutions: I don’t mean Microsoft SQL Server vs Oracle. I mean use the right database for the data you need to store; some data will be better in non-relational databases.
Understand your single points of failure: There are always SPOFs; make sure you know where they are and try to eliminate as many as possible.
Optimize for cost: Your AWS bill will arrive every month, and you will pay for what you use. Make sure you are getting value for every dollar spent on that AWS bill.
Use caching: Your data is not all of the same value or location, nor are resources of the same cost. Caching uses small amounts of resource that is fast or close.
Secure your infrastructure at every layer: “Dance like nobody’s watching, encrypt like everyone is” (Werner Vogels). By now, we should all understand that defense in depth is the only viable strategy.

Posted in General | Comments Off

AWS Surprises – AWS Has Virtually Infinite Resources

Posted on July 15, 2020 by Alastair

Sometimes the AWS surprises are not so much about how AWS is different, but how you design solutions differently on AWS than on-premises. One of the significant differences is that you have a near-infinite amount of resources available on AWS, while on-premises, you are always aware of a finite resource limit. On-premises your workload must fit inside those limited resources; on AWS, you can rent as much resource as your workload requires. One typical pattern on-premises is to defer reporting or bulk processing until off-peak hours, overnight when the office is empty. The office is never empty at AWS, so you might as well do that reporting or processing right away. The only time you might defer is if the spot price for the EC2 instance you want is too high.

As an example, there are plenty of problems that we solve by using a lot of compute resources to get a timely answer. On-premises we will have a limited quantity of CPU time and RAM, and these resources (servers) have a lifespan of 3-5 years, so buying extra resources that will only be used for part of their life is expensive. On-premises it is common to consume all these limited resources for a long time to complete some complex tasks; we may have to wait hours or days for an answer. On AWS, we rent CPU time and RAM as EC2 instances and pay by the hour for what we use. On AWS, we can scale out just for the duration of the job and use maybe 50x as much resource to get an answer faster. There is no cost difference between using 5 EC2 instances for 100 hours and 250 EC2 instances for two hours, since both are 500 instance-hours, so scaling out massively is an option.
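The arithmetic is easy to check. The sketch below assumes a flat hourly price per instance (the rate is a made-up figure, not a real EC2 price) and a job that parallelises perfectly, which real workloads rarely do.

```python
def job_cost(instances, hours, hourly_rate):
    """Cost of renting a number of identical instances for a number of hours."""
    return instances * hours * hourly_rate


HOURLY_RATE = 0.10  # hypothetical price per instance-hour

small_and_slow = job_cost(5, 100, HOURLY_RATE)
big_and_fast = job_cost(250, 2, HOURLY_RATE)

# Both jobs consume 500 instance-hours, so they cost the same.
print(small_and_slow, big_and_fast)
# The larger fleet finishes in 2 hours instead of 100.
print(100 / 2)
```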

Other near-infinite resources include storage, networking, and even application services. The Simple Storage Service (S3) allows you unlimited storage capacity and only charges you for what you actually store. The VPC network and its supporting features such as ELB provide colossal capacity that is available on-demand, and you are billed for consumption, not capacity. Even application services such as the Simple Queue Service (SQS) offer near-unlimited messages per second in a queue and only charge you for the transactions on that queue. There are a lot of AWS services that allow you to draw from a nearly limitless pool of resources and only pay for the resources that you use.

Capacity Is Never Infinite

One caveat is that while AWS has near-infinite capacity, there is always a finite amount, and, in some situations, that limited amount may not be as large as you might hope. When you start deploying unusual and new EC2 instance types, and particularly when you use them in their largest configurations, you may get Insufficient Compute Capacity Errors (ICCE, pronounced ice). Remember that each EC2 family and generation runs on its own dedicated physical servers: M5 instances only run on M5 servers, which in turn only run M5 instances. The larger the size within the family and the more instances you request, the more previously unused capacity is required. So, if you decide to deploy a cluster of six x1e.32xlarge instances across three availability zones, you may find that one of those AZs does not have two whole X1e hosts to dedicate to your cluster immediately. Hopefully, you have a good relationship with your local AWS team and can get this information before it causes you a problem. They may suggest that you use smaller instances and more of them, or that you will have a better result with a different region or a different EC2 family.
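One way to cope with a capacity error in your own automation is to try a list of acceptable alternatives in order. The Python sketch below simulates this without calling AWS at all: the exception class, the `fake_launch` function, the zone names, and the capacity map are all invented for the example. With the real EC2 API, I would expect to look for the `InsufficientInstanceCapacity` error code instead.

```python
class InsufficientCapacity(Exception):
    """Stand-in for a capacity error returned when launching an instance."""


def launch_with_fallback(launch, options):
    """Try each (instance_type, zone) option in order; return the first success."""
    tried = []
    for instance_type, zone in options:
        try:
            return launch(instance_type, zone)
        except InsufficientCapacity:
            tried.append(f"{instance_type} in {zone}")
    raise InsufficientCapacity("no capacity for: " + ", ".join(tried))


# Hypothetical capacity map: zone-c has no free x1e.32xlarge hosts.
FREE_CAPACITY = {("x1e.16xlarge", "zone-c"), ("x1e.32xlarge", "zone-a")}


def fake_launch(instance_type, zone):
    """Simulated launch call that fails when the capacity map has no entry."""
    if (instance_type, zone) not in FREE_CAPACITY:
        raise InsufficientCapacity(f"{instance_type} in {zone}")
    return {"type": instance_type, "zone": zone}


# Preferred option first, then a smaller size, then a different zone.
preferences = [
    ("x1e.32xlarge", "zone-c"),
    ("x1e.16xlarge", "zone-c"),
    ("x1e.32xlarge", "zone-a"),
]

instance = launch_with_fallback(fake_launch, preferences)
print(instance)
```

Falling back to a smaller size means you need more instances to do the same work, so the order of the preference list is a design decision, not just a retry policy.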

If you had on-demand access to a virtually infinite amount of computing resources, how would your IT and business operate differently? On AWS, resources are available, and you pay for what you use each month. To get the best out of AWS, you should deploy the resources you need as you need them, and cast off the implicit limitations of purchased on-premises IT.

Posted in General | 2 Comments

AWS Surprises – You still need infrastructure architecture on AWS

Posted on June 30, 2020 by Alastair

It is a popular idea that “the cloud means I don’t have to care”; however, nothing could be further from the truth. It isn’t really an AWS Surprise to me that infrastructure architecture is still essential for many customers on AWS. Naturally, there are many infrastructure elements that AWS manages; you don’t need to worry about racking and cabling servers or power and cooling. You do still need to choose VM resources (EC2 instance families and sizes) for each application component. You do need to design the network connectivity and isolation when you put together a VPC. Applications that ran on-premises, which you migrate to AWS, will require cloud infrastructure that replicates the on-premises infrastructure.

Similarly, applications built to on-premises architectures will require similar infrastructure on AWS. On-premises infrastructure architects can augment their skills to design infrastructure on AWS. Like any new platform, you will need to learn the capabilities and limitations of the AWS platform. You can find a few of the things I learned on my AWS Surprises page. One thing to prepare for: moving up the stack. Expect to learn more about application and integration architecture as the infrastructure becomes more of a commodity.

No Infrastructure

Not everything on AWS requires conventional infrastructure; more serverless application components mean less infrastructure. It is entirely possible to build large and complex applications on AWS without requiring a single EC2 instance or subnet. You can assemble services like Lambda, DynamoDB, and API Gateway, and even older services like S3, SQS, and SNS, into a microservices-based application without a single VM. These services do not exist in on-premises enterprise datacentres. Only applications developed specifically on AWS will use these services. With a fully serverless application, there is a large amount of application architecture to design rather than infrastructure architecture.
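To give a feel for what a serverless component looks like, here is a small sketch of a Python Lambda function that processes messages delivered from an SQS queue. The `handler(event, context)` signature and the `Records` and `body` fields follow the shape Lambda uses for SQS events; everything else is an assumption for illustration. In particular, the `ORDERS` dictionary stands in for a DynamoDB table so that the example runs without any AWS access, and the order fields are made up.

```python
import json

# Stands in for a DynamoDB table; a real function would write to the table instead.
ORDERS = {}


def handler(event, context):
    """Store each order message delivered by the queue."""
    processed = 0
    for record in event.get("Records", []):
        order = json.loads(record["body"])  # SQS delivers the message body as a string
        ORDERS[order["order_id"]] = order
        processed += 1
    return {"processed": processed}


# A hand-built event, shaped like an SQS delivery, for local testing.
sample_event = {
    "Records": [
        {"messageId": "1", "body": json.dumps({"order_id": "A1", "total": 25})},
        {"messageId": "2", "body": json.dumps({"order_id": "B2", "total": 40})},
    ]
}

result = handler(sample_event, None)
print(result, sorted(ORDERS))
```

There is no server, subnet, or instance size anywhere in this code; the design work is in the message format and how the components connect.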

Assumed Infrastructure

One thing to watch for is elements that are provided by on-premises infrastructure that are not automatically delivered by AWS. One example is data protection for backup/recovery, compliance, and disaster recovery. On AWS, these capabilities must be added to or configured for the services, where on-premises, they are often just a fundamental part of the infrastructure. Even if there is no infrastructure to design to support functional requirements, often there are non-functional requirements that the infrastructure team would usually handle.

Posted in General | Comments Off

New Zealand Is like the Boy in a Bubble

Posted on June 17, 2020 by Alastair

You may have seen the news: New Zealand has no active COVID-19 cases, and the coronavirus has been eliminated from New Zealand. As of Monday, 8 June, the last infected person had recovered, and it has been over three weeks since the last new case was diagnosed. We have moved from having some of the strictest lockdown rules to totally relaxed, at least within the country. There is almost no risk of COVID-19 transmission inside New Zealand, so we are now protecting ourselves at the border. Anybody arriving in New Zealand is subject to a two-week, government-controlled quarantine and a COVID test. We have very little immunity to COVID in New Zealand, only 1,100 or so confirmed cases out of five million people. We now live in a bubble, surrounded by countries that still have active transmission, and any breach of our bubble will cause us to go back to lockdown. We will not be safe to leave the bubble until other countries eliminate COVID or a vaccine is widespread.

Posted in General | 1 Comment

I Want Network Integration, I’m Not Getting It

Posted on June 4, 2020 by Alastair

I like having consistent management interfaces and having a single operational model across as much of my IT estate as possible. I don’t like point solutions that function or are managed differently; they add up to more problems. With this in mind, I would like to see far deeper network integration between AWS and VMware Cloud on AWS (VMC) even though I know why I won’t get this integration for a while. At Cloud Field Day 7, we had two sessions that focussed on network connectivity between AWS (AWS presentation) and VMC (VMware presentation); neither said it works the same as everything else they offer.

Posted in General | Comments Off
