The design principle to use caching is not simply an AWS principle; it is a common application design principle. A cache is a temporary storage location for a small amount of data that improves application performance. Sometimes the cache is distributed around the world to be close to users, in which case it might be called a content delivery network. Other times the cache is simply extra memory in your web or application servers that holds some status information about currently active users. The idea that a cache is temporary is important: it is not the persistent storage location for the data. If the cache gets lost, the data in the cache can be re-created from a persistent location. The idea that the cache is not the definitive source is also important: data in the cache represents a copy of the persistent data at some point in the past. Some data can stay in the cache for a long time; the temperature recorded at 10 am yesterday will never change. The current temperature will change, so it shouldn’t be kept in cache for long and should have a short Time To Live (TTL).
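To make the TTL idea concrete, here is a minimal sketch of an in-memory cache where each entry carries its own expiry. The class name `TTLCache`, its methods, and the TTL values are all hypothetical choices for illustration, not any particular library's API.

```python
import time


class TTLCache:
    """Tiny in-memory cache where each entry expires after its own TTL."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds):
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None  # never cached
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the stale copy so the caller goes back
            # to the persistent source for a fresh value.
            del self._entries[key]
            return None
        return value


temperatures = TTLCache()
# The current temperature changes, so it gets a short TTL (assumed 60 s).
temperatures.put("current", 21.5, ttl_seconds=60)
# Yesterday's 10 am reading never changes, so it can stay much longer.
temperatures.put("yesterday-10am", 18.0, ttl_seconds=30 * 24 * 3600)
```

A `None` from `get` means the caller should re-read the value from the persistent location, which is exactly why losing the cache loses no data.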
There are a few trigger points for considering adding a cache, usually centered around needing more application performance for transactional (rather than analytics) workloads. If increasing the performance of the database or other persistence tier seems inefficient, perhaps because you feel that you are not getting consistent value for money, then caching might be a good option. This can also be a trigger for considering a different database for a subset of the data, as I mentioned in a previous principle. As an infrastructure person, I am used to providing transparent caches, where the application code is unaware of the cache. But software developers often use explicit caches, where the application code makes choices about what data to place in the cache and when to update or remove cached data. On AWS, the ElastiCache service provides RAM-based caching that developers can choose to use within their application. Because it is an explicit cache, the application developer chooses what data to cache and whether to write to the cache on database updates or only on reads. It takes a lot of developer effort to get the most out of ElastiCache, but the potential performance improvement is huge.
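The choice an explicit cache puts in the developer's hands can be sketched as follows. This is an illustration under stated assumptions: plain Python dictionaries stand in for both the database and the RAM-based cache, and the `CachedStore` class and its `write_on_update` flag are hypothetical names, not part of ElastiCache or any client library.

```python
class CachedStore:
    """Explicit cache in front of a persistent store (both are plain dicts here)."""

    def __init__(self, database, cache, write_on_update=False):
        self.database = database  # stand-in for the persistent tier
        self.cache = cache        # stand-in for a RAM-based cache
        self.write_on_update = write_on_update
        self.database_reads = 0   # counts how often the cache was missed

    def read(self, key):
        # Populate on read: only touch the database on a cache miss.
        if key in self.cache:
            return self.cache[key]
        self.database_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value

    def update(self, key, value):
        # The database is always the definitive source.
        self.database[key] = value
        if self.write_on_update:
            # Write to the cache on updates, so the next read is a hit.
            self.cache[key] = value
        else:
            # Cache only on reads: drop the stale copy and let the
            # next read reload it from the database.
            self.cache.pop(key, None)


store = CachedStore({"user:1": "alice"}, cache={})
store.read("user:1")          # miss: loaded from the database
store.read("user:1")          # hit: served from the cache
store.update("user:1", "bob")  # stale cached copy is removed
```

Writing on update costs extra cache writes for data that may never be read again, while caching only on reads makes the first read after every update slower. Which is better depends on the application's read and write pattern, which is part of the developer effort mentioned above.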
Caching is an important tool for improving application performance, everywhere from the end access device (user’s laptop or phone), through the application servers, to the persistent storage at the back. Efficient use of caching does require good design, and the more application awareness you bring to that design, the more efficiently you can use the expensive cache. Allocating excess RAM to application servers is a simple but inefficient way to provide caching, particularly for applications that you cannot get rewritten.
© 2021, Alastair. All rights reserved.