AWS Deep Dive

AWS Well-Architected Framework

The Pillars of the Framework

Performance Efficiency

Design Principles

It’s a little unclear to me how “performance efficiency” is different from “cost optimization”, as all of the efficiency pillars are basically about reducing head count or discrete infrastructure deployments.

Best Practices

For compute, AWS options can be thought of as a hierarchy, with EC2 → ECS/EKS → Lambda, where each step simplifies deployment and maintenance needs at the expense of control over the application’s environment.

For storage, a similar hierarchy is probably EBS → EFS → S3, where each step behaves less and less like physical/local storage, but provides file access to more and more systems.


Basically, computing systems remain one place where trying to “keep up with the Joneses” is not a bad thing. The landscape changes fast, so it’s worth periodically re-evaluating whether the services you picked are still the best available fit.


A good point in here about the need to periodically test any alarm/monitoring solution.
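To make the alarm-testing point concrete for myself, here is a minimal sketch of an "alarm drill": inject a synthetic breach and check that a notification actually arrives. The `ThresholdAlarm` class and `run_alarm_drill` helper are hypothetical stand-ins I made up for illustration, not an AWS API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ThresholdAlarm:
    """Hypothetical stand-in for a monitoring alarm (not an AWS API)."""
    name: str
    threshold: float
    notify: Callable[[str], None]
    state: str = "OK"

    def evaluate(self, value: float) -> None:
        # Notify only on the OK -> ALARM transition.
        new_state = "ALARM" if value > self.threshold else "OK"
        if new_state == "ALARM" and self.state != "ALARM":
            self.notify(f"{self.name}: value {value} exceeded {self.threshold}")
        self.state = new_state


def run_alarm_drill(alarm: ThresholdAlarm, received: List[str]) -> bool:
    """Inject a synthetic breach and report whether a notification arrived."""
    before = len(received)
    alarm.evaluate(alarm.threshold + 1)  # synthetic breach
    alarm.evaluate(alarm.threshold - 1)  # reset so the drill leaves no stale ALARM
    return len(received) > before


inbox: List[str] = []
cpu_alarm = ThresholdAlarm("cpu-high", threshold=80.0, notify=inbox.append)
drill_passed = run_alarm_drill(cpu_alarm, inbox)
```

An alarm whose notification path silently drops messages would fail the same drill, which is exactly the failure you want to find before a real incident. On AWS itself, I believe the closest equivalent is CloudWatch's SetAlarmState call, which temporarily forces an alarm into a chosen state.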


Cost Optimization

Design Principles

As best I can determine, the difference between “performance efficiency” and “cost optimization” is what metrics you’re monitoring…?

Best Practices
Practice Cloud Financial Management

Honestly, it’s not clear what this means outside of “make sure that incentives are aligned and people are using the tools at their disposal,” which is kind of a pedestrian observation.

Expenditure and Usage Awareness

Again, this section seems somewhat pedestrian: “Make sure that costs are attributed internally and continuously monitor budgets.” This is very much something that every organization should be doing.

Additional points here about tying costs to (team) objectives, and ensuring that infrastructure ownership includes end-of-life decommissioning. Again, this all seems a bit pedestrian.

Perhaps the big innovation that AWS brings here is that cost reporting and monitoring can be incredibly granular — think hour-by-hour.
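As a rough illustration of what internal cost attribution looks like in practice, here is a small sketch that rolls hourly cost records up by owning team and flags spend that nobody owns. The records, team names, and budgets are all made up, and real data would come from whatever billing export you use.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (hour, owning team tag or None, cost in USD): made-up illustrative records
CostRecord = Tuple[str, Optional[str], float]

records: List[CostRecord] = [
    ("2024-01-01T00", "search", 1.25),
    ("2024-01-01T00", "billing", 0.50),
    ("2024-01-01T01", "search", 2.75),
    ("2024-01-01T01", None, 0.75),  # untagged resource: nobody owns this spend
]


def cost_by_team(rows: List[CostRecord]) -> Dict[str, float]:
    """Sum cost per team, collecting untagged spend under UNATTRIBUTED."""
    totals: Dict[str, float] = defaultdict(float)
    for _hour, team, cost in rows:
        totals[team or "UNATTRIBUTED"] += cost
    return dict(totals)


def over_budget(totals: Dict[str, float], budgets: Dict[str, float]) -> List[str]:
    """Return teams whose spend exceeds budget; a missing budget counts as zero."""
    return sorted(t for t, spent in totals.items() if spent > budgets.get(t, 0.0))


totals = cost_by_team(records)
flagged = over_budget(totals, {"search": 3.0, "billing": 1.0})
```

Treating a missing budget as zero is a deliberate choice in this sketch: it makes unattributed spend show up as a problem instead of slipping through.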

Cost-Effective Resources

(Note that it’s implied here that most — if not all — AWS services are wrappers around some combination of EC2, EBS, and S3. I suspect that this isn’t 100% the case, but it’s almost certainly true more often than not.)

Again, regular architectural reviews are important!

Manage Demand and Supply Resources

Shorter section: Asynchronous batch- and queue-based processing is good, and you should use it.
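A toy simulation helped me see why queues matter here. Without a queue, capacity has to be sized for the peak; with one, a small fixed pool can work through a burst over time. The workload numbers and the `drain_with_queue` helper below are invented for illustration.

```python
from typing import List, Tuple


def drain_with_queue(arrivals: List[int], capacity: int) -> Tuple[int, int]:
    """Process up to `capacity` jobs per tick; return (max backlog, ticks needed)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    pending = list(arrivals)
    backlog = 0
    max_backlog = 0
    ticks = 0
    while pending or backlog:
        if pending:
            backlog += pending.pop(0)      # new jobs join the queue
        backlog -= min(backlog, capacity)  # workers take what they can
        max_backlog = max(max_backlog, backlog)
        ticks += 1
    return max_backlog, ticks


arrivals = [0, 0, 12, 0, 0, 0]  # made-up bursty workload, jobs per tick
peak_capacity = max(arrivals)   # synchronous handling must size for the peak
queued_capacity = 2             # a queue lets a small fixed pool catch up
max_backlog, ticks = drain_with_queue(arrivals, queued_capacity)

total_jobs = sum(arrivals)
utilization_sync = total_jobs / (peak_capacity * len(arrivals))
utilization_queued = total_jobs / (queued_capacity * ticks)
```

In this example the queued pool is one sixth the size of the peak-sized one and runs at 75% utilization, at the price of some jobs waiting a few ticks. That trade-off is only acceptable when the work is genuinely asynchronous.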

Optimize Over Time

More importantly, are you periodically evaluating your existing services?


Sustainability

So, this is about environmental sustainability… Though surely energy and resource consumption are already implicitly integrated into service pricing?

Design Principles

Unsurprisingly, less is generally more when it comes to computing — the fewer resources you utilize (instances, storage, transactions), generally the less your environmental impact. Unlike many personal usage decisions, however, there’s a benefit again to continually upgrading your infrastructure: Newer services, instance types, etc. are generally more resource-efficient than older implementations.

Really, most of this will be reflected in ordinary pricing. The only thing the sustainability options add is a potential bias towards upgrading more quickly, and a different/additional set of KPIs to track.

Oh, and minimize the need for end-users to upgrade or use high-powered hardware. It’d be nice if more companies thought of this bit.

Best Practices
Region Selection

The gist here is to choose regions where energy is produced more sustainably, either because the grid in general is hooked into more sustainable energy sources, or because Amazon has built its own renewable energy projects nearby. I wonder if this is listed in any obvious place when selecting a region to spin up a resource in from the AWS Console…

User Behavior Patterns

That said, all the suggestions in this section are also ones that can be made from a cost optimization or user experience perspective.

Software and Architecture Patterns

Once again: Batching and queuing are good (they smooth out resource usage, and let fewer resources operate at higher utilization), requiring customers to upgrade their devices is bad. Also, continually re-analyze and re-optimize your infrastructure.

These are again all suggestions that could live in other sections — there’s not really anything new here, just additional justifications for existing best practices.

Data Patterns

Less data → more sustainable. To this I’d add “more secure” too (you can’t leak what you don’t have).

Also, slower data storage methods are generally more energy efficient, so offloading data to the slowest acceptable storage medium helps.
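The "slowest acceptable storage" rule can be sketched as a simple selection: given how long a consumer can wait for the data, pick the slowest tier that still meets that limit. The tier names and retrieval times below are hypothetical placeholders, not real AWS storage classes or figures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    retrieval_seconds: float  # worst-case wait for the data (illustrative only)


# Hypothetical tiers, fastest to slowest; the numbers are placeholders.
TIERS: List[Tier] = [
    Tier("hot", 0.1),
    Tier("warm", 1.0),
    Tier("cold", 300.0),
    Tier("archive", 43200.0),
]


def slowest_acceptable(tiers: List[Tier], max_wait_seconds: float) -> Optional[Tier]:
    """Return the slowest tier whose retrieval time fits the limit, else None."""
    candidates = [t for t in tiers if t.retrieval_seconds <= max_wait_seconds]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t.retrieval_seconds)


audit_logs_tier = slowest_acceptable(TIERS, max_wait_seconds=600.0)
```

Here data that can tolerate a ten-minute wait lands in the "cold" tier. A `None` result means no tier is fast enough, which is a signal to revisit the requirement instead of silently defaulting to the fastest option.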

Hardware Patterns

Interesting to note that specialized EC2 instances (for example, ML-optimized instances) are generally more energy efficient for the given task, not just faster, than general-purpose instances. Also, the further you operate down the EC2 → ECS/EKS → Lambda pipeline, the more AWS auto-optimizes and auto-upgrades back-end capacity for energy efficiency. (Of course, you also start to lose autonomy, especially with the ECS/EKS → Lambda shift…)

Development and Deployment Patterns

Managed device farms for testing keep coming up over the last few sections… I wonder if this is another AWS offering?

The Review Process

The review of architectures needs to be done in a consistent manner, with a blame-free approach that encourages diving deep. It should be a lightweight process (hours not days) that is a conversation and not an audit.


Interesting terminology:

I tend to think of this personally as “keeping my options open”.

Often, we find that reviews are the first time that a team truly understands what they have implemented.


Shorter section: Review early, review often.