AWS re:Invent 2015: Another Day, Another Billion Packets (YouTube)
VPC was originally created to help Amazon migrate their data centers to AWS.
EC2 originally just assigned new instances random IP addresses in the 10.44.x.x range. Every new instance got its own IP address, and there was no concept of unified subnets.
VPCs solve this! Every VPC gets a non-routable network that you choose. VPC subnets can be anything, so long as they (1) don’t overlap and (2) are entirely contained within the VPC network.
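The two subnet rules above can be checked mechanically. This is a minimal sketch using Python's standard `ipaddress` module; the function name `validate_subnets` and the example CIDR blocks are made up for illustration and are not from the talk.

```python
import ipaddress


def validate_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of rule violations; an empty list means the layout is valid."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []

    # Rule 2: every subnet must be entirely contained within the VPC network.
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is not contained in {vpc}")

    # Rule 1: no two subnets may overlap.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")

    return problems


if __name__ == "__main__":
    print(validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"]))
    print(validate_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.0.128/25"]))
```

The first call describes a valid layout; the second reports an overlap because 10.0.0.128/25 sits inside 10.0.0.0/24.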
VPCs are not VLAN-based - the VLAN ID field allows at most 4096 separate tags, far too few to give every VPC its own.
VPCs are identified internally with a 32-bit number (8 hex digits).
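The size difference between the two ID spaces is easy to work out. This is just arithmetic, assuming the standard 12-bit 802.1Q VLAN ID field and 4 bits per hex digit; the constant names are illustrative.

```python
VLAN_ID_BITS = 12        # width of the 802.1Q VLAN ID field
VPC_ID_HEX_DIGITS = 8    # digits in the identifier, per the note above

vlan_ids = 2 ** VLAN_ID_BITS          # number of distinct VLAN tags
vpc_id_bits = VPC_ID_HEX_DIGITS * 4   # each hex digit encodes 4 bits
vpc_ids = 2 ** vpc_id_bits            # number of distinct 8-hex-digit IDs

if __name__ == "__main__":
    print(vlan_ids)     # 4096
    print(vpc_id_bits)  # 32
    print(vpc_ids)      # 4294967296
```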
The VPC mapping service is basically an ARP cache on steroids that understands what EC2 instances and VPCs are and on which physical servers the instances are located. But the mapping service doesn’t just provide L2 information - servers receiving packets also verify that those packets are authentic (that the packet’s server/instance/VPC tuple matches an existing system).
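A toy model of that lookup-plus-verification behavior might look like the following. It is only a sketch of the idea described in the talk, not AWS's implementation: the class names, method names, and table layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Mapping:
    host_ip: str        # physical server that hosts the instance
    instance_mac: str   # the instance's virtual MAC address


class MappingService:
    """Hypothetical mapping table keyed by (VPC ID, instance IP)."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, str], Mapping] = {}

    def register(self, vpc_id: str, instance_ip: str, host_ip: str, instance_mac: str) -> None:
        self._table[(vpc_id, instance_ip)] = Mapping(host_ip, instance_mac)

    def lookup(self, vpc_id: str, instance_ip: str) -> Optional[Mapping]:
        # The ARP-like part: where does this instance live?
        return self._table.get((vpc_id, instance_ip))

    def verify(self, vpc_id: str, instance_ip: str, claimed_host_ip: str) -> bool:
        # The authenticity part: does the server/instance/VPC tuple exist?
        mapping = self.lookup(vpc_id, instance_ip)
        return mapping is not None and mapping.host_ip == claimed_host_ip


if __name__ == "__main__":
    svc = MappingService()
    svc.register("vpc-1a2b3c4d", "10.0.0.5", "192.0.2.10", "02:00:00:00:00:05")
    print(svc.lookup("vpc-1a2b3c4d", "10.0.0.5"))
    print(svc.verify("vpc-1a2b3c4d", "10.0.0.5", "192.0.2.99"))  # wrong server
```

Keying on the VPC ID as well as the instance IP is what lets two VPCs reuse the same private address range without colliding.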
The mapping service also functions as a “virtual gateway”, providing routing information between subnets. One interesting consequence of this is that, while in a traditional network inter- and intra-network routing looks slightly different, in AWS this routing is exactly the same once packets leave a physical server. L2 and L3 routing are essentially unified within AWS.
In practice, every server contains a dedicated system that caches data from the mapping service. In fact, these devices proactively cache from the mapping service as instances are spun up within a VPC. All queries are handled by these caches.
There are two types of caches - caches for individual hosts (EC2 instances), and caches for “edges”, which map to other networks. Direct Connect, VPNs, and internet gateways are implemented at edges.
Edges also function as a 1-to-1 NAT in their role as internet gateways.
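A 1-to-1 NAT simply swaps one private address for one public address in each direction. The sketch below illustrates that for a single VPC; the `Packet` and `OneToOneNat` names are hypothetical, and the addresses come from the documentation ranges.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class OneToOneNat:
    """Hypothetical 1-to-1 NAT: each private IP pairs with exactly one public IP."""

    def __init__(self) -> None:
        self._public_for: Dict[str, str] = {}
        self._private_for: Dict[str, str] = {}

    def associate(self, private_ip: str, public_ip: str) -> None:
        if private_ip in self._public_for or public_ip in self._private_for:
            raise ValueError("address is already mapped")
        self._public_for[private_ip] = public_ip
        self._private_for[public_ip] = private_ip

    def outbound(self, packet: Packet) -> Packet:
        # Leaving the VPC: rewrite the private source to its public address.
        # Raises KeyError if the source has no public address.
        return replace(packet, src=self._public_for[packet.src])

    def inbound(self, packet: Packet) -> Packet:
        # Entering the VPC: rewrite the public destination to the private address.
        # Raises KeyError if the destination is not a mapped public address.
        return replace(packet, dst=self._private_for[packet.dst])


if __name__ == "__main__":
    nat = OneToOneNat()
    nat.associate("10.0.0.5", "203.0.113.7")
    print(nat.outbound(Packet(src="10.0.0.5", dst="198.51.100.1")))
    print(nat.inbound(Packet(src="198.51.100.1", dst="203.0.113.7")))
```

Because the mapping is 1-to-1, no port translation or connection tracking is needed, unlike the many-to-one NAT found in a typical home router.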
Edge devices are called “Blackfoot”, after the South African Blackfoot Penguin.
The last non-AWS Amazon web server was deactivated on November 10, 2010. Since then, Amazon has run 100% on AWS. And Amazon uses the same EC2 instances as everyone else.
Edges can also now route to S3, enabling S3 buckets to be exposed privately within a VPC. Packets routed to S3 buckets configured in this fashion never traverse the public internet - they go out to an edge device and then are routed from there to S3.
S3 buckets can be restricted to particular VPCs, and EC2 instances can be restricted to accessing particular S3 buckets. All of this is done with the standard Amazon (“Aspen”) policy document structure.
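As an illustration of the bucket-side restriction, a bucket policy can deny any request that does not arrive from a given VPC by using the `aws:SourceVpc` condition key. The sketch below builds such a policy document as a Python dict; the helper name, the statement ID, the bucket name, and the VPC ID are placeholders, and a real policy should be checked against the current AWS documentation before use.

```python
import json


def vpc_only_bucket_policy(bucket_name, vpc_id):
    """Build a bucket policy that denies S3 access from outside one VPC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessFromOutsideVpc",  # placeholder statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # Deny whenever the request's source VPC is not the one named here.
                "Condition": {"StringNotEquals": {"aws:SourceVpc": vpc_id}},
            }
        ],
    }


if __name__ == "__main__":
    policy = vpc_only_bucket_policy("example-bucket", "vpc-1a2b3c4d")
    print(json.dumps(policy, indent=2))
```

The opposite direction, limiting which buckets instances in a VPC can reach, uses the same document structure attached to the VPC's S3 endpoint instead of the bucket.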