Scalability means that an application can handle greater loads by adapting.
Two Types: Vertical and Horizontal.
Vertical means increasing the size of the instance. Common for non-distributed systems.
Horizontal means increasing the number of instances / systems for your application. Common for distributed systems.
Scalability is linked to, but different from, High Availability.
High Availability goes hand in hand with horizontal scaling.
It means running your app / system in at least 2 Availability Zones.
The goal is to survive a data center loss.
In terms of EC2:
Vertical Scaling (scale up / down): from t2.nano (0.5 GiB of RAM, 1 vCPU) to u-12tb1.metal (12 TiB of RAM, 448 vCPUs).
Horizontal Scaling (scale out / in):
Auto Scaling Group.
Load Balancer.
High Availability: run instances of the same application across multiple AZs.
Auto Scaling Group multi AZ.
Load Balancer multi AZ.
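To make the multi-AZ idea concrete, here is a minimal sketch of spreading instances evenly across Availability Zones and checking whether the placement would survive the loss of one zone. The function names and the AZ labels ("az-a", "az-b", "az-c") are hypothetical and purely illustrative; this is not how an Auto Scaling Group is implemented internally.

```python
from collections import Counter
from itertools import cycle, islice


def spread_across_azs(instance_count, azs):
    """Assign instances to AZs round-robin so per-AZ counts differ by at most one."""
    return Counter(islice(cycle(azs), instance_count))


def survives_az_loss(placement):
    """True if losing any single AZ still leaves at least one running instance."""
    populated_azs = sum(1 for count in placement.values() if count > 0)
    return populated_azs >= 2


# Hypothetical AZ labels, for illustration only.
placement = spread_across_azs(5, ["az-a", "az-b", "az-c"])
single_az = spread_across_azs(3, ["az-a"])
```

With five instances over three zones, `placement` holds two instances in two of the zones and one in the third, so any single zone can fail. `single_az` puts everything in one zone and fails the check, which is exactly the situation High Availability is meant to avoid.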
Scalability vs Elasticity vs Agility
Scalability: ability to accommodate a larger load by making the hardware stronger (scale up), or by adding nodes (scale out).
Elasticity: once a system is scalable, elasticity means that there will be some "auto-scaling" so that the system can scale based on the load. This is "cloud-friendly": pay per use, match demand, optimise costs.
Agility: (a distractor, it is not related to scalability or elasticity). New IT resources are only a click away, which means that you reduce the time to make those resources available to your developers from weeks to just minutes.
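The difference between scalability and elasticity can be sketched in a few lines: a scalable system can run on more or fewer instances, while an elastic one picks the instance count automatically from the observed load. The rule below is a simplified proportional rule assumed for illustration (keep average CPU near a target), not the exact algorithm AWS Auto Scaling uses; the function name, the 50% target, and the size limits are all hypothetical.

```python
import math


def desired_capacity(current, cpu_percent, target_percent=50.0,
                     min_size=1, max_size=10):
    """Return the instance count that would bring average CPU near the target.

    Assumed rule: scale the current count by (observed CPU / target CPU),
    round up, then clamp to the allowed range [min_size, max_size].
    """
    raw = math.ceil(current * cpu_percent / target_percent)
    return max(min_size, min(max_size, raw))


scale_out = desired_capacity(current=4, cpu_percent=80)   # load above target
scale_in = desired_capacity(current=4, cpu_percent=20)    # load below target
capped = desired_capacity(current=4, cpu_percent=200)     # hits max_size
```

Scaling in when load drops is what makes this "pay per use": the system stops paying for capacity it no longer needs.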
ELB (Elastic Load Balancing)
Load balancers are servers that forward internet traffic to multiple servers (EC2 Instances) downstream.
Managed by AWS.
Why use one?
Spread out load across multiple downstream instances.
Expose single point of access (DNS) to your application.
Seamlessly handle failures of downstream instances.
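The three reasons above (spreading load, one point of access, handling instance failures) can be illustrated with a toy round-robin balancer. This is a minimal sketch under stated assumptions: the class `RoundRobinBalancer`, its methods, and the instance IDs are hypothetical, and a real ELB detects failures through health checks rather than explicit calls like `mark_unhealthy`.

```python
class RoundRobinBalancer:
    """Toy load balancer: one entry point that rotates over healthy targets."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)
        self._next = 0  # index of the next target to try

    def mark_unhealthy(self, target):
        """Stop routing to a target (stands in for a failed health check)."""
        self.healthy.discard(target)

    def mark_healthy(self, target):
        """Resume routing to a known target."""
        if target in self.targets:
            self.healthy.add(target)

    def route(self):
        """Return the next healthy target, skipping unhealthy ones."""
        for _ in range(len(self.targets)):
            target = self.targets[self._next]
            self._next = (self._next + 1) % len(self.targets)
            if target in self.healthy:
                return target
        raise RuntimeError("no healthy targets")


# Hypothetical instance IDs, for illustration only.
lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])
first_pass = [lb.route() for _ in range(3)]
lb.mark_unhealthy("i-b")
after_failure = [lb.route() for _ in range(4)]
```

Callers only ever talk to `lb.route()`, so when "i-b" fails the traffic continues over the remaining instances without the caller changing anything.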