Elastic Load Balancing & Auto Scaling Groups
-
Scalability: Ability to accommodate a larger load by making hardware stronger (scale up) or by adding nodes (scale out)
- Vertical scalability: Implies non-distributed systems, such as a database
- Increasing size of the instance
- Horizontal scalability: Implies distributed systems
- Increasing the number of instances / systems for your application
-
High Availability: Running your application / system in at least 2 Availability Zones (AZs)
- Usually goes hand-in-hand with horizontal scaling
- Purpose is to mitigate damage from disasters, such as a data centre loss
-
Elasticity: Given a scalable system, elasticity implies there's auto-scaling such that the system scales based on the load
- Pay per use, match demand, optimised costs
-
Agility: New IT resources are easily attainable, reducing the time needed to make resources available to developers from weeks to minutes
Elastic Load Balancing
-
Load balancers: Servers that forward internet traffic to multiple EC2 instances downstream
-
Benefits of using a load balancer:
- Spread load across multiple downstream instances
- Expose a single point of access (DNS) to your application
- Easily handle failures of downstream instances
- Do regular health checks on your instances
- Provide SSL termination (HTTPS) for your websites
- High availability across zones
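
A minimal sketch of the first few benefits (spreading load, a single point of access, skipping failed instances), assuming a toy round-robin policy. The class `RoundRobinBalancer` and the instance IDs are hypothetical and only illustrate the idea; this is not how ELB is implemented.

```python
class RoundRobinBalancer:
    """Toy load balancer: one entry point that rotates over healthy backends."""

    def __init__(self, backends):
        self.backends = list(backends)     # all registered instances
        self.healthy = set(self.backends)  # instances currently passing health checks
        self._next = 0                     # index of the next backend to try

    def mark_unhealthy(self, backend):
        # Called when a health check fails; traffic stops going to this backend.
        self.healthy.discard(backend)

    def mark_healthy(self, backend):
        # Called when a registered backend passes its health check again.
        if backend in self.backends:
            self.healthy.add(backend)

    def route(self):
        """Return the next healthy backend in round-robin order."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])  # hypothetical instance IDs
first_pass = [lb.route() for _ in range(4)]
lb.mark_unhealthy("i-b")                        # simulate a failed health check
after_failure = [lb.route() for _ in range(3)]
```

Clients only ever call `route()`, so a failing instance is hidden from them as long as at least one backend stays healthy.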
-
Elastic Load Balancer (ELB): A managed load balancer
- AWS guarantees it will be working, takes care of upgrades, maintenance and high availability
- But AWS only provides a few configuration knobs
-
It costs less to set up your own load balancer, but it takes a lot more effort (maintenance, integrations)
-
AWS offers 4 kinds of load balancers:
- Application Load Balancer (HTTP / HTTPS / gRPC): Layer 7
- Useful scenarios: Need HTTP routing features or a static DNS name (static URL)
- Network Load Balancer (TCP / UDP / TLS): Layer 4
- High performance: can handle millions of requests per second with low latency
- Useful scenarios: Static IP through Elastic IP
- Gateway Load Balancer (GENEVE on IP packets): Layer 3
- Useful scenarios: Route traffic to firewalls that you manage on EC2 instances (classic firewall, intrusion detection, deep packet inspection)
- Classic Load Balancer (retired in 2023): Layer 4 & 7
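
The list above can be read as a lookup from protocol to load balancer type. A small sketch of that mapping, assuming the common abbreviations ALB, NLB and GWLB; the table `LOAD_BALANCERS` and the helper `pick_load_balancer` are hypothetical names used only for illustration.

```python
# Protocols and OSI layers as listed in the notes above.
LOAD_BALANCERS = {
    "ALB": {"layer": 7, "protocols": {"HTTP", "HTTPS", "gRPC"}},
    "NLB": {"layer": 4, "protocols": {"TCP", "UDP", "TLS"}},
    "GWLB": {"layer": 3, "protocols": {"GENEVE"}},
}


def pick_load_balancer(protocol):
    """Return the load balancer type that handles the given protocol."""
    wanted = protocol.lower()
    for name, info in LOAD_BALANCERS.items():
        # Compare case-insensitively so "grpc" and "gRPC" both match.
        if wanted in {p.lower() for p in info["protocols"]}:
            return name
    raise ValueError(f"no load balancer listed for protocol {protocol!r}")


choice_for_https = pick_load_balancer("HTTPS")
choice_for_udp = pick_load_balancer("udp")
```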
Auto Scaling Group
-
Load on websites and applications can change; in the cloud, you can create and remove servers quickly
-
Goal of Auto Scaling Group (ASG):
- Scale out (add EC2 instances) to match an increased load
- Scale in (remove EC2 instances) to match a decreased load
- Ensure we have a minimum and maximum number of machines running
- Automatically register new instances to a load balancer
- Replace unhealthy instances
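
The goals above can be sketched as a single reconciliation step: clamp the desired capacity between the minimum and maximum, drop unhealthy instances, and work out how many to launch or remove. The functions `clamp_capacity` and `reconcile` are hypothetical and deliberately simplified; a real ASG also handles cooldowns, AZ balancing and load balancer registration.

```python
def clamp_capacity(desired, minimum, maximum):
    """Keep the desired capacity within the group's min/max bounds."""
    return max(minimum, min(desired, maximum))


def reconcile(instances, desired, minimum, maximum):
    """Decide what to terminate and how many instances to launch.

    instances: dict mapping instance ID -> True if healthy, False otherwise.
    Returns (ids_to_terminate, number_to_launch).
    """
    target = clamp_capacity(desired, minimum, maximum)
    unhealthy = sorted(i for i, ok in instances.items() if not ok)
    healthy = sorted(i for i, ok in instances.items() if ok)
    surplus = healthy[target:]                   # scale in: healthy instances beyond the target
    to_launch = max(0, target - len(healthy))    # scale out, including replacements
    return unhealthy + surplus, to_launch


# Desired capacity of 5 is capped at the maximum of 4; one instance is unhealthy.
scale_out = reconcile({"i-1": True, "i-2": False, "i-3": True},
                      desired=5, minimum=1, maximum=4)

# Desired capacity of 0 is raised to the minimum of 1.
scale_in = reconcile({"i-1": True, "i-2": True, "i-3": True},
                     desired=0, minimum=1, maximum=4)
```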
-
Runs only at the optimal capacity, which saves costs
-
Scaling strategies:
- Manual Scaling: Update the size of an ASG manually
- Dynamic Scaling: Respond to changing demand
- Simple / Step Scaling:
- Ex. When load rises above an upper threshold, add servers; when it falls below a lower threshold, remove servers
- Target Tracking Scaling:
- Ex. Want the average ASG CPU to stay at around 40%
- Scheduled Scaling: Anticipate a scaling based on known usage patterns
- Ex. Increase minimum capacity to 10 at Friday 5pm
- Predictive Scaling: Using machine learning to predict future traffic
- Automatically provisions the correct number of EC2 instances in advance
- Useful when your load has predictable time-based patterns
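
To make the two dynamic strategies concrete, here is a rough sketch of each. The 70% / 30% thresholds and the add / remove step sizes are assumed values for illustration, and the proportional rule in `target_tracking` is a simplification, not the algorithm AWS actually uses; only the 40% CPU target comes from the notes.

```python
import math


def step_scaling(cpu_percent, current, upper=70.0, lower=30.0, add=2, remove=1):
    """Step scaling: react when a threshold is crossed (assumed thresholds)."""
    if cpu_percent > upper:
        return current + add               # scale out
    if cpu_percent < lower:
        return max(0, current - remove)    # scale in, never below zero
    return current                         # within the band: no change


def target_tracking(cpu_percent, current, target=40.0):
    """Target tracking: resize so average CPU moves toward the target.

    Simplified proportional rule: if 4 instances run at 80% and the target
    is 40%, roughly twice as many instances are needed.
    """
    return max(1, math.ceil(current * cpu_percent / target))


step_high = step_scaling(85.0, current=4)     # above the upper threshold
step_low = step_scaling(20.0, current=4)      # below the lower threshold
tracked = target_tracking(80.0, current=4)    # CPU is double the 40% target
```

Step scaling changes capacity by fixed amounts, while target tracking sizes the change to how far the metric is from the target.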