Graduate Program KB

Elastic Load Balancing & Auto Scaling Groups

  • Scalability: Ability to accommodate a larger load by making hardware stronger (scale up) or by adding nodes (scale out)

    • Vertical scalability: Implies non-distributed systems, such as a database
      • Increasing size of the instance
    • Horizontal scalability: Implies distributed systems
      • Increasing the number of instances / systems for your application
  • High Availability: Running your application / system in at least 2 Availability Zones (AZs)

    • Usually goes hand-in-hand with horizontal scaling
    • Purpose is to mitigate damage from disasters, such as a data centre loss
  • Elasticity: Once a system is scalable, elasticity means it auto-scales to match the load

    • Pay per use, match demand, optimised costs
  • Agility: New IT resources are easily attainable, reducing the time needed to make resources available to developers from weeks to minutes
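
To make the "pay per use, match demand" point concrete, here is a minimal sketch comparing a fleet sized for peak load with one that is resized every hour. The hourly load figures and the per-instance capacity are made-up numbers for illustration only, and `instances_needed` is a hypothetical helper, not an AWS API.

```python
import math

def instances_needed(load_rps: int, capacity_per_instance_rps: int) -> int:
    """Smallest instance count that can serve the load (never below 1)."""
    return max(1, math.ceil(load_rps / capacity_per_instance_rps))

# Hypothetical hourly load (requests per second) and per-instance capacity.
hourly_load_rps = [100, 100, 400, 900, 400, 100]
CAPACITY_RPS = 200

# Fixed provisioning: size the fleet for the peak hour and keep it all day.
peak_fleet = max(instances_needed(load, CAPACITY_RPS) for load in hourly_load_rps)
fixed_instance_hours = peak_fleet * len(hourly_load_rps)

# Elastic provisioning: resize the fleet every hour to match demand.
elastic_fleet = [instances_needed(load, CAPACITY_RPS) for load in hourly_load_rps]
elastic_instance_hours = sum(elastic_fleet)

print(elastic_fleet)                                 # [1, 1, 2, 5, 2, 1]
print(fixed_instance_hours, elastic_instance_hours)  # 30 12
```

Under these assumed numbers the elastic fleet uses 12 instance-hours instead of 30, which is the cost saving that elasticity is meant to capture.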

Elastic Load Balancing

  • Load balancers: Servers that forward internet traffic to multiple EC2 instances downstream

  • Benefits of using a load balancer:

    • Spread load across multiple downstream instances
    • Expose a single point of access (DNS) to your application
    • Easily handle failures of downstream instances
    • Run regular health checks on your instances
    • Provide SSL termination (HTTPS) for your websites
    • High availability across zones
  • Elastic Load Balancer (ELB): A managed load balancer

    • AWS guarantees it will be working, takes care of upgrades, maintenance and high availability
    • But AWS only provides a few configuration knobs
  • Setting up your own load balancer costs less, but takes much more effort (maintenance, integrations)

  • AWS offers 4 kinds of load balancers:

    • Application Load Balancer (HTTP / HTTPS / gRPC): Layer 7
      • Useful scenarios: You need HTTP routing features or a static DNS name (static URL)
    • Network Load Balancer (TCP / UDP / TLS): Layer 4
      • High performance: can handle millions of requests per second with low latency
      • Useful scenarios: You need a static IP (through an Elastic IP)
    • Gateway Load Balancer (GENEVE on IP packets): Layer 3
      • Useful scenarios: Route traffic to firewalls that you manage on EC2 instances (classic firewall, intrusion detection, deep packet inspection)
    • Classic Load Balancer (previous generation, superseded by ALB and NLB): Layer 4 & 7
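
The core behaviour listed above (spread load, health-check instances, route around failures) can be sketched as a toy round-robin balancer. This is only an illustration of the idea, not how ELB is implemented; the class, its method names, and the instance IDs are all hypothetical.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over targets and skips unhealthy ones."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)  # assume every target starts healthy
        self._index = 0

    def health_check(self, probe):
        """Re-evaluate all targets; `probe(target)` returns True when healthy."""
        self.healthy = {t for t in self.targets if probe(t)}

    def route(self):
        """Return the next healthy target in rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy targets")
        # At least one target is healthy, so this loop always returns.
        for _ in range(len(self.targets)):
            target = self.targets[self._index]
            self._index = (self._index + 1) % len(self.targets)
            if target in self.healthy:
                return target

# Hypothetical instance IDs, used only for this example.
lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])
before = [lb.route() for _ in range(4)]

# Simulate a failed health check on "i-b": traffic now skips it.
lb.health_check(lambda target: target != "i-b")
after = [lb.route() for _ in range(4)]

print(before)  # ['i-a', 'i-b', 'i-c', 'i-a']
print(after)   # ['i-c', 'i-a', 'i-c', 'i-a']
```

Clients keep calling the same `route()` entry point throughout, which mirrors the single point of access a load balancer exposes while instances come and go behind it.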

Auto Scaling Group

  • Load on websites and applications changes over time; in the cloud you can create and remove servers quickly

  • Goal of Auto Scaling Group (ASG):

    • Scale out (add EC2 instances) to match an increased load
    • Scale in (remove EC2 instances) to match a decreased load
    • Ensure we have a minimum and maximum number of machines running
    • Automatically register new instances to a load balancer
    • Replace unhealthy instances
  • Runs only the capacity needed at any given time, which saves costs

  • Scaling strategies:

    • Manual Scaling: Update the size of an ASG manually
    • Dynamic Scaling: Respond to changing demand
      • Simple / Step Scaling:
        • Ex. Define an upper and a lower load threshold; add servers when load rises above the upper one, remove servers when it falls below the lower one
      • Target Tracking Scaling:
        • Ex. Keep the average CPU utilisation across the ASG at around 40%
      • Scheduled Scaling: Anticipate scaling needs based on known usage patterns
        • Ex. Increase the minimum capacity to 10 at 5 pm on Fridays
    • Predictive Scaling: Uses machine learning to forecast future traffic
      • Automatically provisions the right number of EC2 instances in advance
      • Useful when your load has predictable time-based patterns
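
The dynamic scaling ideas above can be sketched in a few lines. This is a simplified model, not the algorithm AWS runs: the proportional rule in `target_tracking_capacity`, the 70% / 30% thresholds in `step_adjustment`, and both function names are assumptions chosen for illustration.

```python
import math

def target_tracking_capacity(current_capacity, current_metric, target_metric,
                             min_size, max_size):
    """Assumed proportional rule: resize so the average metric lands near the
    target, then clamp the result to the group's minimum and maximum size."""
    raw = math.ceil(current_capacity * current_metric / target_metric)
    return max(min_size, min(max_size, raw))

def step_adjustment(cpu_percent, high=70.0, low=30.0):
    """Simple scaling with hypothetical thresholds: +1 instance above `high`,
    -1 instance below `low`, no change in between."""
    if cpu_percent > high:
        return 1
    if cpu_percent < low:
        return -1
    return 0

# Target tracking towards 40% average CPU, with a group sized between 2 and 10.
scale_out = target_tracking_capacity(4, 80.0, 40.0, min_size=2, max_size=10)
scale_in = target_tracking_capacity(4, 10.0, 40.0, min_size=2, max_size=10)
capped = target_tracking_capacity(8, 90.0, 40.0, min_size=2, max_size=10)

print(scale_out, scale_in, capped)  # 8 2 10
print(step_adjustment(85.0), step_adjustment(50.0), step_adjustment(10.0))  # 1 0 -1
```

The clamping step reflects the ASG goal of always staying between a minimum and maximum number of machines, whatever the scaling policy asks for.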