Graduate Program KB

Section 4 - High Availability & Scalability

  • Scalability is the ability of a system / app to adapt in order to handle greater loads

Vertical Scalability

  • Increasing size of an instance (ex. more compute power)
  • Common for non-distributed systems such as a database
  • RDS, ElastiCache are services that can scale vertically

Horizontal Scalability

  • Increasing number of instances
  • Common for distributed systems such as web apps
  • In EC2, you can use auto scaling groups and load balancers

High Availability

  • Usually complements horizontal scaling
  • High availability usually implies running your app in at least 2 data centres (availability zones)
  • High availability can be passive (ex. RDS Multi AZ) or active (ex. horizontal scaling)
  • In EC2, you can use ASG multi AZ and load balancer multi AZ

Load Balancers

  • Load balancers are servers that forward traffic to multiple servers downstream
  • Benefits:
    • Spread load across multiple downstream instances
    • Expose a single point of access (DNS) to your app
    • Easily handle failures of downstream instances
    • Regular health checks
    • Provide SSL termination (HTTPS) for your websites
    • Enforce stickiness with cookies
    • High availability across zones
    • Separate public traffic from private traffic

Elastic Load Balancer

  • An ELB is a managed load balancer
    • AWS guarantees its uptime, upgrades, maintenance and high availability
  • Setting up your own load balancer costs less, but takes a lot more effort on your end
  • Integrated with many other AWS services (ex. EC2, ECS, Certificate Manager, CloudWatch, Route 53, WAF, Global Accelerator, etc.)

Health Checks

  • Important for load balancers: they let the load balancer know whether the instances receiving the traffic are available to reply to requests
  • Health checks are done on a port and a route (/health is common)
  • If the response is not 200, then the instance is unhealthy
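
  A minimal boto3 sketch of a target group whose health check hits /health and expects a 200 (names, VPC ID and thresholds are placeholders, not values from these notes):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Create an ALB target group with an HTTP health check on /health
response = elbv2.create_target_group(
    Name="my-app-targets",                  # hypothetical target group name
    Protocol="HTTP",
    Port=80,                                # port the targets listen on
    VpcId="vpc-0123456789abcdef0",          # placeholder VPC
    HealthCheckProtocol="HTTP",
    HealthCheckPath="/health",              # route used for the health check
    HealthCheckIntervalSeconds=30,
    HealthyThresholdCount=2,
    UnhealthyThresholdCount=2,
    Matcher={"HttpCode": "200"},            # any other response code => unhealthy
)
target_group_arn = response["TargetGroups"][0]["TargetGroupArn"]
```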

Types of Load Balancer

  • Classic: HTTP, HTTPS, TCP, SSL
    • TCP / HTTP based health checks
    • Fixed hostname
  • Application: HTTP, HTTPS, WebSocket
    • Load balance multiple HTTP apps across machines (target groups) or multiple apps on the same machine (ex. containers)
      • Great for micro-services and container-based apps
    • Health checks are at the target group level
    • Port mapping feature to redirect to a dynamic port in ECS
    • Fixed hostname
    • App servers don't see the client IP directly; the true IP is inserted in the X-Forwarded-For header (see the sketch after this list)
  • Network: TCP, TLS, UDP
    • Handle millions of requests per second at very low latency
    • Has one static IP per AZ and supports assigning Elastic IP
    • No free tier in AWS
  • Gateway: IP
    • Deploy, scale and manage multiple 3rd party network virtual appliances
    • Combines transparent network gateway to provide a single entrance/exit for traffic and a load balancer to distribute the traffic
    • Uses GENEVE protocol on port 6081
    • Targets EC2 instances and private IP addresses
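
  A small sketch of how an app behind an ALB could recover the real client IP from X-Forwarded-For (a Flask toy app, purely illustrative):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/")
def whoami():
    # Behind an ALB, request.remote_addr is the load balancer's private IP;
    # the original client IP is the first entry in X-Forwarded-For.
    forwarded_for = request.headers.get("X-Forwarded-For", "")
    # The header may hold a chain of proxies: "client, proxy1, proxy2"
    client_ip = forwarded_for.split(",")[0].strip() if forwarded_for else request.remote_addr
    return f"Client IP: {client_ip}\n"

if __name__ == "__main__":
    app.run(port=8080)
```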

Sticky Sessions

  • Stickiness ensures that requests from the same client are always routed to the same instance behind the load balancer, therefore retaining their session data
  • Works for all load balancer types except gateway
  • The expiration date of the sticky cookie is managed through the CLB / ALB configuration
  • Enabling stickiness may introduce imbalance to the load over the backend EC2 instances
  • Application-based cookies
    • Custom
      • Generated by the target
      • Includes any custom attributes required by the app
      • Cookie name must be specified for each target group, don't use reserved names such as AWSALB, AWSALBAPP or AWSALBTG
    • Application
      • Generated by load balancer
      • Cookie name is AWSALBAPP
  • Duration-based cookies
    • Generated by load balancer
    • Cookie name is AWSALB (ALB) or AWSELB (CLB)
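
  A sketch of enabling duration-based stickiness on an ALB target group with boto3 (the ARN is a placeholder; for application-based stickiness you would use stickiness.type = "app_cookie" plus stickiness.app_cookie.cookie_name instead):

```python
import boto3

elbv2 = boto3.client("elbv2")

elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/my-app-targets/123",
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},                    # AWSALB cookie
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "86400"},  # 1 day
    ],
)
```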

Cross-Zone Load Balancing

  • With cross-zone load balancing, each load balancer node distributes traffic evenly across all registered instances in all AZs; without it, each node only distributes requests to the instances in its own AZ
  • Enabled by default in ALB (disabled in other types), no charges for inter-AZ data
  • NLB and GWLB have it disabled by default, and you pay for inter-AZ data if you enable it; CLB (also disabled by default) does not charge for inter-AZ data
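
  A sketch of turning cross-zone load balancing on for an NLB with boto3 (off by default; the load balancer ARN is a placeholder):

```python
import boto3

elbv2 = boto3.client("elbv2")

elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:...:loadbalancer/net/my-nlb/abc",
    Attributes=[
        # Distribute evenly across registered targets in all enabled AZs
        {"Key": "load_balancing.cross_zone.enabled", "Value": "true"},
    ],
)
```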

SSL/TLS

  • An SSL Certificate allows traffic between clients and your load balancers to be encrypted in transit
  • TLS certificates are what is actually used nowadays, but people still refer to them as SSL
  • Public SSL certificates are issued by Certificate Authorities (CAs); they have an expiration date you set and must be renewed
  • Load balancers use an X.509 certificate, manageable via AWS Certificate Manager (ACM)
    • Alternatively, you can use your own certificates
  • For an HTTPS listener:
    • Specify a default certificate
    • Add an optional list of certificates to support multiple domains
    • Clients can use a Server Name Indication to specify the target hostname
    • Ability to specify a security policy to support older versions of SSL/TLS
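
  A sketch of an HTTPS listener on an ALB with a default ACM certificate and a predefined security policy (all ARNs are placeholders; the policy name is one of AWS's predefined SSL policies):

```python
import boto3

elbv2 = boto3.client("elbv2")

elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:...:loadbalancer/app/my-alb/abc",
    Protocol="HTTPS",
    Port=443,
    SslPolicy="ELBSecurityPolicy-TLS13-1-2-2021-06",   # controls supported SSL/TLS versions
    Certificates=[{"CertificateArn": "arn:aws:acm:...:certificate/default-cert"}],
    DefaultActions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/my-app-targets/123",
    }],
)
```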

Server Name Indication (SNI)

  • SNI solves the issue of loading multiple SSL certificates onto a web server in order to serve multiple websites
  • Requires the client to indicate the hostname of the target server in the initial SSL handshake
    • Server will then find the correct certificate or return the default one
  • Only works for CloudFront, ALB and NLB

SSL Certificates for ELBs

  • CLB can only support one SSL certificate
    • Must use multiple CLB for multiple hostnames with multiple SSL certificates
  • ALB and NLB support multiple listeners with multiple SSL certificates, using SNI
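
  A sketch of attaching an additional certificate to an existing HTTPS listener so the ALB can serve a second domain via SNI (ARNs are placeholders):

```python
import boto3

elbv2 = boto3.client("elbv2")

elbv2.add_listener_certificates(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/abc/def",
    Certificates=[{"CertificateArn": "arn:aws:acm:...:certificate/other-domain-cert"}],
)
```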

Connection Draining

  • Feature naming:
    • Connection Draining for CLB, Deregistration Delay for ALB and NLB
  • The time to complete in-flight requests while the instance is de-registering or unhealthy
    • Between 1 and 3600 seconds (default of 300 seconds)
    • Can be disabled by setting value to 0
    • Set a low value for short requests
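
  A sketch of lowering the deregistration delay (connection draining) on an ALB/NLB target group for short-lived requests (ARN is a placeholder):

```python
import boto3

elbv2 = boto3.client("elbv2")

elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/my-app-targets/123",
    Attributes=[
        # Default is 300 seconds; setting 0 disables draining entirely
        {"Key": "deregistration_delay.timeout_seconds", "Value": "30"},
    ],
)
```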

Auto Scaling Group

  • Scale out (add) and scale in (remove) EC2 instances to match the required load
  • Ensure minimum/maximum number of EC2 instances running
  • Automatically register new instances to a load balancer
  • Re-create EC2 instance if a previous one is terminated (could be due to unhealthiness)
  • ASGs are free; you only pay for the underlying EC2 instances
  • Attributes:
    • Launch Template configurations:
      • AMI, Instance Type
      • EC2 User Data
      • EBS volumes
      • Security groups
      • SSH Key Pair
      • IAM Roles for instances
      • Network and subnet information
      • Load balancer information
    • Minimum, maximum and initial capacity
    • Scaling policies
  • Use CloudWatch to scale ASG by monitoring metrics such as Average CPU or a custom metric
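
  A sketch of a launch template plus an ASG that spans two subnets and registers instances with a target group (AMI ID, subnets, key pair and ARNs are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

# Launch template: what each new instance looks like
ec2.create_launch_template(
    LaunchTemplateName="my-app-template",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",
        "InstanceType": "t3.micro",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
        "KeyName": "my-key-pair",
    },
)

# ASG: capacity bounds, subnets (AZs) and load balancer registration
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="my-app-asg",
    LaunchTemplate={"LaunchTemplateName": "my-app-template", "Version": "$Latest"},
    MinSize=2,                      # minimum capacity
    MaxSize=6,                      # maximum capacity
    DesiredCapacity=2,              # initial capacity
    VPCZoneIdentifier="subnet-aaa,subnet-bbb",   # subnets across two AZs
    TargetGroupARNs=["arn:aws:elasticloadbalancing:...:targetgroup/my-app-targets/123"],
    HealthCheckType="ELB",          # use the load balancer's health checks
    HealthCheckGracePeriod=120,
)
```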

ASG Scaling Policies

  • Dynamic scaling
    • Target tracking
      • Simple setup
      • Ex. want the ASG CPU around 40%
    • Simple / step scaling
      • CloudWatch alarm triggered, CPU > 70%, add 2 units
      • CloudWatch alarm triggered, CPU < 30%, remove 1 unit
  • Scheduled scaling
    • Anticipate scaling based on usage patterns
  • Predictive scaling
    • Continuously forecast load and schedule scaling ahead
  • Good metrics:
    • CPUUtilization: Average CPU utilization across instances
    • RequestCountPerTarget: Ensure the number of requests per EC2 instance is stable
    • Average Network In / Out
    • Custom metric pushed using CloudWatch
  • Scaling cooldown period is 300 seconds by default
    • Will not launch or terminate instances during this period
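
  A sketch of a target tracking policy that keeps average ASG CPU around 40% (the ASG name matches the earlier sketch and is a placeholder):

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-app-asg",
    PolicyName="keep-cpu-around-40",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 40.0,   # the ASG scales out/in to hold this average
    },
)
```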