Section 4 - High Availability & Scalability

High availability usually goes hand in hand with horizontal scaling
High availability usually implies running your app in at least 2 data centres (availability zones)
High availability can be passive (ex. RDS Multi AZ) or active (ex. horizontal scaling)
In EC2, you can use ASG multi AZ and load balancer multi AZ

An ELB is a managed load balancer
- AWS guarantees its uptime, upgrades, maintenance and high availability
Setting up your own load balancer is cheaper, but it takes a lot more effort to run and maintain
Integrated with many other AWS services (ex. EC2, ECS, Certificate Manager, CloudWatch, Route 53, WAF, Global Accelerator, etc.)

Health checks are crucial for load balancers: they tell the load balancer whether the instances receiving traffic are available to reply to requests
Health checks are done on a port and a route (/health is common)
If the response is not 200, then the instance is unhealthy
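
A minimal sketch of the health check rule above, using only the Python standard library. The function names and the example URL are assumptions made for illustration; a real load balancer also applies intervals and consecutive-failure thresholds that are not modelled here.

```python
import urllib.error
import urllib.request


def is_healthy(status_code):
    """Apply the rule from the notes: only a 200 response counts as healthy."""
    return status_code == 200


def probe(url, timeout=2.0):
    """Send one GET to a health route and report whether it answered 200.

    `url` is a placeholder such as http://10.0.0.5:8080/health.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(response.status)
    except urllib.error.HTTPError as err:
        # Error responses (e.g. 500, 503) surface here; err.code is the status.
        return is_healthy(err.code)
    except OSError:
        # Connection refused, timeout, DNS failure: nothing answered the check.
        return False
```

Calling `probe("http://10.0.0.5:8080/health")` against a hypothetical instance would return False whenever the port is closed or the route answers with anything other than 200.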

Classic: HTTP, HTTPS, TCP, SSL
- TCP / HTTP based health checks
- Fixed hostname
Application: HTTP, HTTPS, WebSocket
- Load balance multiple HTTP apps across machines (target groups) or on the same machine (ex. containers)
  - Great for micro-services and container-based apps
- Health checks are at the target group level
- Port mapping feature to redirect to a dynamic port in ECS
- Fixed hostname
- App servers don't see client IP directly, the true IP is inserted in the X-Forwarded-For header
Network: TCP, TLS, UDP
- Handle millions of requests per second at very low latency
- Has one static IP per AZ and supports assigning Elastic IP
- No free tier in AWS
Gateway: IP
- Deploy, scale and manage multiple 3rd party network virtual appliances
- Combines transparent network gateway to provide a single entrance/exit for traffic and a load balancer to distribute the traffic
- Uses GENEVE protocol on port 6081
- Targets EC2 instances and private IP addresses
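
The X-Forwarded-For note in the Application load balancer entry above can be sketched as follows. This is an illustrative helper, not part of any AWS SDK: the function name is hypothetical, headers are assumed to arrive as a dict with lower-cased names, and the load balancer is assumed to be the only proxy in front of the app and to append the address it saw.

```python
def client_ip(headers, peer_address):
    """Return the client IP as seen by the load balancer.

    headers: dict of request headers with lower-cased names (assumption).
    peer_address: IP of the TCP peer, which is the load balancer itself
    when the app sits behind an ALB.
    """
    forwarded = headers.get("x-forwarded-for", "")
    # The header is a comma-separated list; each proxy appends the address it saw.
    entries = [part.strip() for part in forwarded.split(",") if part.strip()]
    if not entries:
        # No header: the request did not come through the load balancer.
        return peer_address
    # Assumption: the ALB is the only proxy, so the last entry is the one it added.
    return entries[-1]
```

Earlier entries in the list may have been supplied by the client, which is why this sketch trusts only the last one.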

Stickiness ensures clients are redirected to the same instance behind load balancers, therefore, retaining their session data
Works for all load balancer types except gateway
For CLB and ALB, stickiness relies on a cookie whose expiration date you control
Enabling stickiness may introduce imbalance to the load over the backend EC2 instances
Application-based cookies
- Custom
  - Generated by the target
  - Includes any custom attributes required by the app
  - Cookie name must be specified for each target group, don't use reserved names such as AWSALB, AWSALBAPP or AWSALBTG
- Application
  - Generated by load balancer
  - Cookie name is AWSALBAPP
Duration-based cookies
- Generated by load balancer
- Cookie name is AWSALB (ALB) or AWSELB (CLB)
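
A rough simulation of duration-based stickiness, to show how a cookie pins a client to one target until it expires. The class and its cookie format are assumptions for illustration: a real AWSALB cookie value is opaque, whereas here it is a plain (target, expiry) tuple.

```python
import itertools


class StickyBalancer:
    """Toy load balancer that issues a duration-based sticky cookie."""

    def __init__(self, targets, duration_seconds):
        self.targets = list(targets)
        self.duration = duration_seconds
        self._round_robin = itertools.cycle(self.targets)

    def route(self, cookies, now):
        """Return (target, cookies_to_set) for a request arriving at time `now`."""
        sticky = cookies.get("AWSALB")
        if sticky is not None:
            target, expires_at = sticky
            # Honour the cookie only while it is unexpired and the target exists.
            if now < expires_at and target in self.targets:
                return target, {}
        # No usable cookie: pick the next target and issue a fresh cookie.
        target = next(self._round_robin)
        return target, {"AWSALB": (target, now + self.duration)}
```

Because returning clients skip the round-robin step, a few heavy clients can keep one target busier than the others, which is the imbalance mentioned above.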

With cross-zone load balancing, each load balancer node distributes traffic evenly across all registered instances in all AZs; without it, each node distributes requests only across the instances in its own AZ
Enabled by default in ALB (disabled by default in the other types), with no charges for inter-AZ data
In NLB and GWLB, you pay for inter-AZ data if it is enabled; in CLB you do not
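
A small worked example of what cross-zone load balancing changes. It assumes one load balancer node per AZ and that clients spread requests equally across those nodes; the function name and AZ labels are made up for illustration.

```python
def traffic_share(instances_per_az, cross_zone):
    """Return the fraction of total traffic each instance receives, per AZ.

    instances_per_az: dict mapping AZ name to its number of registered
    instances (each count must be at least 1).
    """
    total_instances = sum(instances_per_az.values())
    # Each AZ's node is assumed to receive an equal slice of client traffic.
    node_share = 1 / len(instances_per_az)
    shares = {}
    for az, count in instances_per_az.items():
        if cross_zone:
            # Every instance, in any AZ, gets the same share.
            shares[az] = 1 / total_instances
        else:
            # A node only spreads its slice over instances in its own AZ.
            shares[az] = node_share / count
    return shares
```

With 2 instances in one AZ and 8 in another, every instance receives 10% of traffic when cross-zone is on. When it is off, each of the 2 instances receives 25% while each of the 8 receives 6.25%.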

An SSL Certificate allows traffic between clients and your load balancers to be encrypted in transit
TLS certificates are mainly used but people still refer to it as SSL
Public SSL certificates are issued by Certificate Authorities; they have an expiration date (which you set) and must be renewed
Load balancers use an X.509 certificate, manageable via Certificate Manager
- Alternatively, you can use your own certificates
For an HTTPS listener:
- Specify a default certificate
- Add an optional list of certificates to support multiple domains
- Clients can use SNI (Server Name Indication) to specify the hostname they want to reach
- Ability to specify a security policy to support older versions of SSL/TLS

SNI solves the issue of loading multiple SSL certificates onto one web server to serve multiple websites
Requires the client to indicate the hostname of the target server in the initial SSL handshake
- Server will then find the correct certificate or return the default one
Only works for CloudFront, ALB and NLB
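
The certificate lookup described above can be sketched as a dictionary lookup with a fallback. The function name, hostnames and certificate labels are placeholders; a real server also handles wildcard names, which this sketch leaves out.

```python
def select_certificate(server_name, certificates, default_certificate):
    """Pick the certificate for the hostname sent in the TLS handshake.

    server_name: hostname from the client's SNI extension, or None if the
    client did not send one.
    certificates: dict mapping lower-case hostnames to certificates.
    """
    if server_name is None:
        # Clients without SNI always get the default certificate.
        return default_certificate
    # Hostnames are case-insensitive, so normalise before the lookup.
    return certificates.get(server_name.lower(), default_certificate)
```

For example, with certificates loaded for `www.example.com` and `shop.example.com`, a handshake naming `shop.example.com` gets that site's certificate, and any other name falls back to the default.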

CLB can only support one SSL certificate
- Must use multiple CLB for multiple hostnames with multiple SSL certificates
ALB and NLB support multiple listeners with multiple SSL certificates, using SNI to make it work

Feature naming:
- Connection Draining for CLB, Deregistration Delay for ALB and NLB
The time to complete in-flight requests while the instance is de-registering or unhealthy
- Between 1 and 3600 seconds (default of 300 seconds)
- Can be disabled by setting value to 0
- Set a low value for short requests
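
A toy model of the delay, to show why short requests suit a low value. The function name is hypothetical and each in-flight request is reduced to the number of seconds it still needs; the actual connection handling in a load balancer is more involved.

```python
def drain(in_flight_seconds, deregistration_delay=300):
    """Split in-flight requests into those that finish and those cut off.

    in_flight_seconds: remaining duration, in seconds, of each request
    still running when the instance starts de-registering.
    """
    if not 0 <= deregistration_delay <= 3600:
        raise ValueError("delay must be between 0 and 3600 seconds")
    if deregistration_delay == 0:
        # A value of 0 disables draining: nothing is given time to finish.
        return [], list(in_flight_seconds)
    completed = [s for s in in_flight_seconds if s <= deregistration_delay]
    dropped = [s for s in in_flight_seconds if s > deregistration_delay]
    return completed, dropped
```

If requests normally finish within a few seconds, a delay of around 30 seconds lets them all complete while the instance leaves service much sooner than with the default of 300.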

Scale in/out EC2 instances to match the required load
Ensure minimum/maximum number of EC2 instances running
Automatically register new instances to a load balancer
Re-create EC2 instance if a previous one is terminated (could be due to unhealthiness)
ASGs are free; you only pay for the underlying instances
Attributes:
- Launch Template configurations:
  - AMI, Instance Type
  - EC2 User Data
  - EBS volumes
  - Security groups
  - SSH Key Pair
  - IAM Roles for instances
  - Network and subnet information
  - Load balancer information
- Minimum, maximum and initial capacity
- Scaling policies
Use CloudWatch to scale ASG by monitoring metrics such as Average CPU or a custom metric
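
A simplified sketch of how the capacity settings interact: the desired capacity is clamped between the minimum and maximum, and any gap to the number of healthy instances is closed by launching or terminating. The function is illustrative only and does not call any AWS API.

```python
def reconcile(healthy_instances, desired, minimum, maximum):
    """Return how many instances to launch (positive) or terminate (negative)."""
    if minimum > maximum:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    # The effective target never leaves the [minimum, maximum] range.
    target = max(minimum, min(desired, maximum))
    return target - healthy_instances
```

For instance, with a desired capacity of 3 and only 2 healthy instances (one was terminated as unhealthy), the result is 1, meaning one replacement instance is launched.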

Dynamic scaling
- Target tracking
  - Simple setup
  - Ex. want the ASG CPU around 40%
- Simple / step scaling
  - CloudWatch alarm triggered, CPU > 70%, add 2 units
  - CloudWatch alarm triggered, CPU < 30%, remove 1 unit
Scheduled scaling
- Anticipate scaling based on usage patterns
Predictive scaling
- Continuously forecast load and schedule scaling ahead
Good metrics:
- CPUUtilization: Average CPU utilization across instances
- RequestCountPerTarget: Ensure number of requests per EC2 instance is stable
- Average Network In / Out
- Custom metric pushed using CloudWatch
Scaling cooldown period is 300 seconds by default
- Will not launch or terminate instances during this period
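
The step scaling example and the cooldown above can be combined in a short simulation. It reuses the thresholds from the notes (CPU > 70 adds 2 units, CPU < 30 removes 1) and the 300-second default cooldown; the class name and the idea of passing the clock in as `now` are assumptions made to keep the sketch testable.

```python
class StepScaler:
    """Toy step-scaling policy with a cooldown and min/max bounds."""

    def __init__(self, minimum, maximum, cooldown_seconds=300):
        self.minimum = minimum
        self.maximum = maximum
        self.cooldown = cooldown_seconds
        self._last_scaled_at = None

    def decide(self, capacity, average_cpu, now):
        """Return the new capacity for the given average CPU percentage."""
        in_cooldown = (
            self._last_scaled_at is not None
            and now - self._last_scaled_at < self.cooldown
        )
        if in_cooldown:
            # No launches or terminations while the cooldown is running.
            return capacity
        if average_cpu > 70:
            proposed = capacity + 2
        elif average_cpu < 30:
            proposed = capacity - 1
        else:
            return capacity
        # Never scale past the configured minimum or maximum.
        new_capacity = max(self.minimum, min(proposed, self.maximum))
        if new_capacity != capacity:
            self._last_scaled_at = now
        return new_capacity
```

Starting from 2 instances with bounds of 1 and 6, a CPU reading of 85% at time 0 raises capacity to 4, and the same reading 120 seconds later changes nothing because the cooldown has not elapsed.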