- EC2 - SAA Level
- EC2 Instance Storage
- High Availability and Scalability: ELB & ASG
- RDS, Aurora, ElastiCache
- Route 53
- Amazon S3 Introduction
- Amazon S3 Advanced
- S3 Security
EC2 - SAA Level
- Elastic IPs - A fixed public IPv4 for your instance
- Owned by you until you delete it
- Can attach to one instance at a time
- Can be used to mask the failure of an instance or software by rapidly remapping the address to another instance in your account
- By default, you can only have 5 Elastic IPs per account (you can ask AWS to increase this limit)
- Try to avoid using Elastic IP if you can
- They often reflect poor architectural decisions
- Instead use a random public IP and register a DNS name to it
- By default, your EC2 instance comes with
- A private IP (for the internal AWS network)
- A public IP (for the web and SSH. Can change if instance is stopped and started)
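Since the default public IP changes on stop/start, an Elastic IP keeps the address fixed. A minimal CLI sketch (all IDs are placeholders):

```bash
# Allocate an Elastic IP in your account (returns an AllocationId)
aws ec2 allocate-address --domain vpc

# Associate it with an instance; re-running this against another
# instance remaps the address (the rapid-failover trick above)
aws ec2 associate-address \
  --instance-id i-0123456789abcdef0 \
  --allocation-id eipalloc-0123456789abcdef0
```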
- Placement Groups define your EC2 instance placement strategy, which will be one of the following:
| Placement Group | Description | Pros | Cons | Use Cases |
|---|---|---|---|---|
| Cluster | Clusters instances into a low-latency group in a single AZ | Great network (10 Gbps bandwidth between instances with Enhanced Networking enabled) | If the AZ fails, all instances fail at the same time | Big Data jobs that need to complete fast; applications that need extremely low latency and high network throughput |
| Spread | Spreads instances across underlying hardware | Can span across AZs; reduced risk of simultaneous failure since instances are on distinct physical hardware | Limited to 7 instances per AZ per placement group | Applications that need to maximize high availability; critical applications where instances must be isolated from each other's failures |
| Partition | Spreads instances across many different partitions (on different sets of racks) within an AZ. Can span across multiple AZs in the same region. | Up to 7 partitions per AZ; scales to 100s of instances; instances in one partition don't share racks with other partitions | A partition failure can still affect many EC2 instances | Partition-aware distributed systems such as HDFS, HBase, Cassandra, Kafka |
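Creating a placement group and launching into it is one CLI call each (names, AMI and IDs are placeholders):

```bash
# Create a cluster placement group (strategy can be cluster, spread or partition)
aws ec2 create-placement-group --group-name my-cluster-pg --strategy cluster

# Launch an instance into it
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type c5.large \
  --placement GroupName=my-cluster-pg
```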
- Elastic Network Interfaces (ENI) are logical components in VPCs that represent virtual network cards.
- Can have the following attributes:
- Primary private IPv4, one or more secondary IPv4
- One Elastic IP (IPv4) per private IPv4
- One Public IPv4
- One or more security groups
- A MAC address
- You can create ENIs independently and attach them (or move them) on the fly to EC2 instances for failover
- Bound to a specific AZ
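A minimal sketch of the create-and-attach-on-the-fly flow (subnet, SG and instance IDs are placeholders):

```bash
# Create an ENI in a given subnet with a security group
aws ec2 create-network-interface \
  --subnet-id subnet-0123456789abcdef0 \
  --groups sg-0123456789abcdef0 \
  --description "failover ENI"

# Attach it to an instance as a secondary interface (device index 1);
# detach and re-attach to another instance in the same AZ for failover
aws ec2 attach-network-interface \
  --network-interface-id eni-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 \
  --device-index 1
```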
- EC2 Hibernate is an alternative to stopping and terminating for EC2 instances where:
- Volatile memory is preserved (RAM state is written to a file in the root EBS volume)
- The instance boot is much faster as the OS isn't stopped/restarted
- The root EBS must be encrypted
  - Supported when:
    - The EC2 instance type supports hibernation (list the supported types with `aws ec2 describe-instance-types --filters Name=hibernation-supported,Values=true --query "InstanceTypes[*].[InstanceType]" --output text | sort`)
    - The AMI supports hibernation
    - Instance RAM size is less than 150GB
    - The instance is not bare metal
    - The root volume is a large encrypted EBS volume
    - The instance is On-Demand, Reserved or Spot
- Instances cannot be hibernated for more than 60 days
- EC2 Hibernate use cases:
- Long-running processing
- Saving the RAM state
- Services that take time to initialize
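A minimal sketch of enabling and triggering hibernation with the CLI (AMI/instance IDs are placeholders; the root device name depends on the AMI):

```bash
# Hibernation must be enabled at launch, with an encrypted root EBS volume
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type m5.large \
  --hibernation-options Configured=true \
  --block-device-mappings '[{"DeviceName":"/dev/xvda","Ebs":{"Encrypted":true,"VolumeSize":50}}]'

# Later: hibernate instead of a normal stop (RAM is written to the root volume)
aws ec2 stop-instances --instance-ids i-0123456789abcdef0 --hibernate
```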
EC2 Instance Storage
EBS
- An Elastic Block Store (EBS) volume is a network drive you can attach to instances as they run, allowing your instances to persist data even after termination
- Uses the network to communicate to the instance, meaning reads/writes have some latency
- Can be detached from an instance and attached to another quickly
- They can only be mounted to one instance at a time (at the CCP level)
- They are bound to a specific AZ. To move a volume across AZs, you need to snapshot it
- Free tier: 30GB of free EBS storage of type General Purpose (SSD) or Magnetic per month
- They have a provisioned capacity (size in GBs and IOPS) which can be changed. You get billed for all provisioned capacity.
- EBS Volumes have a Delete on Termination attribute that determines if the volume should be deleted if the EC2 instance it is attached to is deleted
- By default, the root EBS volume will be deleted
- By default, any other attached EBS volume is not deleted
- EBS Snapshots allow you to make a backup of your EBS volume at a given time.
- Snapshots can be copied across AZs and Regions
- It's not necessary to detach the volume to snapshot, but it's recommended
- Features:
  - EBS Snapshot Archive - Move a snapshot to an archive tier that is 75% cheaper. Restoring from the archive takes 24-72 hours
- Recycle Bin for EBS Snapshots - Allows you to setup rules to retain deleted snapshots so you can recover them after an accidental deletion. You may specify a retention period (from 1 day to 1 year)
  - Fast Snapshot Restore (FSR) - Forces a full initialization of the snapshot so there is no latency on first use (for a fee)
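A sketch of the snapshot and archive-tier calls (volume/snapshot IDs are placeholders):

```bash
# Take a snapshot of a volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "nightly backup"

# Move an existing snapshot to the archive tier (75% cheaper)
aws ec2 modify-snapshot-tier --snapshot-id snap-0123456789abcdef0 --storage-tier archive

# Restore it to the standard tier before use (takes 24-72 hours)
aws ec2 restore-snapshot-tier --snapshot-id snap-0123456789abcdef0 --permanent-restore
```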
- Amazon Machine Images (AMIs) are customizations of EC2 instances, allowing you to add your own software, configuration, OS, monitoring, etc.
- AMIs have a faster boot/configuration time because all your software is pre-packaged
- AMIs are built for a specific region and can be copied across regions
- You can launch EC2 instances from
- A public AMI (AWS provided)
- Your own AMI
- An AWS Marketplace AMI (an AMI provided/sold by an AWS user)
- Process: 1. Start an EC2 instance and customize it 2. Stop the instance (for data integrity) 3. Build an AMI from it (this also creates EBS snapshots) 4. Launch instances from the AMI
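Steps 3 and 4 of the process above map to these CLI calls (IDs, names and regions are placeholders):

```bash
# Build an AMI from a (stopped) instance; this also creates EBS snapshots
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "my-golden-ami"

# Copy the AMI to another region before launching there
aws ec2 copy-image \
  --source-image-id ami-0123456789abcdef0 \
  --source-region eu-west-1 \
  --region us-east-1 \
  --name "my-golden-ami"
```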
EC2 Instance Store
- EC2 Instance Store volumes are high-performance hardware disks that:
- Offer better I/O performance than EBS volumes
- Lose their storage if the instance is stopped (ephemeral storage)
- Work well as a buffer/cache for scratch data or temporary content
- Risk data loss if hardware fails
- You are responsible for backing up and replicating
- EBS volumes, by contrast, come in 6 types (gp2/gp3 General Purpose SSD, io1/io2 Provisioned IOPS SSD, st1/sc1 HDD). The SSD types compare as follows:

| | gp3 | gp2 | io2 Block Express | io1 |
|---|---|---|---|---|
| Durability | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | 99.999% durability (0.001% annual failure rate) | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) |
| Use cases | Transactional workloads, boot volumes, dev/test environments | Transactional workloads, boot volumes, dev/test environments | Workloads that require sub-millisecond latency and sustained IOPS performance | Workloads that require sustained IOPS performance or more than 16,000 IOPS |
| Volume size | 1 GiB - 16 TiB | 1 GiB - 16 TiB | 4 GiB - 64 TiB | 4 GiB - 16 TiB |
| Max IOPS per volume | 16,000 (64 KiB I/O) | 16,000 (16 KiB I/O) | 256,000 (16 KiB I/O) | 64,000 (16 KiB I/O) |
| Max throughput per volume | 1,000 MiB/s | 250 MiB/s | 4,000 MiB/s | 1,000 MiB/s |
| Amazon EBS Multi-attach | Not supported | Not supported | Supported | Supported |
| NVMe reservations | Not supported | Not supported | Supported (up to 16 instances at once) | Not supported |
| Boot volume | Supported | Supported | Supported | Supported |
- When you encrypt an EBS volume:
- Data at rest is encrypted inside the volume
- All the data in flight moving between the instance and the volume is encrypted
- All snapshots are encrypted
- All volumes created from the snapshot are encrypted
- Encryption and decryption are handled transparently and have a minimal impact on latency
- EBS Encryption leverages keys from KMS (AES-256)
- You can encrypt an unencrypted EBS volume:
- Create an EBS snapshot of the volume
- Encrypt the EBS snapshot using copy
- Create a new EBS volume from the snapshot
- Attach the encrypted volume to the original instance
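The four steps above map to the following CLI calls (IDs, regions and device name are placeholders):

```bash
# 1. Snapshot the unencrypted volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0

# 2. Copy the snapshot with encryption enabled (uses the default KMS key
#    unless --kms-key-id is given)
aws ec2 copy-snapshot \
  --source-snapshot-id snap-0123456789abcdef0 \
  --source-region eu-west-1 \
  --encrypted

# 3. Create a new volume from the encrypted snapshot (same AZ as the instance)
aws ec2 create-volume \
  --snapshot-id snap-0fedcba9876543210 \
  --availability-zone eu-west-1a

# 4. Attach the encrypted volume to the original instance
aws ec2 attach-volume \
  --volume-id vol-0fedcba9876543210 \
  --instance-id i-0123456789abcdef0 \
  --device /dev/sdf
```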
EFS
- Elastic File System (EFS) is a managed NFS (network file system) that can be mounted on many EC2 instances
- Works well with EC2 instances in multi-AZ environment
- Highly available, scalable, expensive, pay per use
- Use cases - Content management, web serving, data sharing, WordPress
- Uses NFSv4.1 protocol
- Uses security group to control access to EFS
- Compatible with Linux-based AMI
- Encryption at rest using KMS
- POSIX file system that has a standard file API
- File system scales automatically
| Storage class | Designed for | First byte read latency | Durability (designed for) | Availability SLA | Availability Zones | Minimum billing charge per file | Minimum storage duration |
|---|---|---|---|---|---|---|---|
| EFS Standard | Active data requiring fast sub-millisecond latency performance | Sub-millisecond | 99.999999999% (11 9's) | 99.99% (Regional), 99.9% (One Zone) | ≥3 (Regional), 1 (One Zone) | Not applicable | Not applicable |
| EFS Infrequent Access | Inactive data that is accessed only a few times each quarter | Tens of milliseconds | 99.999999999% (11 9's) | 99.99% (Regional), 99.9% (One Zone) | ≥3 (Regional), 1 (One Zone) | 128 KiB | Not applicable |
| EFS Archive | Inactive data that is accessed a few times each year or less | Tens of milliseconds | 99.999999999% (11 9's) | 99.9% (Regional) | ≥3 (Regional) | 128 KiB | 90 days |
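Mounting from an EC2 instance looks like this (file-system ID and region are placeholders; the instance's security group must allow NFS, port 2049):

```bash
# Plain NFSv4.1 mount
sudo mount -t nfs4 -o nfsvers=4.1 \
  fs-0123456789abcdef0.efs.eu-west-1.amazonaws.com:/ /mnt/efs

# Or, with the amazon-efs-utils helper installed (adds TLS support)
sudo mount -t efs -o tls fs-0123456789abcdef0:/ /mnt/efs
```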
High Availability and Scalability: ELB & ASG
- Vertical Scalability means increasing the size of the instance (eg. scaling an instance from a `t2.micro` to a `t2.large` to meet demand)
  - Very common for non-distributed systems, such as a database
  - RDS and ElastiCache are services that can scale vertically
  - There's usually a limit to how much you can vertically scale
- Horizontal Scalability means increasing the number of instances / systems for your application
- Implies distributed systems
- Very common for web applications / modern applications
- EC2 instances can be horizontally scaled using ASG and/or ELB
- High Availability - Your application can survive a data center loss
- Can be passive (eg. RDS Multi-AZ)
- Can be active (eg. horizontal scaling)
- Can be achieved with a multi-AZ ASG and/or a multi-AZ ELB
Elastic Load Balancer
- Load Balancers are servers that forward traffic to multiple servers downstream
- Why use a load balancer?
- Spread load across multiple downstream instances
- Expose a single point of access (DNS) to your application
- Seamlessly handle failures of downstream instances
- Do regular health checks to your instances
- Provide SSL termination (decrypting encrypted traffic before passing it along to a web server; HTTPS) for your websites
- Enforce stickiness (session persistence) with cookies
- High availability across zones
- Separate public traffic from private traffic
- An Elastic Load Balancer is a managed load balancer integrated with many AWS offerings/services, including:
- EC2, EC2 ASGs, ECS
- AWS Certificate Manager (ACM), CloudWatch
- Route 53, AWS WAF, AWS Global Accelerator
- Health Checks are a crucial process for load balancers: they let the load balancer know whether the instances it forwards traffic to are available to reply to requests
  - Health checks are done on a port and a route (usually `/health`)
Elastic Load Balancer Types
- Classic Load Balancers (v1)
- Supports TCP (layer 4), HTTP & HTTPS (layer 7)
- Health checks are TCP or HTTP
- Fixed hostname
- Application Load Balancer (v2)
- Supports layer 7 (HTTP/HTTPS)
  - Can load balance to multiple HTTP applications across machines (target groups) based on:
    - Path in URL (eg. `example.com/users` or `example.com/posts`)
    - Hostname in URL (eg. `one.example.com` or `other.example.com`)
    - Query String, Headers (eg. `example.com/users?id=123&order=false`)
- Can load balance to multiple applications on the same machine (eg. containers)
- Supports HTTP/2 and WebSocket
- Supports redirects (eg. from HTTP to HTTPS)
- Great fit for micro-services & container-based applications
- Has a port mapping feature to redirect to a dynamic port in ECS
- Target Groups can include:
- EC2 instances
- ECS tasks
- Lambda functions (HTTP requests are translated into JSON events)
- IP addresses (must be private IPs)
- Health checks are performed at the target group level
- Fixed hostname
  - The application servers don't see the user's IP, port or protocol directly, so these are inserted into the headers `X-Forwarded-For`, `X-Forwarded-Port` and `X-Forwarded-Proto` respectively.
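As a sketch, the path-based routing described above is configured as a listener rule (ARNs are placeholder stubs):

```bash
# Forward /users/* traffic to a dedicated target group
aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:eu-west-1:123456789012:listener/app/my-alb/abc/def \
  --priority 10 \
  --conditions Field=path-pattern,Values='/users/*' \
  --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/users-tg/abc
```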
- Network Load Balancer (v2)
  - Supports layer 4 (TCP & UDP)
  - Handles millions of requests per second with lower latency than ALB (~100ms vs ~400ms)
  - Has one static IP per AZ and supports assigning an Elastic IP (helpful for IP whitelisting)
  - NLBs are used for extreme performance with TCP or UDP traffic
- Not included in the AWS free tier
- Target Groups can include:
- EC2 instances
- IP Addresses (must be private IPs)
- Application Load Balancers
- Health Checks support the TCP, HTTP and HTTPS protocols
- Gateway Load Balancer
- Operates at layer 3 (Network Layer (IP Packets))
- Use to deploy, scale and manage a fleet of 3rd party network virtual appliances such as:
- Firewall
  - Intrusion Detection and Prevention System
- Deep Packet Inspection System
- Payload manipulator
- Works as both a Load Balancer and a Transparent Network Gateway (single entry/exit point for all traffic)
- Target Groups can include:
- EC2 instances
- IP Addresses (must be private IPs)
Sticky Sessions
- Application-based Cookies are either:
  - Custom cookies, which are generated by the target and can include any custom attributes required by the application. Cookie names must be specified individually for each target group
  - Application cookies, which are generated by the load balancer and are named `AWSALBAPP`
- Duration-based Cookies are generated by the load balancer, with the names `AWSALB` for ALB, or `AWSELB` for CLB
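A sketch of enabling duration-based stickiness on a target group (the ARN is a placeholder):

```bash
# Enable the AWSALB duration-based cookie with a 1-day expiry
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/my-tg/abc \
  --attributes Key=stickiness.enabled,Value=true \
               Key=stickiness.type,Value=lb_cookie \
               Key=stickiness.lb_cookie.duration_seconds,Value=86400
```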
Cross-Zone Load Balancing
- Cross-Zone Load Balancing means that each load balancer node distributes traffic evenly across all registered instances in all AZs. If disabled, requests are distributed evenly between ELB nodes, but each node only routes to instances within its own AZ
- Enabled by default for ALBs. No charges for inter-AZ data
- Disabled by default for NLBs & GWLBs. Fees will be charged for inter-AZ data if enabled
- Disabled by default for CLBs. No charges for inter-AZ data.
Load Balancing with SSL/TLS
- SSL certificates allow traffic between clients and your load balancer to be encrypted in transit
- SSL = Secure Sockets Layer
- TLS = Transport Layer Security. Newer version of SSL.
- Nowadays TLS certificates are mainly used but SSL is still the common name
- SSL certificates are issued by Certificate Authorities (CA)
- SSL certificates have expiration dates and must be renewed
- Load Balancers use X.509 certificates
- AWS Certificate Manager - An AWS service you can use to manage certificates
- You may also upload your own certificates instead
- HTTPS listener:
- You must specify a default certificate
- You can add an optional list of certs to support multiple domains
- Clients can use SNI to specify the hostname they reach
- You can specify a security policy to support older versions of SSL/TLS for legacy clients
- Server Name Indication (SNI) - A protocol that solves the issue of loading multiple SSL certificates onto one web server so the web server can serve multiple websites.
- A client indicates the hostname of the target server in their initial SSL handshake. The server then finds the correct certificate or the default
- Only works for ALB v2, NLB v2 and CloudFront
Connection Draining
- Known as Connection Draining for CLB or Deregistration Delay for ALB & NLB
- Refers to the time to complete "in-flight" requests while the instance is de-registering or unhealthy
- Between 1-3600 seconds (default 300)
- Can be disabled (set to 0)
- Set this to a low value if your requests are short
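For example, shortening the delay for short-lived requests (the ARN is a placeholder):

```bash
# Drop the deregistration delay from the 300s default to 30s
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/my-tg/abc \
  --attributes Key=deregistration_delay.timeout_seconds,Value=30
```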
Auto Scaling Groups
- Auto Scaling Groups allow:
  - Scaling out (adding EC2 instances) to match increased load
  - Scaling in (removing EC2 instances) to match decreased load
- Minimum/maximum limits for EC2 instance count
- Automatic registration of new instances to load balancers
- Recreation of EC2 instances in case of termination (eg. if the instance is unhealthy)
- ASG Attributes:
  - Launch Template, which includes:
- AMI + Instance Type
- EC2 User Data
- EBS Volumes
- Security Groups
- SSH Key Pair
- IAM Roles for your instances
- Network + Subnet Information
- Load Balancer information
- Min size / max size / initial capacity
- Scaling policies
- It's possible to scale an ASG based on CloudWatch alarms
  - An alarm monitors a metric such as Average CPU usage or a custom metric
  - Metrics such as Average CPU are computed across the overall ASG instances
- Can create policies to scale in/out upon trigger
- ASG Scaling Policies:
- Dynamic:
- Target tracking
- Simple/step scaling
- Scheduled
- Predictive
- Good metrics to scale on:
- CPU Utilization
- RequestCountPerTarget
- Average Network In/Out
- Any custom metric
- Scaling Cooldowns: After a scaling activity happens you enter a cooldown period (default 300 seconds). During this period, the ASG will not launch or terminate additional instances to allow for metrics to stabilize
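A sketch of a target tracking policy (the ASG name is a placeholder): keep average CPU across the ASG at ~50%.

```bash
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-asg \
  --policy-name cpu-target-50 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
    "TargetValue": 50.0
  }'
```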
RDS, Aurora, ElastiCache
RDS (Relational Database Service)
- RDS is a managed DB service for DBs that use SQL as a query language, including:
- Postgres
- MySQL
- MariaDB
- Oracle
- Microsoft SQL Server
- IBM DB2
- Amazon Aurora
- Why use RDS instead of EC2 for DB hosting? RDS is managed
- Automated provisioning, OS patching
- Continuous backups and point-in-time restore
- Monitoring dashboards
- Read replicas for improved read performance
- Multi AZ setup for disaster recovery
- Maintenance windows for upgrades
- Vertical & Horizontal scaling
- Storage backed by EBS
- However, you can't SSH into your instances
- On RDS Storage Auto Scaling:
- When RDS detects you're running out of storage, it scales automatically.
  - This will happen if free storage is less than 10% of allocated storage AND low storage lasts at least 5 minutes AND at least 6 hours have passed since the last modification
- You must set a maximum storage threshold
- RDS Read Replicas help you scale your database reads
- You can have up to 15 Read Replicas within an AZ, cross-AZ or cross-region
- Replication is async, so reads are eventually consistent
- Replicas can be promoted to their own DB
- Use cases:
- You have a production application and a reporting application that both need to read from your database at once. You can create a read replica for the reporting application to read from that takes the load off the source database.
- Things to remember:
- Read replicas are used for SELECT kinds of statements (not INSERT, UPDATE, or DELETE)
- Applications must update their connection strings to leverage read replicas
- Cross-region read replicas will incur extra network cost due to the cross-region data transmission
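Creating a replica (as in the reporting use case above) is a single call (identifiers are placeholders):

```bash
# Create a read replica of an existing RDS instance
aws rds create-db-instance-read-replica \
  --db-instance-identifier mydb-replica-1 \
  --source-db-instance-identifier mydb
```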
- RDS Multi-AZ can help you set up disaster recovery contingencies, increasing availability
- Synchronous replication
  - Database instances will be under one DNS name, giving us automatic failover in case of loss of AZ, loss of network, or instance/storage failure
- Requires no manual intervention in your app
- Not used for scaling
- How to go from a single-AZ deployment to multi-AZ (zero-downtime operation):
- Click on "modify" for the DB
- The following will happen internally:
- A snapshot is taken of the source DB instance
- A new DB is restored from the snapshot in a new AZ
- Synchronization is established between the two DBs
- RDS Custom - Managed Oracle and Microsoft SQL Server DB with OS and DB customization
- Gives us access to the underlying DB and OS so we can:
- Configure settings
- Install patches
- Enable native features
    - Access the underlying EC2 instance using SSH or SSM Session Manager
- REMEMBER: Take a DB snapshot and de-activate automation mode anytime you want to perform customization to prevent data corruption
- Backups:
- Automated:
- Full backup occurs daily during a 'backup window'
- Transaction log backups occur every 5 minutes (enables point-in-time restore)
- Retention Period = 1-35 days. Set to 0 to disable automated backups
- Manual Snapshots:
- Manually triggered by the user
- Backup is retained for as long as desired
- Restoring:
- From a backup/snapshot creates a new DB
- From S3:
- Create a backup of your on-prem DB and store it on S3
- Restore the backup file onto a new RDS instance running MySQL
- ⚠REMEMBER⚠: You still pay for storage even if your RDS DB is stopped! If you plan on stopping your DB for a long time, take a snapshot instead
- Amazon RDS Proxy
- Allows apps to pool and share DB connections established with the DB
  - Improves DB efficiency by reducing the stress on DB resources (eg. CPU, RAM) and minimizing open connections (and timeouts)
- Serverless, autoscaling, multi-AZ
- Reduced failover time by up to 66%
- Supports RDS and Aurora
- No code changes required for most apps
- Enforce IAM Authentication for DB and securely store credentials in AWS Secrets Manager
- RDS Proxy is never publicly accessible, must be accessed from the VPC
- Security Features (Shared with Amazon Aurora):
- At-rest encryption (DB master & replicas encryption using AWS KMS.)
- If the master is not encrypted, the replicas cannot be encrypted
    - To encrypt an un-encrypted DB, go through a DB snapshot & restore as encrypted
- In-flight encryption (TLS-ready by default, use the AWS TLS root certificates client-side)
- IAM Authentication (IAM roles to connect to your DB)
- Security Groups
- No SSH available (except on RDS Custom)
- Audit Logs can be enabled and sent to CloudWatch Logs for longer retention
- At-rest encryption (DB master & replicas encryption using AWS KMS.)
Amazon Aurora
- Amazon Aurora is a proprietary AWS DB optimized for the cloud that supports Postgres and MySQL
- Storage automatically grows in increments of 10GB up to 128TB
- Can have up to 15 replicas and the replication process is faster than MySQL (sub-10ms replica lag)
- Failover in Aurora is instantaneous
- More efficient than RDS but costs 20% more
- Integrates with AWS ML services like SageMaker and Comprehend
- High availability and read scaling
- Stores 6 copies of your data across 3 AZs
- 4 copies needed for writes
- 3 copies needed for reads
- Self-healing with p2p replication
- Storage is striped across 100s of volumes
- Support for cross-region replication
- Aurora as a DB Cluster
- There is a shared storage volume that automatically expands from 10GB to 128TB
- Writing to the shared storage volume is done through communicating with the Writer Endpoint, which points to the master database, which is the only entity that can write to shared storage.
- If the master fails, automated failover will occur in less than 30 seconds
- To read, connect to the Reader Endpoint, which directs requests to the cluster's set of read-replicas (can have 1-15 at a time), which all read from the shared storage volume
- Features of Aurora
- Automatic failover
- Backup and Recovery
- Isolation and security
- Industry compliance
- Push-button scaling
- Automated Patching with Zero Downtime
- Advanced Monitoring
- Routine Maintenance
- Point-in-time restore via the Backtrack feature
- Defining Custom Endpoints
- If your application involves many different types of workloads and you want to segregate them (say you want to run analytical queries on specific, more powerful read replica instances but not others), define custom endpoints that communicate with a subset of your replicas.
- The Reader Endpoint is generally not used after defining Custom Endpoints
- Serverless Aurora
- Clients communicate with a Proxy Fleet (managed by Aurora), which redirects requests to Aurora DB instances that are instantiated and scaled automatically, which all still read from the shared storage volume
- Aurora Global Database (recommended)
- 1 Primary Region (read/write)
- Up to 5 Secondary Regions (read)
- Up to 16 Read Replicas per Secondary Regions
- Promoting a Secondary region to be a Primary has an RTO of < 1 minute
- Typical cross-region replication takes < 1 second
- Backups:
- Automated:
- Cannot be disabled
- Retained for 1-35 days
- Manual Snapshots:
- Manually triggered by the user
- Backup is retained for as long as desired
- Restoring:
- From a backup/snapshot creates a new DB
- From S3:
- Create a backup of your on-prem DB using Percona XtraBackup and store the backup file on S3
- Restore the backup file onto a new Aurora cluster running MySQL
- Aurora Database Cloning - Create a new Aurora DB Cluster from an existing one
- Faster than snapshot & restore
- Uses copy-on-write protocol
- Initially, the new DB cluster uses the same data volume as the original DB cluster. When updates are made to the new DB cluster data, then additional storage is allocated and data is copied to be separated
- Fast & cost-effective
- Useful to create a "staging" database from a "production" database without impacting the production database
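The staging-from-production scenario above is a point-in-time restore with copy-on-write (cluster identifiers are placeholders):

```bash
# Clone a production cluster into a staging cluster (copy-on-write)
aws rds restore-db-cluster-to-point-in-time \
  --source-db-cluster-identifier production \
  --db-cluster-identifier staging \
  --restore-type copy-on-write \
  --use-latest-restorable-time
```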
ElastiCache
- ElastiCache is a managed Redis/Memcached service with high performance and low latency
- Helps reduce load off of DBs for read intensive workloads
- Helps make your application stateless
- Usage will involve heavy application code changes
- How it works:
- Applications query ElastiCache
- If there is a cache hit, retrieve the value from ElastiCache.
- Otherwise there is a cache miss. Retrieve the value from your DB service
- Write the result to the cache
- Cache must have an invalidation strategy to make sure only the most current data is used

Redis vs. Memcached
- Redis
- Multi-AZ with Auto-Failover
- Read Replicas to scale reads and have high availability
- Data Durability using AOF persistence
- Backup and restore features
- Supports Sets and Sorted Sets (use case: real-time leaderboard store)
- Supports IAM Authentication and SSL in-flight encryption
- Memcached
- Multi-node for partitioning of data (sharding)
- No replication
- Non-persistent
- No backup and restore
- Multi-threaded architecture
- Patterns for ElastiCache:
- Lazy Loading: All the read data is cached, data can become stale in cache
- Write Through: Adds or updates data in the cache when written to a DB (no stale data)
- Session Store: Store temporary session data in a cache (using TTL features)
Route 53
- Terminology
- Domain Name System (DNS) - System that translates human-friendly hostnames into IP addresses
- Domain Registrar - Route 53, GoDaddy, etc
- DNS Records - A, AAAA, CNAME, NS
- Zone File - Contains DNS records
- Name Server - Resolves DNS queries
- Top Level Domain (TLD) - .com, .us, .in, .gov, .org...
- Second Level Domain (SLD) - amazon.com, google.com
- Sub Domain - www.example.com
- Fully Qualified Domain Name - api.www.example.com
- URL - https://api.www.example.com
- How DNS works (note you wouldn't do this every time, query results are cached with a TTL):
- Client reaches out to Local DNS server (assigned by your company or ISP) with the desired hostname
- The Local DNS server reaches out to the Root DNS Server (managed by ICANN) to get the IP of the TLD DNS Server
- The Local DNS server reaches out to the TLD DNS Server to get the IP of the SLD DNS Server
- The Local DNS server reaches out to the SLD DNS Server to get the IP of the web server under the hostname, which is finally returned to the client
- The client starts sending requests to the web server
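You can watch this resolution chain yourself with `dig` (example.com stands in for any hostname):

```bash
# Walk the full chain: root -> TLD -> SLD name servers -> answer
dig +trace example.com

# Ask only your local resolver (returns cached answers with remaining TTL)
dig example.com A
```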
- Amazon Route 53 - Highly available, scalable, fully managed and authoritative DNS and Domain Registrar
  - The only AWS service with a 100% availability SLA
- Route 53 Records specify how you want to route traffic for a domain. Each record contains:
- Domain/subdomain name
- Record Type
- Value eg. 12.34.56.78
- Routing Policy - How Route 53 responds to queries
  - TTL - Amount of time the record is cached at DNS Resolvers
- Route 53 Record Types:
- A - Maps a hostname to IPv4
- AAAA - Maps a hostname to IPv6
- CNAME - Maps a hostname to another hostname (only for non-root domain)
- NS - Name Servers for the Hosted Zone
  - Alias - Maps a hostname to an (automatically resolved) hostname for an AWS resource (works for root/non-root domains)
- Always of type A/AAAA.
- Can't set the TTL
- Targets: ELB, CloudFront Distributions, API Gateway, Elastic Beanstalk environments, S3 Websites, VPC Interface Endpoints, Global Accelerator accelerators, Route 53 records (in the same hosted zone)
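A sketch of writing a record via the CLI (zone ID, name and IP are placeholders):

```bash
# Upsert a simple A record in a hosted zone
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEFGHIJ \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "api.example.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{"Value": "12.34.56.78"}]
      }
    }]
  }'
```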
- Hosted Zones - Container for records that define how to route traffic to a domain and its subdomains
- Public Hosted Zones - Contains records that specify how to route on the internet (public domain names)
- Private Hosted Zones - Contains records that specify how you route traffic within one or more VPCs (private domain names)
- Records TTL
- High TTL (eg. 24hr) = Less traffic on Route 53, but possibly outdated records
- Low TTL (eg 60sec) = More traffic on Route 53 (and thus higher costs), less outdated records, makes records easier to change
- Mandatory on all DNS records aside from Alias records
- 3rd Party Registrar with Amazon Route 53
- Create a Hosted Zone in Route 53
- Update NS Records on 3rd party website to use Route 53's Name Servers
Health Checks
- Health Checks - Checks endpoint health for public resources
- Can monitor endpoints, other health checks (calculated health checks) and CloudWatch Alarms
- Integrated with CloudWatch metrics
- About 15 global health checkers will check the endpoint health:
- Healthy/Unhealthy Threshold - 3 (default)
- Interval - 30 seconds (or 10 seconds at a higher cost)
- Supported protocols: HTTP, HTTPS, TCP
- If >18% of checkers report the endpoint is healthy, Route 53 reports it as Healthy. Otherwise, it's Unhealthy.
- Can choose which locations you want Route 53 to use
- Health Checks pass only when the endpoint responds with 2xx or 3xx status codes. They can also be set up to pass/fail based on text in the first 5120 bytes of the response
- Need to configure your router/firewall to allow incoming requests from Route 53 Health Checkers
- Only works for public endpoints. For private endpoints, you can create a CloudWatch Metric and associate a CloudWatch Alarm, then create a Health Check that checks the alarm itself.
- Calculated Health Checks - Combination of results from multiple Health Checks
  - Combined with `OR`, `AND` or `NOT`
  - Can monitor up to 256 Child Health Checks
  - Specify how many of the health checks need to pass to make the parent pass
  - Use case: Perform maintenance on your website without causing all health checks to fail
Routing Policies
- Simple - Typically used to route traffic to a single resource
- Can specify multiple values in the same record
- If multiple values are returned, a random one is chosen by the client
- When Alias is enabled, specify only one AWS resource
- Cannot be associated with Health Checks
- Weighted - Control the % of requests that go to each resource
- Assign each record a relative weight (doesn't need to sum to 100)
- DNS records must have the same name & type
- Use cases - Load balancing between regions, testing new application versions
- Assign a weight of 0 to stop sending traffic to a resource
- If all records have a weight of 0, then all records will be returned equally
- Latency-based - Redirect to the resource that has the least latency
- Based on traffic between users and AWS Regions
- Geolocation - Route depending on user location (continent/country/US state)
- Should create a "Default" record in case there's no match on location
- Use cases: website localization, restrict content distribution, load balancing...
- Geoproximity - Route traffic to your resources based on the geographic location of users and resources
- Can shift more traffic to resources based on the defined bias
- To change the size of the geographic region, specify bias values (-99 to 99)
- Resources can be:
- AWS resources (specify AWS region)
    - Non-AWS resources (specify latitude and longitude)
- You must use Route 53 Traffic Flow to use this feature
- IP-based - Provide a list of CIDRs for your clients and the corresponding endpoints/locations (user-IP-to-endpoint mappings)
- Use cases - Optimize performance, reduce network costs
- Multi-Value - Use when routing traffic to multiple resources to return multiple values to clients
- Can be associated with Health Checks to return only values for healthy resources
- Up to 8 healthy records returned per query
- Not a substitute for an ELB
Amazon S3 Introduction
- Amazon S3 Use cases
- Backup and storage
- Disaster Recovery
- Archive
- Hybrid Cloud storage
- Application hosting
- Media hosting
- Data lakes & big data analytics
- Software delivery
- Static websites
- S3 allows people to store objects (files) in buckets (directories)
- Buckets must have a globally unique name across all regions and all accounts
- Buckets are defined at the region level
- S3 looks like a global service but buckets are created in a region
- S3 Objects consist of:
- A key, which is the full path, composed of a prefix and an object name
- Object values are the content of the body
- Max object size is 5TB
    - If uploading more than 5GB, you must use 'multi-part upload'
- Metadata (list of text key/value pairs - system or user metadata)
- Tags (Unicode key/value pair - up to 10) - useful for security/lifecycle
- Version ID (if versioning is enabled)
- S3 security
- User-Based
- IAM Policies - which API calls should be allowed for a specific user from IAM
- Resource-Based
    - Bucket Policies - Bucket-wide rules from the S3 console - allows cross-account access.
      - An IAM principal can access an object if (the user's IAM permissions ALLOW it OR the resource policy ALLOWS it) AND there's no explicit DENY
- Object ACL - Finer grain access control (can be disabled)
- Bucket ACL - Less commonly used (can be disabled)
- On Bucket Policies:
  - JSON based policies:
    - Specify the `Resource` (buckets and objects) the policy applies to
    - Specify the `Effect` (Allow/Deny)
    - Specify the `Action` (the API calls) the policy applies to
    - Specify the `Principal` (the account or user) to apply the policy to
  - Use S3 bucket policies to:
    - Grant public access to a bucket
    - Force objects to be encrypted at upload time
    - Grant cross-account access
- Versioning
- You can version your files in Amazon S3
- Enabled at the bucket level
- Uploading a file with the same key as an existing one will change the "version"
  - Enabling versioning is best practice as it protects against unintended deletes and makes it easy to roll back to a previous version
- NOTES:
- Any file that is not versioned prior to enabling versioning will have version "null"
- Suspending versioning does not delete the previous versions
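Versioning is a single bucket-level setting (bucket name and prefix are placeholders):

```bash
# Enable versioning on a bucket
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled

# List all versions (including delete markers) under a prefix
aws s3api list-object-versions --bucket my-bucket --prefix reports/
```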
- Replication
- You must enable versioning in source and destination buckets
- Cross-Region Replication (CRR) use case: compliance, lower latency access, replication across accounts
- Same-Region Replication (SRR) use case: log aggregation, live replication between production and test accounts
- Buckets can be in different accounts
- Copying is done asynchronously
- Must give proper IAM permissions to S3
- After you enable replication, only new objects are replicated
- You can optionally replicate existing objects using S3 Batch Replication
- For DELETE operations, you can optionally enable replication of delete markers from source to target. Deletions with a version ID are not replicated
- There is no 'chaining' of replication (ie you cannot set up bucket 1 to replicate to bucket 2 and set up bucket 2 to replicate to bucket 3 and expect objects created in bucket 1 to replicate to bucket 3)
Amazon S3 Advanced
- Moving between Storage Classes
- You can transition objects between storage classes
- Moving objects between tiers can be automated using Lifecycle Rules
- Lifecycle Rules
- Transition Actions - Configure objects to transition to another storage class (eg. move objects to Standard IA class 60 days after creation, move to Glacier for archiving after 6 months)
- Expiration Actions - Configure objects to expire (delete) after some time
- Can be used to delete old versions of files if versioning is enabled
- Can be used to delete incomplete multi-part uploads
  - Rules can be created for a certain prefix (eg. `s3://mybucket/mp3/*`)
  - Rules can be created for certain object Tags (eg. `Department: Finance`)
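A sketch of a lifecycle rule combining a transition and an expiration action (bucket, rule ID and prefix are placeholders):

```bash
# Transition objects under reports/ to Standard-IA after 60 days, expire after 365
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archive-reports",
      "Status": "Enabled",
      "Filter": {"Prefix": "reports/"},
      "Transitions": [{"Days": 60, "StorageClass": "STANDARD_IA"}],
      "Expiration": {"Days": 365}
    }]
  }'
```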
- S3 Storage Class Analysis - Helps you decide when to transition objects to the right storage class
- Recommendations for Standard and Standard IA (does NOT work for One-Zone IA or Glacier)
- Generates a report that is updated daily
- 24 to 48 hours to start seeing data analysis
- Good first step to put together / improve Lifecycle Rules
- With Requester Pays buckets, the requester pays for the cost of data egress instead of the bucket owner
- Helpful when you want to share large datasets with other accounts
- The requester must be authenticated in AWS (cannot be anonymous)
- S3 Event Notifications
- Supported event types of SQS, SNS and Lambda:
S3:ObjectCreated:Put
S3:ObjectCreated:Post
S3:ObjectCreated:Copy
S3:ObjectCreated:CompleteMultipartUpload
S3:ObjectRemoved:Delete
S3:ObjectRemoved:DeleteMarkerCreated
S3:ObjectRestore:Post
S3:ObjectRestore:Completed
S3:ObjectRestore:Delete
S3:ObjectReplication
S3:ReducedRedundancyLostObject
S3:Replication:OperationFailedReplication
S3:Replication:OperationMissedThreshold
S3:Replication:OperationReplicatedAfterThreshold
S3:Replication:OperationNotTracked
S3:LifecycleExpiration:Delete
S3:LifecycleExpiration:DeleteMarkerCreated
S3:IntelligentTiering
S3:ObjectTagging:Put
S3:ObjectTagging:Delete
S3:ObjectAcl:Put
- Object name filtering possible (eg. *.jpg)
- REMEMBER: S3 event notifications typically deliver events in seconds but can sometimes take over a minute
- S3 bucket must have permission to perform actions with the desired service
    - SNS: `SNS:Publish`
    - SQS: `SQS:SendMessage`
    - Lambda: `lambda:InvokeFunction`
- Connecting with Amazon EventBridge:
- Advanced filtering options with JSON rules (metadata, object size, name...)
- Multiple Destinations - Can bridge to over 18 AWS services, including Step Functions, Kinesis Streams/Firehose...
- EventBridge Capabilities - Archive, Replay Events, Reliable Delivery
- S3 Select & Glacier Select
- Retrieve less data using SQL by performing server-side filtering
- Can filter by rows & columns (simple SQL statements)
- Less network transfer, less CPU cost client-side
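A sketch of server-side filtering on a CSV object (bucket, key and column name are placeholders):

```bash
# Filter rows server-side and write only the matches locally
aws s3api select-object-content \
  --bucket my-bucket \
  --key data/orders.csv \
  --expression "SELECT * FROM s3object s WHERE s.\"status\" = 'SHIPPED'" \
  --expression-type SQL \
  --input-serialization '{"CSV": {"FileHeaderInfo": "USE"}}' \
  --output-serialization '{"CSV": {}}' \
  output.csv
```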
- S3 Batch Operations
- Perform bulk operations on existing S3 Objects with a single request, such as:
- Modifying object metadata & properties
- Copy objects between S3 buckets
- Encrypt un-encrypted objects
- Modify ACLs, tags
- Restore objects from S3 Glacier
- Invoke Lambda function to perform custom action on each object
- A job consists of a list of objects, the action to perform and optional parameters
- S3 Batch Operations manages, retries, tracks progress, sends completion notifications, generates reports, etc...
- You can use S3 Inventory to get object list and use S3 Select to filter your objects
Performance
- Baseline Performance
- S3 automatically scales to high request rates with a latency of 100-200ms
- Your application can achieve at least 3500 PUT/COPY/POST/DELETE or 5500 GET/HEAD requests per second per prefix in a bucket
- There are no limits to the number of prefixes in a bucket
- Multi-part Uploads:
- Recommended for files > 100MB, must use for files > 5GB
- Can help parallelize uploads
- S3 Transfer Acceleration
  - Increases transfer speed by transferring the file to an AWS edge location, which forwards the data to the S3 bucket in the target region
- Compatible with multi-part upload
- S3 Byte-Range Fetches
- Parallelize GETs by requesting specific byte ranges
- Better resilience in case of failures
- Can be used to speed up downloads
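For example, a single byte-range fetch (bucket/key are placeholders); issuing several such ranges in parallel is the parallelized-GET pattern:

```bash
# Fetch only the first megabyte of an object (eg. to read a file header)
aws s3api get-object \
  --bucket my-bucket \
  --key big-dataset.bin \
  --range bytes=0-1048575 \
  first-mb.bin
```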
Storage Lens
- Storage Lens can help you:
- Understand, analyze, optimize storage across an entire AWS Organization
- Discover anomalies, identify cost efficiencies, apply data protection best practices across entire AWS Organization
- Aggregate data for Organization, specific accounts, regions, buckets or prefixes and view via a customizable dashboard
- Storage Lens can be configured to export metrics daily to an S3 bucket (CSV, Parquet)
- Storage Lens Default Dashboard:
- Visualize summarized insights and trends for both free and advanced metrics
- Shows multi-region and multi-account data
- Pre-configured by Amazon S3
- Storage Lens Metrics
- Free Metrics:
- Summary - General insights about your S3 storage
- StorageBytes, ObjectCount...
      - Use cases: identify the fastest-growing or unused buckets and prefixes
- Cost-Optimization - Insights about how to manage and optimize your storage costs
- NonCurrentVersionStorageBytes, IncompleteMultipartUploadStorageBytes...
      - Use cases: identify buckets with incomplete multipart uploads older than X days, identify which objects could be transitioned to lower-cost storage class
- Data-Protection - Insights for data protection features
- VersioningEnabledBucketCount, MFADeleteEnabledBucketCount, SSEKMSEnabledBucketCount, CrossRegionReplicationRuleCount
- Use cases: Identify buckets that don't follow data-protection best practices
- Access-Management - Insights for S3 Object Ownership
- ObjectOwnershipBucketOwnerEnforcedBucketCount...
- Use cases: identify which Object Ownership settings your buckets use
- Event - Insights for S3 Event Notifications
- EventNotificationEnabledBucketCount (identify which buckets have Event Notifications configured)
- Performance - Provide insights for S3 Transfer Acceleration
- TransferAccelerationEnabledBucketCount (identify which buckets have Transfer Acceleration enabled)
- And more! (about 28 free insights in total)
- Data is available for queries for 14 days
- Advanced Metrics:
- Activity - Provide insights about how your storage is requested
- AllRequests, GetRequests, PutRequests, ListRequests, BytesDownloaded...
- Status Code - Provide insights for HTTP status codes
- 200OKStatusCount, 403ForbiddenErrorCount, 404NotFoundErrorCount...
- Advanced Cost Optimization
- Advanced Data Protection
- Selecting Advanced metrics and recommendations for Storage Lens will grant you access to Advanced metrics as well as:
- CloudWatch Publishing, allowing you to access these metrics in CloudWatch without additional charges
- Prefix Aggregation - Collect metrics at the prefix level
- Making data available for queries for 15 months
S3 Security
S3 Encryption
- There are 4 methods to encrypt S3 objects:
- Server-Side Encryption (SSE) methods:
- SSE with Amazon S3-Managed Keys (SSE-S3) - Enabled by default for new buckets and new objects
- Encryption type is AES-256
      - Must set header `"x-amz-server-side-encryption": "AES256"` when uploading
- SSE with KMS Keys stored in AWS KMS (SSE-KMS) - Leverage KMS to gain more control over encryption process and to audit key usage using CloudTrail
      - Must set header `"x-amz-server-side-encryption": "aws:kms"` when uploading
      - ⚠REMEMBER⚠: You may be impacted by the KMS limits
        - When you upload, it calls the GenerateDataKey KMS API; when you download, it calls the Decrypt KMS API
        - Both calls count toward the KMS quota per second (5500, 10000, or 30000 req/s depending on your region)
        - You may request a quota increase using the Service Quotas console
- SSE with Customer-Provided Keys (SSE-C) - For when you want to manage your own encryption keys
- HTTPS must be used
- Encryption key must be provided in request headers for every request made
- Client-Side Encryption
- Use client libraries such as Amazon S3 Client-Side Encryption Library
- Clients must encrypt data themselves before sending to Amazon S3
- Customer fully manages the keys and encryption cycle
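For example, requesting a specific SSE mode at upload time; `aws s3 cp` sets the corresponding `x-amz-server-side-encryption` header for you (bucket name and KMS key ID are placeholders):

```bash
# SSE-S3 (the default for new objects anyway)
aws s3 cp report.csv s3://my-bucket/report.csv --sse AES256

# SSE-KMS with a specific key
aws s3 cp report.csv s3://my-bucket/report.csv \
  --sse aws:kms --sse-kms-key-id 1234abcd-12ab-34cd-56ef-1234567890ab
```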
- Encryption in Transit (SSL/TLS)
- Amazon S3 exposes two endpoints: HTTP and HTTPS
- You can force encryption in transit with bucket policies:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}
```
- You can also force certain server-side encryption types using bucket policies:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms" // refuse PUT calls that don't use SSE-KMS
        },
        "Null": {
          "s3:x-amz-server-side-encryption-customer-algorithm": "true" // refuse PUT calls that don't use SSE-C
        }
      }
    }
  ]
}
```
- 🛈Note: Bucket Policies are evaluated before "Default Encryption"
S3 CORS
- Cross-Origin Resource Sharing (CORS):
  - Origin = scheme (protocol) + host (domain) + port
  - A web-browser-based mechanism to allow requests to other origins while visiting the main origin
  - Cross-origin requests won't be fulfilled unless the other origin allows them via CORS headers (eg. `Access-Control-Allow-Origin`)
  - Example:
    - http://example.com/app1 & http://example.com/app2 have the same origin
    - http://www.example.com & http://other.example.com have different origins
  - If a client makes a cross-origin request to our S3 bucket, we need to enable the correct CORS headers
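A sketch of a bucket CORS configuration allowing GETs from one web origin (bucket and origin are placeholders):

```bash
aws s3api put-bucket-cors \
  --bucket my-bucket \
  --cors-configuration '{
    "CORSRules": [{
      "AllowedOrigins": ["http://www.example.com"],
      "AllowedMethods": ["GET"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3000
    }]
  }'
```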
S3 MFA
- If you enable MFA Delete on a bucket, MFA will be required to:
- Permanently delete an object version
- Suspend versioning on the bucket
- MFA will not be required to:
- Enable Versioning
- List deleted versions
- To use MFA Delete, Versioning must be enabled on the bucket
- Only the bucket owner (root account) can enable/disable MFA Delete
S3 Access Logs
- For audit purposes, you may want to log all access to S3 buckets.
- Any request made to S3, from any account (authorized or not), will be logged into another S3 bucket
- You can feed S3 log data into data analysis tools
- The target logging bucket must be in the same AWS region
- ⚠REMEMBER⚠: Don't set your logging bucket to be the monitored bucket! It will create a logging loop and your bucket will grow exponentially!
S3 Pre-signed URLS
- You can generate pre-signed URLs using the S3 console, AWS CLI or SDK
- URL Expiration:
  - via the S3 console - 1 minute up to 720 minutes (12 hours)
  - via the AWS CLI - configure expiration with the `--expires-in` parameter (in seconds). Default = 3600s, max = 604800s (~168 hours)
- Users given a pre-signed URL inherit the permissions of the user that generated the URL for GET/PUT
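For example (bucket/key are placeholders):

```bash
# Generate a GET URL valid for 5 minutes, signed with the caller's credentials
aws s3 presign s3://my-bucket/private-report.csv --expires-in 300
```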
S3 Locks
- S3 Glacier Vault Lock
- Adopt a WORM (Write Once Read Many) model
- Create a Vault Lock Policy
- Lock the policy for future edits (can no longer be changed or deleted)
- Helpful for compliance and data retention
- S3 Object Lock
- Adopt a WORM (Write Once Read Many) model
- Block an object version deletion for a specified amount of time
- Retention Modes:
- Compliance - Object versions can't be overwritten or deleted by any user (root included), object retention modes can't be changed and retention periods can't be shortened
- Governance - Only users with special permissions can overwrite or delete an object version or alter its lock settings.
- Retention Period - Protect the object for a fixed period, it can be extended
- Legal Hold - Protect the object indefinitely, independent from retention period.
    - Can be freely placed and removed using the `s3:PutObjectLegalHold` IAM permission
Access Points
- Access Points simplify security management for S3 Buckets
- Each access point has:
- Its own DNS name (Internet Origin or VPC Origin)
- An access point policy (similar to bucket policy) - manage security at scale
  - We can define the access point to be accessible only from within the VPC
- You must create a VPC Endpoint (Gateway or Interface) to access the Access Point
- The VPC Endpoint Policy must allow access to the target bucket and Access Point
- Each access point has:
S3 Object Lambda
- Use AWS Lambda Functions to change the object before it is retrieved by the caller application
- Only one S3 bucket is needed, on top of which we create an S3 Access Point and an S3 Object Lambda Access Point
- Use Cases:
- Redacting personally identifiable information for analytics or non-prod environments
- Converting across data formats (eg converting XML to JSON)
- Resizing and watermarking images on the fly using caller-specific details such as the user who requested the object