AWS Cloud Practitioner
Table of Contents
- Section 3: What is Cloud Computing?
- Section 4: IAM - Identity and Access Management
- Section 5: EC2 - Elastic Compute Cloud
- Section 6: EC2 Instance Storage
- Section 7: ELB & ASG - Elastic Load Balancing & Auto Scaling Groups
- Section 8: Amazon S3
- Section 9: Databases & Analytics
- Section 10: Other Compute Services: ECS, Lambda, Batch, Lightsail
- Section 11: Deployments & Managing Infrastructure at Scale
- Section 12: Leveraging the AWS Global Infrastructure
- Section 13: Cloud Integrations
- Section 14: Cloud Monitoring
- Section 15: VPC & Networking
- Section 16: Security & Compliance
- Section 17: Machine Learning
- Section 18: Account Management, Billing & Support
- Section 19: Advanced Identity
- Section 20: Other Services
- Section 21: AWS Architecting, Ecosystem
Section 3: What is Cloud Computing?
Traditional IT Overview
Both clients and Servers have IP addresses. This allows for separate sources to interact with each other. The course uses the example of a post office, with the network between the client and the server acting as the post office, and the IP address being the physical address written on the letter to send.
What is in a server?
- CPU: For computation
- RAM: Memory
- Storage: Of Data
- Database: Store data in a structured way
- Network: Routers, switch, DNS server
IT Terminology
- Network: Cables, routers, and servers connected with each other
- Router: A networking device that forwards data packets between computer networks. They know where to send your packets on the internet
- Switch: Takes a packet and sends it to the correct server/client on your network
Traditional Infrastructure Building:
Startups and small businesses used to build their infrastructure completely on site, sometimes in their own house or garage. As the business grows, it typically moves the servers away from the office space to a separate location (a data center, for example).
Problems with the traditional IT approach:
- Pay for the rent for the data center
- Pay for power supply, cooling and maintenance
- Adding and replacing hardware takes time
- Scaling is limited
- Hire a 24/7 team to monitor the infrastructure
- Unsure how to deal with disasters
- The question becomes: Can we externalize this?
AWS Cloud Overview
What is Cloud Computing?
- The on-demand delivery of compute power, database storage, applications, and other IT resources
- Through cloud services, you get pay-as-you-go pricing (only pay for what you actually use)
- You can provision exactly the right type and size of computing resources you need
- Instant access to all resources you need
- Simple interface, easy to access servers, storage, databases and a set of application services
- AWS owns and maintains the hardware required for these services, whilst you provision and use what you need via a web-application
The Deployment Models of the Cloud
Private Cloud:
- Cloud services used by a single organisation, not exposed to the public
- Complete control
- Security for sensitive applications
- Meet specific business needs
Public Cloud:
- Cloud resources owned and operated by a third-party cloud service provider delivered over the internet
- Six advantages of Cloud Computing
Hybrid Cloud:
- Keep some servers on premises and extend some capabilities to the Cloud
- Control over sensitive assets in your private infrastructure
- Flexibility and cost-effectiveness of the public cloud
The Five Characteristics of Cloud Computing
- On-demand Self Service:
- Users can provision resources and use them without human interaction from the service provider
- Broad Network Access:
- Resources are available over the network, and can be accessed by diverse client platforms
- Multi-tenancy and resource pooling:
- Multiple customers can share the same infrastructure and applications with security and privacy
- The same physical resources service multiple concurrent customers
- Rapid Elasticity and scalability:
- Automatically and quickly acquire and dispose of resources as needed
- Quickly and easily scale based on demand
- Measured service:
- All usage is measured, pay for what you use, can set up budget alerts
Six Advantages of Cloud Computing:
- Trade capital expense (CAPEX) for operational expense (OPEX)
- Pay on demand: Don't own hardware
- Reduced Total Cost of Ownership (TCO) and Operational Expense (OPEX)
- Benefit from massive economies of scale
- Prices are reduced as AWS is more efficient due to large scale
- Stop guessing capacity
- Scale based on actual measured usage
- Increase speed and agility
- Stop spending money running and maintaining data centers
- Go global in minutes
- Leverage the AWS global infrastructure
What Problems Does this Solve?
- Flexibility: Change resource types when needed
- Cost-effectiveness: Pay as you go, for what you use
- Scalability: Accommodate larger loads by making hardware stronger or adding additional nodes
- Elasticity: Ability to scale out and scale-in when needed
- High-availability and Fault-tolerance: Build across data centers
- Agility: Rapidly develop, test and launch software applications
Different Types of Cloud Computing
Infrastructure as a Service (IaaS)
- Provide building blocks for cloud IT
- Provides networking, computers, data storage space
- Highest level of flexibility
- Easy parallel with traditional on-premises IT
- Example: Amazon EC2
Platform as a Service (PaaS)
- Removes the need for your organisation to manage the underlying infrastructure
- Focus on the deployment and management of your applications
- Example: Elastic Beanstalk
Software as a Service (SaaS)
- Completed product that is run and managed by the service provider
- Example: Machine Learning, Rekognition
What do you have to manage as a user
On-premises (Everything is managed by you):
- Applications
- Data
- Runtime
- Middleware
- O/S
- Virtualisation
- Servers
- Storage
- Networking
IaaS:
- Applications
- Data
- Runtime
- Middleware
- O/S
PaaS:
- Applications
- Data
SaaS:
- Everything is managed by others
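The lists above can be read as one stack that is cut at a different height for each model. A minimal sketch of that idea follows; the names `STACK`, `CUSTOMER_LAYERS` and `customer_managed` are made up for illustration, and the layer counts simply mirror the lists in these notes.

```python
# The full stack, ordered from the top (closest to the user) to the bottom.
STACK = [
    "Applications", "Data", "Runtime", "Middleware", "O/S",
    "Virtualisation", "Servers", "Storage", "Networking",
]

# How many layers (counted from the top) the customer manages per model,
# taken directly from the lists in these notes.
CUSTOMER_LAYERS = {"On-premises": 9, "IaaS": 5, "PaaS": 2, "SaaS": 0}


def customer_managed(model):
    """Return the layers the customer manages for the given model."""
    return STACK[:CUSTOMER_LAYERS[model]]


def provider_managed(model):
    """Return the layers the provider manages for the given model."""
    return STACK[CUSTOMER_LAYERS[model]:]


if __name__ == "__main__":
    for model in CUSTOMER_LAYERS:
        print(model, "->", customer_managed(model))
```

Everything not returned by `customer_managed` is handled by the provider, which is why the SaaS list is empty.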
Pricing of the Cloud - Quick Overview
- AWS has 3 pricing fundamentals, following the pay-as-you-go pricing model
- Compute
- Pay for compute time
- Storage
- Pay for data stored in the cloud
- Data transfer OUT of the cloud
- Data transfer IN is free
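As a rough illustration of the three pricing fundamentals, the sketch below adds up compute, storage and outbound transfer. The function name and all three default rates are placeholders invented for this example; they are not real AWS prices.

```python
def estimate_monthly_cost(
    compute_hours,
    stored_gb,
    transfer_out_gb,
    transfer_in_gb=0.0,
    rate_per_hour=0.01,       # placeholder rate, NOT an AWS price
    rate_per_gb_month=0.02,   # placeholder rate, NOT an AWS price
    rate_per_gb_out=0.09,     # placeholder rate, NOT an AWS price
):
    """Estimate a monthly bill from the three pricing fundamentals."""
    compute = compute_hours * rate_per_hour
    storage = stored_gb * rate_per_gb_month
    # Only data leaving the cloud is charged; transfer_in_gb is ignored on purpose.
    transfer = transfer_out_gb * rate_per_gb_out
    return round(compute + storage + transfer, 2)


if __name__ == "__main__":
    print(estimate_monthly_cost(compute_hours=100, stored_gb=50, transfer_out_gb=10))
```

Changing `transfer_in_gb` has no effect on the result, which matches the last bullet above.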
AWS Cloud Overview
AWS Cloud Use Cases
- Enables you to build sophisticated, scalable applications
- Applicable to a diverse set of industries
- Use cases include
- Enterprise IT, Backup and storage, big data analytics
- Website hosting, mobile and social apps
- Gaming
AWS Regions
- Regions all around the world
- Following the naming convention us-east-1, eu-west-3 etc.
- A region is a cluster of data centers
- Most AWS services are region-scoped
When choosing a region you need to consider: the proximity to customers (to reduce latency), the availability of services within a region, and the pricing (which can vary from region to region). There may also be compliance requirements around data governance (data may not be able to leave a region without explicit permission).
AWS Availability Zones
- Each region has availability zones (between 3 and 6 always). Examples: ap-southeast-2a, ap-southeast-2b, ap-southeast-2c
- Each availability zone (AZ) is one or more discrete data centers with redundant power, networking and connectivity
- They are all separate, and therefore isolated from disasters
- Connected with high bandwidth, ultra-low latency networking
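The naming examples above suggest that an AZ name is the region name plus a trailing letter. A small sketch of splitting one into the other, assuming that pattern holds for the names you pass in (the function name `split_az` is made up for this example):

```python
from typing import Tuple


def split_az(az_name: str) -> Tuple[str, str]:
    """Split an AZ name such as 'ap-southeast-2a' into (region, zone letter)."""
    # Assumption: the name ends with a digit (end of the region) plus one letter.
    if len(az_name) < 2 or not az_name[-1].isalpha() or not az_name[-2].isdigit():
        raise ValueError(f"unexpected AZ name: {az_name!r}")
    return az_name[:-1], az_name[-1]


if __name__ == "__main__":
    for az in ["ap-southeast-2a", "ap-southeast-2b", "ap-southeast-2c"]:
        print(split_az(az))
```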
Points of Presence (Edge Locations)
- There are 400+ Points of Presence in 90+ cities, across 40+ countries
- Content is delivered to end users with lower latency
Section 4: IAM - Identity and Access Management
IAM Introduction: Users, Groups, Policies
- IAM: Identity and Access Management: Global service
- Root account: Created by default (should not be used or shared)
- Users are people within your organisation, and can be grouped
We might have a 'developers' group, a completely separate 'operations' group, and then an 'Audit' group whose members come from the two previously mentioned groups. These groups will all have different permissions based on the level of access they require.
IAM: Permissions
- Users or groups can be assigned JSON documents, these are known as policies
- These policies define the permissions of the users
- We want to use the LPP (least privilege principle), and not give a user more permissions than they require
IAM Policies Structure
Consists of:
- Version: Policy language version, always include '2012-10-17'
- ID: An identifier for the policy (optional)
- Statement: One or more individual statements (required)
Statements consist of:
- Sid: An identifier for the statement (optional)
- Effect: Whether the statement allows or denies access (Allow/Deny)
- Principal: Account/user/role to which this policy applies
- Action: List of actions this policy allows or denies
- Resource: A list of resources to which the actions apply
- Condition: Conditions for when this policy is in effect (optional)
An Example Policy Statement:
{
"Version": "2012-10-17",
"Id": "Se-Account-Permissions",
"Statement": [
{
"Sid": "1",
"Effect": "Allow",
"Principal": {
"AWS": ["arn:aws:iam::123456789012:root"]
},
"Action": ["s3:GetObject", "s3:PutObject"],
"Resource": ["arn:aws:s3:::mybucket/*"]
}
]
}
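Because a policy is plain JSON, it can also be assembled in code. The sketch below builds a document with the same shape as the structure described above, using only the standard library; `make_statement` is a hypothetical helper and the account ID and bucket name are the illustrative values from the example.

```python
import json


def make_statement(sid, effect, actions, resources, principal=None):
    """Build one policy statement as a plain dict (hypothetical helper)."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("Effect must be 'Allow' or 'Deny'")
    statement = {
        "Sid": sid,
        "Effect": effect,
        "Action": list(actions),
        "Resource": list(resources),
    }
    # Principal is only added when given, e.g. for resource-based policies.
    if principal is not None:
        statement["Principal"] = principal
    return statement


policy = {
    "Version": "2012-10-17",
    "Id": "S3-Account-Permissions",
    "Statement": [
        make_statement(
            sid="1",
            effect="Allow",
            actions=["s3:GetObject", "s3:PutObject"],
            resources=["arn:aws:s3:::mybucket/*"],
            principal={"AWS": ["arn:aws:iam::123456789012:root"]},
        )
    ],
}

policy_json = json.dumps(policy, indent=2)

if __name__ == "__main__":
    print(policy_json)
```

Generating the JSON from a dict makes typos in quoting or brackets less likely, but it does not check that the action names or ARNs are valid.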
IAM - Password Policy
- Strong passwords obviously imply a higher account security
- AWS allows you to set up a password policy
- Minimum password length
- Require specific character types
- Upper/lowercase letters
- Numbers
- Non-alphanumeric characters
- Allow IAM users to change their passwords
- Require passwords to be changed after a period of time
- Prevent password re-use
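To make the length and character-type options concrete, here is a small local checker that mimics them. It is only an illustration of the rules listed above, not how IAM enforces them; the function name and the default minimum length of 12 are assumptions.

```python
def check_password(
    password,
    min_length=12,          # assumed default for this example
    require_upper=True,
    require_lower=True,
    require_digit=True,
    require_symbol=True,
):
    """Return a list of problems; an empty list means the password passes."""
    problems = []
    if len(password) < min_length:
        problems.append("too short")
    if require_upper and not any(c.isupper() for c in password):
        problems.append("needs an uppercase letter")
    if require_lower and not any(c.islower() for c in password):
        problems.append("needs a lowercase letter")
    if require_digit and not any(c.isdigit() for c in password):
        problems.append("needs a digit")
    # A symbol is any character that is neither a letter nor a digit.
    if require_symbol and all(c.isalnum() for c in password):
        problems.append("needs a non-alphanumeric character")
    return problems


if __name__ == "__main__":
    print(check_password("Str0ng!Passw0rd"))
    print(check_password("short"))
```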
Multi-Factor-Authentication - MFA
- Important to protect Root Accounts and IAM users
- MFA = password only you know + security device you own
- Main benefit: If a password is stolen/hacked, the account is not compromised
- You can use an authenticator on your phone, or a physical device/key which provides the authentication
How Can Users Access AWS
Three Main Options:
- The AWS management console (protected by password + MFA)
- AWS Command Line Interface (CLI): Protected by access keys
- AWS Software Development Kit (SDK) - for code: Protected by access keys
Access keys are generated through the AWS console. Users manage their own access keys, which are secret, just like a password. The access key ID ~= username, secret access key ~= password.
AWS CLI
- A tool to interact with AWS services using the command-line shell
- Direct access to public APIs of AWS services
- Scripts can be made to manage resources
- Open-source
- Alternative to using the AWS management console
AWS SDK
- AWS software development kit
- Language-specific APIs
- Enables you to access and manage AWS services programmatically
- Embedded within your application
- Supports many different programming languages, mobile SDKs and IoT Device SDKs
IAM Roles for Services
- Some AWS services will need to perform actions on your behalf
- We assign permissions to AWS services which have their own IAM roles
- Common roles:
- EC2 Instance Roles
- Lambda Function Roles
- Roles for CloudFormation
IAM Security Tools
IAM Credentials Reports (Account-level):
- A report which lists your account's users and the status of their credentials
IAM Access Advisor (User-level):
- Shows the service permissions granted to a user and when they were last accessed
- This info can be used to revise/update your policies
IAM Guidelines and Best Practices
- Don't use the root account except for AWS account setup
- One physical user = one AWS user
- Assign users to groups and assign permissions to groups
- Create a strong password policy
- Use and enforce MFA
- Create and use Roles for AWS services permissions
- Use access keys for programmatic access
- Audit permissions of your account using IAM credentials report and IAM access advisor
- Never share IAM users and access keys
Shared Responsibility Model for IAM
AWS:
- Infrastructure (global network security)
- Configuration and vulnerability analysis
- Compliance validation
You:
- Users, groups, roles, policies management and monitoring
- Enabling MFA on all accounts
- Rotation of keys
- Using IAM tools to apply appropriate permissions
- Analyse access patterns and review permissions
Section 5: EC2 - Elastic Compute Cloud
Amazon EC2
- Elastic compute cloud (IaaS)
- Mainly consists of being able to:
- Rent virtual machines (EC2)
- Store data on virtual drives (EBS)
- Distribute load across machines (ELB)
- Scale the services using an auto-scaling group (ASG)
Sizing/Configuration Options
- Operating system: Linux/Windows/MacOS
- Compute power: CPU
- Amount of RAM
- Storage space
- Network-attached (EBS and EFS)
- Hardware (EC2 Instance Store)
- Network card: Speed of the card, Public IP address
- Firewall rules: Security group
- Bootstrap script (configure at first launch): EC2 User Data
EC2 User Data
- Possible to bootstrap instances using an EC2 User data script
- Bootstrapping: Launching commands when a machine starts (runs just once)
- EC2 User data is used to automate boot tasks such as:
- Installing updates
- Installing software
- Downloading common files from the internet
- Runs with the root user
EC2 Instance Types - Overview
You can use different types of EC2 instances that are optimised for different use cases. It follows a naming convention.
m5.2xlarge
- m: instance class
- 5: generation (AWS improves them over time)
- 2xlarge: size within the instance class
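The naming convention can be taken apart with a regular expression. This is a sketch that covers simple names like the one above; the function name is hypothetical, and names that do not follow the class-generation-size pattern (for example `u-12tb1.metal`) are rejected rather than handled.

```python
import re

# class letters, generation digits, optional extra letters, then ".size"
_INSTANCE_TYPE = re.compile(
    r"^(?P<cls>[a-z]+)(?P<gen>\d+)(?P<extra>[a-z-]*)\.(?P<size>[a-z0-9]+)$"
)


def parse_instance_type(name):
    """Split e.g. 'm5.2xlarge' into instance class, generation and size."""
    match = _INSTANCE_TYPE.match(name)
    if match is None:
        raise ValueError(f"unrecognised instance type: {name!r}")
    return {
        "class": match.group("cls"),
        "generation": int(match.group("gen")),
        "extra": match.group("extra"),  # letters after the generation, if any
        "size": match.group("size"),
    }


if __name__ == "__main__":
    print(parse_instance_type("m5.2xlarge"))
    print(parse_instance_type("t2.micro"))
```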
EC2 Instance Types - General Purpose
- Great for a diversity of workloads such as web servers or code repositories
- Balance between:
- Compute
- Memory
- Networking
- As used in the course, t2.micro is a General Purpose EC2 instance
EC2 Instance Type - Compute Optimised
Great for compute-intensive tasks that require high performance processors:
- Batch processing workloads
- Media transcoding
- High performance web servers
- High performance computing (HPC)
- Scientific modelling and machine learning
- Dedicated gaming servers
EC2 Instance Types - Memory Optimised
- Fast performance for workloads that process large data sets in memory
- Use Cases:
- High performance, relational/non-relational databases
- Distributed web scale cache stores
- In-memory databases optimised for business intelligence
- Applications performing real-time processing of big unstructured data
EC2 Instance Types - Storage Optimised
- Great for storage-intensive tasks that require high, sequential read and write access to large data sets on local storage
- Use cases:
- High frequency online transaction processing (OLTP) systems
- Relational and NoSQL databases
- Cache for in-memory databases
- Data warehousing applications
- Distributed file systems
Introduction to Security Groups
- Fundamental of AWS network security
- They control how traffic is allowed into or out of our EC2 instances
- Security groups only contain allow rules
- Rules can reference IP ranges or other security groups
Deeper Dive
- Security groups act as a 'firewall' on EC2 instances
- They regulate:
- Access to ports
- Authorised IP ranges - IPv4 and IPv6
- Control of inbound network (from other to instance)
- Control of outbound network (from the instance to other)
Security Groups - Good to Know
- Can be attached to multiple instances
- Locked down to a region/VPC combination
- Lives 'outside' of the EC2 instance - the instance won't see/know about blocked traffic
- Good to maintain one separate security group for SSH access
- If your application is not accessible (time out), then it's a security group issue
- If you receive connection refused, then it's an application error
- Inbound traffic is blocked by default and outbound is authorised by default
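The "allow rules only, inbound blocked by default" behaviour can be modelled in a few lines. This is a simplified local model for intuition, not the AWS API: `InboundRule` and `is_inbound_allowed` are invented names, and each rule here covers a single port and one CIDR range.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    """One allow rule: a port and the CIDR range permitted to reach it."""
    port: int
    cidr: str


def is_inbound_allowed(rules, source_ip, port):
    """Traffic passes only if some allow rule matches; otherwise it is blocked."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule.port == port and address in ipaddress.ip_network(rule.cidr):
            return True
    return False  # no matching allow rule means blocked by default


if __name__ == "__main__":
    rules = [
        InboundRule(port=22, cidr="203.0.113.0/24"),  # SSH from one range only
        InboundRule(port=443, cidr="0.0.0.0/0"),      # HTTPS from anywhere
    ]
    print(is_inbound_allowed(rules, "203.0.113.7", 22))
    print(is_inbound_allowed(rules, "198.51.100.1", 22))
```

There is no way to express a deny rule in this model, which mirrors the point that security groups only contain allow rules.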
Classic Ports to Know
- 22: SSH (Secure Shell)
- 21: FTP (File transfer protocol)
- 22: SFTP (Secure file transfer protocol)
- 80: HTTP (Unsecured websites)
- 443: HTTPS (Secured websites)
- 3389: RDP (Remote Desktop Protocol)
Connecting Via SSH
- Using our [filename].pem file we generated when spinning up an instance, we can SSH into the machine.
- First change the permissions of the file using chmod 0400 [filename]
- Then use the command: ssh -i [KeyFileName] ec2-user@[IPv4 address]
EC2 Instance Connect
- You can also connect to your EC2 instance within the browser
- It does not require the key file which was downloaded, instead uses a 'temporary key'
- Works only with some instance types?
- Port 22 must still be open
EC2 Instances Purchasing Options
- On-demand Instances: Short workload, predictable pricing, pay by second
- Reserved (1 & 3 years):
- Reserved instances: Long workloads
- Convertible Reserved Instances: Long workloads with flexible instances
- Savings Plans (1 & 3 years): Commitment to an amount of usage, long workload
- Spot Instances: Short workloads, cheap, can lose instances (less reliable)
- Dedicated Hosts: Book an entire physical server, control instance placement
- Capacity Reservations: Reserve capacity in a specific AZ for any duration
On-Demand
- Pay for what you use
- Linux/Windows - Billing per second
- Has the highest cost but no upfront payment
- No long term commitment
Reserved Instances
- Cheaper than on-demand (AWS advertises discounts of up to 72%)
- Reserve specific instance attributes
- 1 year or 3 years (bigger discount)
- More discounts for upfront payments
- Recommended for steady-state usage applications
Savings Plans
- Get a discount based on long-term usage
- Commit to a certain type of usage ($10/hour for 1 or 3 years for example)
- Locked to a specific instance family and AWS region
Spot Instances
- Major discounts
- You can 'lose' them at any point if your max price is less than the current spot price
- The MOST cost-efficient instances in AWS
- Useful for:
- Batch jobs
- Data analysis
- Image processing
- Distributed Workloads
- Workloads with a flexible start/end time
- NOT suitable for critical jobs or databases
Dedicated Hosts
- A physical server with EC2 instance capacity fully dedicated to your use
- Allows you to address compliance requirements and use your existing server-bound software licenses
- Purchasing Options
- On-demand: Pay per second for active Dedicated host
- Reserved: 1 or 3 years (No upfront, partial, all upfront)
- Most expensive
- Useful for software with a complicated licensing model, or companies with strong regulatory/compliance needs
Dedicated Instances
- Instances run on hardware dedicated to you
- May share hardware with other instances in the same account
- No control over instance placement (can move hardware after Stop/Start)
Capacity Reservations
- Reserve on-demand instances capacity in a specific AZ for any duration
- Always have access to EC2 when you need it
- No time commitment (create/cancel at any time)
- Combine with regional reserved instances and savings plans to benefit from billing discounts
- Charged on-demand rate
- Suitable for short-term, uninterrupted workloads which require a specific AZ
Shared Responsibility for EC2
AWS:
- Infrastructure (global network security)
- Isolation on physical hosts
- Replacing faulty hardware
- Compliance validation
User:
- Security group rules
- Operating-system patches and updates
- Software and utilities installed on the EC2 instance
- IAM Roles assigned to EC2 and IAM user access management
- Data security on your instance
Section 6: EC2 Instance Storage
What's an EBS Volume?
- An EBS (elastic block store) volume is a network drive you can attach to your instances while they run
- It allows your instances to persist data (even after termination)
- However, they can only be mounted to one instance at a time (at the CCP level)
- Bound to a specific AZ (availability zone)
- Think of them as a virtual USB
EBS Volume
- Network drive (not physical)
- Uses the network to communicate, some latency
- Detached/re-attached quickly
- Availability zone locked
- To move a volume across, you need to snapshot it
- Have a provisioned capacity (size in GBs, and IOPS)
- Get billed for the capacity
- Can increase capacity over time
EBS - Delete on Termination Attribute
- By default, the root EBS volume is deleted when the EC2 instance is terminated
- By default, other attached volumes are not deleted
- This can be controlled by the AWS console/CLI
EBS Snapshots
- Make a backup (snapshot) of EBS volume at a point in time
- Do not need to detach first (but recommended)
- Can copy snapshots across AZ/Region
EBS Snapshot Features
- EBS Snapshot Archive
- Moves a snapshot to 'archive tier' (75% cheaper)
- Takes 24-72 hours to restore from this tier
- Recycle Bin for EBS Snapshots
- Setup rules to retain deleted snapshots (can recover from accidental deletion)
- Specify retention (1 week, 1 year etc)
AMI Overview
- AMI = Amazon Machine Image
- They are a customisation of an EC2 Instance
- Can add your own software, config, OS, monitoring etc.
- Faster boot/configuration time, software is pre-packaged
- Built for a specific region (can be copied across)
- Can launch EC2 instances from:
- A public AMI: AWS provided
- Your own AMI: Make and maintain them yourself
- AWS Marketplace AMI: An AMI made by someone else (potentially sold)
AMI Process (From an EC2 Instance)
- Start an EC2 instance and customise it
- Stop the instance (for data integrity)
- Build an AMI - This will create EBS snapshots
- Launch instances from other AMIs
EC2 Image Builder
- Used to automate creation of Virtual Machines/Container images
- We can automate the creation, maintenance, validation and testing of EC2 AMIs
- Can be run on a schedule (weekly, when packages update, etc.)
- Free service, pay only for the resources that EC2 Image Builder uses
EC2 Instance Store
- EBS volumes are network drives with good but 'limited' performance
- If we require high-performance hardware, we should use EC2 instance store
- It has better I/O performance
- If the instance is stopped, the instance store loses its data
- Good for buffer / cache / scratch data / temporary content
- Risk of data loss if hardware fails
- Backups/replications are your responsibility
EFS - Elastic File System
- Managed NFS (network file system) which can be mounted on 100's of EC2 instances
- Works with Linux EC2, and multiple availability zones
- Highly available, scalable, but expensive (3 times gp2), pay per use, no capacity planning
EFS - Infrequent Access (EFS-IA)
- Storage class which is cost-optimised for files not commonly accessed
- Up to 92% lower costs
- EFS will automatically move your files to EFS-IA based on the last time they were accessed
- Can enable EFS-IA with a lifecycle policy
- Transparent to the applications accessing EFS
EC2 Storage - Shared Responsibility
AWS:
- Infrastructure
- Replication for data for EBS volumes and EFS drives
- Replacing faulty hardware
- Ensure their employees cannot access your data
User:
- Setting backup/snapshot procedures
- Setting up data encryption
- Responsibility of any data on the drives
- Understand the risk of using EC2 instance store
Amazon FSx - Overview
- Launch 3rd party high-performance file systems on AWS
- Fully managed service
Amazon FSx for Windows File Server
- Fully managed, reliable, scalable windows native shared file system
- Built on Windows File Server
- Supports SMB protocol and Windows NTFS
- Integrated with Microsoft Active Directory
Amazon FSx for Lustre
- Fully managed, high-performance, scalable file storage for High Performance Computing (HPC)
- Lustre is derived from 'Linux' and 'Cluster'
- Machine learning, Analytics, Video Processing, Financial Modelling...
- Scales up to 100's of GB/s, millions of IOPS, sub-ms latencies
Section 7: ELB & ASG - Elastic Load Balancing & Auto Scaling Groups
Scalability & High Availability
- Scalability means that an application/system can handle greater loads by adapting
- There are two kinds of scalability:
- Vertical
- Horizontal (= elasticity)
- Scalability is linked but different to High Availability
Vertical Scalability
- Means increasing the size of the instance
- For example:
- Your application runs on a t2.micro
- Scaling that application vertically would mean running it on a t2.large
- Vertical scalability is common for non-distributed systems, such as a database
- Usually a limit to how much you can scale
Horizontal Scalability
- Increasing the number of instances/systems for your application
- Implies distributed systems
- Common for web applications/modern applications
- Easy to scale in the cloud
High Availability
- Usually goes hand-in-hand with horizontal scalability
- Typically means running your application in at least 2 availability zones
For EC2
Vertical (Instance size)
- From t2.nano - 0.5G RAM, 1 vCPU
- To u-12tb1.metal - 12.3 TB RAM, 448 vCPUs
Horizontal (Number of instances)
- Auto scaling group
- Load balancer
High Availability (Instances for same app across multiple AZ's)
- Auto scaling group multi AZ
- Load balancer multi AZ
Load Balancing
- Servers which forward internet traffic to multiple servers (EC2 instances) downstream
- Acts as a midpoint between an EC2 instance/s and some user/s
- Can spread the load across multiple downstream instances
- Expose just a single point of access (DNS) to your application
- Do regular instance health checks
- Provide SSL termination (HTTPS) for your websites
- High availability across zones
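Two of the ideas above, spreading load and skipping instances that fail health checks, can be sketched with a simple round-robin picker. This is a toy model for intuition only; it is not how ELB chooses targets, and the class and method names are made up.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over targets and skips unhealthy ones."""

    def __init__(self, targets):
        self._targets = list(targets)
        self._healthy = set(self._targets)
        self._next = 0

    def mark_unhealthy(self, target):
        """Record a failed health check for a target."""
        self._healthy.discard(target)

    def mark_healthy(self, target):
        """Record a passed health check for a known target."""
        if target in self._targets:
            self._healthy.add(target)

    def pick(self):
        """Return the next healthy target in rotation."""
        # Try each target at most once per call before giving up.
        for _ in range(len(self._targets)):
            target = self._targets[self._next % len(self._targets)]
            self._next += 1
            if target in self._healthy:
                return target
        raise RuntimeError("no healthy targets")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["instance-a", "instance-b", "instance-c"])
    print([lb.pick() for _ in range(3)])
    lb.mark_unhealthy("instance-b")
    print([lb.pick() for _ in range(4)])
```

Clients only ever talk to the balancer, so an unhealthy instance can be taken out of rotation without them noticing.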
ELB - Elastic Load Balancer
- Managed load balancer
- AWS guarantees it will be working
- AWS takes care of upgrades, maintenance, availability
- AWS provides only a few configuration 'knobs'
- Costs less to set up your own load balancer, but will be more effort
- 4 Kinds offered by AWS
- Application Load Balancer (HTTP/HTTPS only) - Layer 7
- Static DNS
- gRPC
- Network Load Balancer (ultra-high performance, TCP) - Layer 4
- UDP
- High performance
- Static IP
- Gateway Load Balancer - Layer 3
- GENEVE on IP Packets
- Route traffic to Firewalls managed on EC2 instances
- Intrusion detection
- Classic Load Balancer (Retired in 2023) - Layer 4 & 7
Auto Scaling Group
- The load on your website/application can and will change
- With the cloud, you can create/delete servers quickly
- The Auto Scaling Group (ASG) goal is to:
- Scale out (add EC2 instances) to match increased load
- Scale in (remove EC2 instances) to match decreased load
- Ensure we have min/max number of machines running
- Register new instances on load balancer automatically
- Replace unhealthy instances
- Cost savings: Only run at optimal capacity
Auto Scaling Groups - Scaling Strategies
- Manual Scaling: Update the size of an ASG manually
- Dynamic:
- Simple/Step
- When CloudWatch is triggered (example CPU > 70%), then add 2 units
- Same for removing units based on a metric
- Target Tracking Scaling
- Example: Wanting the average ASG CPU usage to stay at ~40%
- Scheduled Scaling
- Anticipate scaling based on known usage pattern
- Example: Betting app would want increased capacity before big matches
- Predictive Scaling:
- Uses Machine Learning to predict future traffic
- Automatically provision instances to serve the period
- Useful for predictable time-based patterns
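The intuition behind target tracking (keep a metric near a target by changing the instance count) can be sketched with a proportional rule. This is a deliberate simplification and an assumption of this example, not the algorithm AWS uses; the function name and parameters are invented.

```python
import math


def desired_capacity(current_instances, current_metric, target_metric,
                     min_size, max_size):
    """Proportional estimate of how many instances keep the metric at target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # If the load per instance is twice the target, roughly twice as many
    # instances are needed; round up so the target is not exceeded.
    raw = math.ceil(current_instances * current_metric / target_metric)
    # The ASG min/max settings always bound the result.
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    # 4 instances at 80% average CPU with a 40% target
    print(desired_capacity(4, 80, 40, min_size=1, max_size=10))
```

The clamp at the end reflects the ASG goal of always staying between the configured minimum and maximum number of machines.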
Section 8: Amazon S3
Use Cases
- Backup and storage
- Disaster Recovery
- Archive
- Hybrid Cloud Storage
- Application Hosting
- Media Hosting
- Data lakes & big data analytics
- Software Delivery
- Static Website
Amazon S3 - Buckets
- Allows people to store objects (files) in 'buckets' (directories)
- Must have a globally unique name (across both regions and accounts)
- Defined at the region level
- Looks like a global service, but is not
- Naming Convention:
- No uppercase or underscore
- 3-63 characters
- Not an IP
- Start with lowercase letter or a number
- Cannot start with xn-- prefix
- Cannot end with -s3alias suffix
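The naming rules listed above are easy to check locally before trying to create a bucket. The sketch below covers only the rules in this list (the full AWS rule set has more), and the function name is hypothetical.

```python
import ipaddress
import re


def bucket_name_problems(name):
    """Return the listed naming rules that the given bucket name breaks."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("length must be 3-63 characters")
    if re.search(r"[A-Z_]", name):
        problems.append("no uppercase letters or underscores")
    if not re.match(r"[a-z0-9]", name):
        problems.append("must start with a lowercase letter or a number")
    if name.startswith("xn--"):
        problems.append("must not start with xn--")
    if name.endswith("-s3alias"):
        problems.append("must not end with -s3alias")
    try:
        ipaddress.IPv4Address(name)
        problems.append("must not be formatted as an IP address")
    except ValueError:
        pass  # not an IP address, which is what we want
    return problems


if __name__ == "__main__":
    for candidate in ["my-bucket-123", "My_Bucket", "192.168.1.1"]:
        print(candidate, "->", bucket_name_problems(candidate))
```

Passing these checks does not mean the name is available, since names must also be globally unique.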
Amazon S3 - Objects
- Objects (files) have a key
- The key is the full path
- s3://my-bucket/my_file.txt for example
- Composed of prefix + object name
- No concept of directories (although the UI will make it seem that way)
- Everything is just a key (containing slashes)
- Object values are the content of the body
- Max object size is 5TB (5000GB)
- When uploading more than 5GB, must use 'multi-part upload'
- Metadata (list of key/value pairs - system or user metadata)
- Tags (Unicode key/value pairs - up to 10) - useful for security and lifecycle
- Version Id (if versioning is enabled)
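Since a key is just a string containing slashes, the "prefix + object name" split is ordinary string handling. A small sketch using only the standard library; the function names and the example URI with a folder are illustrative.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def split_key(key):
    """Split a key into (prefix, object name); the prefix keeps its final slash."""
    head, sep, name = key.rpartition("/")
    return head + sep, name


if __name__ == "__main__":
    bucket, key = split_s3_uri("s3://my-bucket/my_folder/another_folder/my_file.txt")
    print(bucket, key)
    print(split_key(key))
```

A key without any slash simply has an empty prefix, which is consistent with there being no real directories.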
Amazon S3 - Security
- User-based
- IAM Policies - Which API calls should be allowed for a user
- Resource-Based
- Bucket policies - bucket wide rules from the S3 console - allows cross account
- Object Access Control List (ACL) - Finer Grain (can be disabled)
- Bucket Access Control List (ACL) - Less common (can be disabled)
- Note: an IAM principal can access an S3 object if:
- The user IAM permissions ALLOW it OR the resource policy ALLOWS it
- AND there is no explicit DENY
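The access rule in the note above is a small piece of boolean logic. Written out as a function (the name and parameters are invented for this sketch, and real policy evaluation has more inputs than these three flags):

```python
def can_access(iam_allows, resource_policy_allows, explicit_deny):
    """Access needs at least one ALLOW and no explicit DENY anywhere."""
    return (iam_allows or resource_policy_allows) and not explicit_deny


if __name__ == "__main__":
    print(can_access(iam_allows=True, resource_policy_allows=False, explicit_deny=False))
    print(can_access(iam_allows=True, resource_policy_allows=True, explicit_deny=True))
```

An explicit DENY wins over any number of ALLOWs, and with no ALLOW at all the request is refused.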
S3 Bucket Policies
- JSON based policies
- Resources: Buckets and objects
- Effect: Allow/Deny
- Actions: Set of API to Allow or Deny
- Principal: Account/User to apply the policy to
- Use an S3 bucket policy to:
- Grant public access to bucket
- Force object encryption
- Grant another account access
Example Policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicRead",
"Effect": "Allow",
"Principal": "*",
"Action": ["s3:GetObject"],
"Resource": ["arn:aws:s3:::examplebucket/*"]
}
]
}
Static Website Hosting
- S3 can host static websites
- The URL will depend on the region
- You need public read enabled for this to work, else you will get a 403 Forbidden error
Versioning
- Files can be versioned
- It is enabled at bucket level
- Same key overwrite will change the 'version': 1, 2, 3 etc
- Best practice to version buckets
- Protect against unintended deletes
- Easy to roll back to previous version
- Notes:
- A file not versioned prior to enabling versioning, will have a version of null
- Suspending versioning does not delete previous versions (safe operation)
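The overwrite and roll-back behaviour can be illustrated with a tiny in-memory model. It follows the simplified 1, 2, 3 numbering used in these notes; real S3 version IDs are opaque strings, and the class below is purely illustrative.

```python
class VersionedBucket:
    """In-memory model of a versioned bucket: overwrites add versions."""

    def __init__(self):
        self._versions = {}  # key -> list of bodies, oldest first

    def put(self, key, body):
        """Store a new version and return its 1-based version number."""
        history = self._versions.setdefault(key, [])
        history.append(body)
        return len(history)

    def get(self, key, version=None):
        """Return the latest version, or a specific earlier one."""
        history = self._versions[key]
        if version is None:
            return history[-1]
        return history[version - 1]


if __name__ == "__main__":
    bucket = VersionedBucket()
    bucket.put("report.txt", "first draft")
    bucket.put("report.txt", "second draft")
    print(bucket.get("report.txt"))             # latest version
    print(bucket.get("report.txt", version=1))  # roll back to the first
```

Because an overwrite never discards the older body, recovering from an unintended change is just a read of an earlier version.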
Replication
CRR: Cross Region Replication
SRR: Same Region Replication
- Must enable versioning in source and destination buckets
- Buckets can be in different AWS accounts
- Copying is async
- Must give proper IAM permissions to S3
Use Cases:
- CRR - Compliance, lower latency access, replication across accounts
- SRR - Log aggregation, live replication between prod and test accounts
Storage Classes
- Standard - General Purpose
- Standard-Infrequent Access (IA)
- One Zone-Infrequent Access
- Glacier Instant Retrieval
- Glacier Flexible Retrieval
- Glacier Deep Archive
- Intelligent Tiering
Durability and Availability
Durability:
- High durability (99.999999999%, 11 9's) of objects across multiple AZs
- With 10,000,000 objects, on average you can expect to lose a single object every 10,000 years
Availability:
- Measures how readily available a service is
- Varies on storage class
- Example: S3 standard has 99.99% availability (not available for about 53 minutes total a year)
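Both figures above follow from simple arithmetic. The sketch below reproduces them with exact fractions to avoid floating-point noise; it treats the durability percentage as a per-object annual figure, which is an assumption of this example, and the function names are made up.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent):
    """Minutes per year a service may be unavailable at this availability."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return unavailable * MINUTES_PER_YEAR


def years_per_lost_object(object_count, durability_percent):
    """Average years between single-object losses for a given object count."""
    annual_loss_rate = 1 - Fraction(durability_percent) / 100
    return 1 / (object_count * annual_loss_rate)


if __name__ == "__main__":
    # Percentages are passed as strings so Fraction parses them exactly.
    print(float(downtime_minutes_per_year("99.99")))                # 52.56
    print(float(years_per_lost_object(10_000_000, "99.999999999")))  # 10000.0
```

The 52.56 minutes is where the "about 53 minutes" figure comes from, and 11 nines over ten million objects works out to one expected loss per 10,000 years.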
Storage Classes
General Purpose
- 99.99% availability
- Frequently accessed data
- Low latency, high throughput
- Can sustain 2 concurrent facility failures
- Use cases: Big data analytics, mobile/gaming applications, content distribution
Infrequent Access
- Less frequently accessed (but still rapid access)
- Lower cost than S3 standard
- 99.9% availability
- Use cases: Disaster Recovery/Backup
One Zone Infrequent-Access
- 99.5% availability
- Use Cases: Storing secondary backup copies of on-premise data
Glacier Storage Class
- Low-cost object storage
- Pricing: Storage and object retrieval cost
Glacier Instant Retrieval
- Millisecond retrieval, great for data access once a quarter
- Minimum storage duration of 90 days
Glacier Flexible Retrieval
- Expedited (1-5 min), Standard (3-5 hours), Bulk (5-12 hours) - free
- Minimum storage duration of 90 days
Glacier Deep Archive (Long Term)
- Standard (12 hours), Bulk (48 Hours)
- Minimum storage duration of 180 days
Intelligent-Tiering
- Small monthly monitoring and auto-tiering fee
- Moves objects automatically between access tiers based on usage
- No retrieval charges in S3 Intelligent-Tiering
Encryption
Server-Side Encryption (Default)
- When an object is uploaded to a bucket
- The server encrypts then stores the object
Client-Side Encryption
- The user encrypts the file before upload
Access Analyser For S3
- Ensure only intended people have S3 access
- For example: Showing which buckets are publicly accessible, shared with other accounts etc
- Evaluates bucket policies, ACLs, access point policies
AWS Snow Family
- Highly secure, portable device
- Used to collect/process data at the edge and migrate data into/out of AWS
Data Migration:
- Snowcone
- Snowball edge
- Snowmobile
Edge Computing:
- Snowcone
- Snowball edge
Data Migrations: Snow Family
Challenges:
- Limited connectivity
- Limited bandwidth
- High network cost
- Shared bandwidth (can't maximise the line)
Snow Family: Offline device to perform data migrations. AWS sends you a physical device for you to use to upload your data.
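To see why an offline device can beat the network, it helps to estimate how long a transfer would take over a given link. The sketch below is a rough back-of-the-envelope calculation, not an AWS tool: the function name is hypothetical, and it assumes decimal units (1 TB = 10^12 bytes), a fully utilised link, and no protocol overhead.

```python
def network_transfer_days(data_terabytes: float, link_megabits_per_second: float) -> float:
    """Days needed to push the data through a link running flat out."""
    bits_to_send = data_terabytes * 1e12 * 8
    seconds = bits_to_send / (link_megabits_per_second * 1e6)
    return seconds / 86_400  # seconds per day


for mbps in (100, 1_000, 10_000):
    days = network_transfer_days(100, mbps)
    print(f"100 TB over {mbps} Mbps: {days:.1f} days")
```

With shared or limited bandwidth the real figure would be worse, which is the situation the challenges above describe.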
Snowball Edge
- Physical transport solution
- Alternative to moving data over the network
- Pay per data transfer job
- Provides block storage and Amazon S3-compatible object storage
- Storage/Compute optimised versions (varying capacity/transfer speeds)
Snowcone & Snowcone SSD
- Small, portable, rugged and secure
- Light (2.1kg)
- Used for edge computing, storage, and data transfer
- Must provide own battery/cables
- Can be sent back to AWS, or use AWS DataSync
Snowmobile
- An actual truck.
- Transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB)
- Has 100PB of capacity
- High security: temperature controlled, GPS, video surveillance
Edge Computing
- Process data whilst being created on an edge location
- Truck on the road, ship at sea, etc
- These locations may have limited/zero internet access
AWS OpsHub
- Software downloaded to your computer to manage Snow Family Devices
- Unlock and configure device
- Transfer files
- Launch/manage instances
- Monitor Device metrics
SnowBall Edge Pricing
- Pay for device usage and data transfer OUT of AWS
- Data transfer IN to S3 is free
On-Demand:
- One-time service fee per job
- Shipping days are NOT counted towards the days of usage
- Pay per day for additional days
Committed Upfront:
- Pay in advance for monthly, 1-year, 3-year of usage
- Up to 62% discounted pricing
AWS Storage Gateway
- Bridge between on-premise data and cloud data in S3
- Hybrid storage service to allow on-premises to seamlessly use the AWS cloud
- Use cases: Disaster recovery, backup & restore, tiered storage
Section 9: Databases & Analytics
- Structure data
- Relations between datasets
- Indexes to efficiently query/search data
- Optimised for a purpose
Relational Database
- Like Excel spreadsheets, with links between tables
- Can use the SQL language to perform queries/lookups
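As a small illustration of "tables with links between them", the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table and column names are invented for this example; RDS engines such as MySQL or PostgreSQL would accept a very similar query.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolments (
        student_id INTEGER REFERENCES students(id),  -- the link between tables
        course TEXT
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO enrolments VALUES (1, 'AWS'), (1, 'SQL'), (2, 'AWS');
""")

# Look up everyone enrolled on the AWS course by joining the two tables.
rows = conn.execute("""
    SELECT s.name, e.course
    FROM students AS s
    JOIN enrolments AS e ON e.student_id = s.id
    WHERE e.course = 'AWS'
    ORDER BY s.name
""").fetchall()
print(rows)
```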
NoSQL Databases
- Non-relational databases
- Specific data models and have flexible schemas
- Benefits
- Flexibility: Easy to evolve data model
- Scalability: Designed to scale-out by using distributed clusters
- High-Performance
- Highly Functional
- Examples: Key-value, document, graph, in-memory, and search databases
- Often modelled in JSON format
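A flexible schema simply means that two records in the same collection do not need the same fields. The sketch below models this with plain Python dictionaries and the standard `json` module; the records and the `find_by_tag` helper are hypothetical and do not represent any particular NoSQL product.

```python
import json

# Two documents in the same collection with different shapes.
documents = [
    {"id": "u1", "name": "Ana", "email": "ana@example.com"},
    {"id": "u2", "name": "Ben", "tags": ["admin"], "address": {"city": "Leeds"}},
]

# Key-value style access: look a document up directly by its key.
by_id = {doc["id"]: doc for doc in documents}


def find_by_tag(tag):
    """Return the ids of documents carrying the tag; documents without tags are skipped."""
    return [doc["id"] for doc in documents if tag in doc.get("tags", [])]


print(json.dumps(by_id["u2"], indent=2))
print(find_by_tag("admin"))
```

Adding a new field to one document needs no migration of the others, which is the "easy to evolve data model" benefit listed above.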
Databases & Shared Responsibility on AWS
- AWS offers managed services for different databases
- Benefits include:
- Quick provisioning, high availability, vertical/horizontal scaling
- Automated backup & restore, operations, upgrades
- OS Patching handled by AWS
- Monitoring, alerting
These database technologies could be run on EC2, but then the user would be responsible for all this
RDS Overview
- Relational Database Service
- Uses SQL
- Create databases in the cloud, managed by AWS
- Postgres
- MySQL
- MariaDB
- Oracle
- Microsoft SQL Server
- IBM DB2
- Aurora (AWS Proprietary database)
Advantages of Using RDS Versus DB on EC2
- Automated provisioning, OS patching
- Continuous backups and restore to specific timestamp
- Monitoring dashboards
- Read replicas for improved read performance
- Multi AZ setup for disaster recovery
- Maintenance windows for upgrades
- Scaling capability (both vertical and horizontal)
- Storage backed by EBS
The one thing we cannot do is SSH into the underlying instances.
Aurora
- AWS proprietary technology
- PostgreSQL and MySQL
- Cloud optimised (claims 5x performance improvement over MySQL RDS, and 3x over PostgreSQL on RDS)
- Storage automatically grows in 10GB increments
- Costs 20% more than RDS
- Not in the free tier
Aurora Serverless
- Automated DB instantiation and auto-scaling
- No capacity planning needed
- Least management overhead
- Pay per second
- Good for infrequent, intermittent or unpredictable workloads
RDS Deployments
Read Replicas:
- Scale the read workload of your DB
- A 'copy' of your database is made, which your application can read from
- Can create up to 15 Read Replicas
- Writing data only happens to the main database
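The read/write split can be pictured with a toy model: writes go to the main database only, while reads are spread across the replicas. Everything in the sketch below is hypothetical (class name, methods, round-robin choice of replica), and the explicit `replicate()` step stands in for replication that a real service performs in the background.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: one writable primary, several read-only replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Writes only ever touch the primary.
        self.primary[key] = value

    def replicate(self):
        # Copy the primary's data to every replica.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key):
        # Reads rotate across replicas, taking load off the primary.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


db = ReplicatedDatabase(replica_count=2)
db.write("order:1", "shipped")
print(db.read("order:1"))  # not replicated yet
db.replicate()
print(db.read("order:1"))
```

The first read returning nothing shows one consequence of this design: a replica can briefly lag behind the main database.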
Multi-AZ:
- Failover in case of AZ outage (high availability)
- The failover DB is passive (cannot be read from or written to) unless there is an AZ issue with the main DB
- Can only have 1 other AZ as a failover
Multi-Region (Read Replicas)
- Allows applications to read from a DB that is within their region
- Data still only written to main DB
- Disaster recovery in case of a region issue is a benefit
- Local performance for global read
- Replication cost
ElastiCache Overview
- ElastiCache used to get managed Redis or Memcached DB
- In-memory database, with high performance and low latency
- Helps reduce load on DB's with read intensive workloads
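One common way an in-memory cache reduces database load is the cache-aside pattern: check the cache first, and only query the database on a miss. The sketch below uses a plain dictionary in place of Redis or Memcached, and the `CachedReader` class is a hypothetical name for this example, not an ElastiCache API.

```python
class CachedReader:
    """Cache-aside reads: try the cache, fall back to the database on a miss."""

    def __init__(self, database):
        self.database = database   # stand-in for the slow, durable store
        self.cache = {}            # stand-in for the in-memory cache
        self.database_reads = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]            # cache hit: no database work
        self.database_reads += 1              # cache miss: query the database
        value = self.database[key]
        self.cache[key] = value               # populate the cache for next time
        return value


reader = CachedReader({"user:1": "Ana"})
for _ in range(1000):
    reader.get("user:1")
print(reader.database_reads)  # 1
```

A real deployment would also need an expiry or invalidation strategy so that cached values do not go stale; that is left out here.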
DynamoDB
- Fully managed, highly available, replication across 3 AZ
- NoSQL DB
- Scales to massive workloads, distributed 'serverless' database
- Millions of requests per second, trillions of rows, 100s of TBs of storage
- Fast and consistent in performance
- Single-digit millisecond latency - low latency retrieval
- Integrated with IAM for security, authorisation and administration
- Low cost, and auto scaling capabilities
- Standard & Infrequent Access (IA) table classes
- Key/value database
DynamoDB Accelerator (DAX)
- Fully managed in-memory cache for DynamoDB
- Only for DynamoDB
- 10x performance improvement (microsecond latency)
- Highly scalable/available
DynamoDB Global Tables
- Make a Table accessible with low latency in multiple-regions
- Can create another table in a different region, these two tables will have 2-way replication
- Active-Active replication (read/write to any AWS region)
Redshift Overview
- Based on PostgreSQL, but it's not used for OLTP
- It's OLAP - Online Analytical Processing (analytics and data warehousing)
- Load data once every hour, rather than every second
- 10x better performance than other data warehouses, scale to Petabytes of data
- Columnar storage (instead of row based)
- Massively parallel query execution (MPP), highly available
- Pay as you go, based on provisioned instances
- Has a SQL interface for queries
Redshift Serverless
- Automatically provisions and scales data warehouse underlying capacity
- Run analytics workloads without managing data warehouse infrastructure
- Pay for what you use
- Use Cases:
- Reporting
- Dashboarding applications
- Real-time analytics
- Pay only for compute and storage used during analysis
Amazon EMR
- Elastic MapReduce
- Helps to create 'Hadoop Clusters' (Big Data), to analyse and process vast amounts of data
- Clusters can be made of hundreds of EC2 instances
- EMR takes care of all the provisioning and configuration
- Auto-scaling and integrated with Spot instances
- Use Cases: Data processing, machine learning, web indexing, and big data
Athena
- Serverless query service to perform analytics against S3 objects
- SQL to query the files
- Supports CSV, JSON, ORC, Avro, and Parquet
- Pricing: $5 per TB of scanned data
- Use compressed or columnar data for cost-savings
- Use Cases: Business intelligence, analytics, reporting, analysing & querying VPC flow logs, ELB logs, CloudTrail trails
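Because pricing is per TB scanned, anything that shrinks the scanned bytes shrinks the bill. The sketch below works through one made-up scenario using the $5 per TB figure from these notes; the 4x compression ratio, the 2-of-10 column selection, and the use of decimal terabytes are all assumptions for illustration.

```python
PRICE_PER_TB_SCANNED = 5.00  # USD, figure quoted in these notes


def athena_query_cost(bytes_scanned: int) -> float:
    """Cost of one query, assuming 1 TB = 10**12 bytes."""
    return bytes_scanned / 1e12 * PRICE_PER_TB_SCANNED


raw_csv_bytes = 10 ** 12                 # 1 TB of uncompressed CSV, fully scanned
columnar_bytes = raw_csv_bytes // 4      # assumed 4x smaller as compressed columnar data
needed_bytes = columnar_bytes * 2 // 10  # query reads 2 of 10 equally sized columns

print(athena_query_cost(raw_csv_bytes))  # full CSV scan
print(athena_query_cost(needed_bytes))   # columnar scan of only the needed columns
```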
QuickSight
- Serverless machine learning-powered business intelligence service to create interactive dashboards
- Scalable, fast, per-session pricing
- Use Cases: Business analytics, visualisations, ad-hoc analysis
DocumentDB
- A way to implement MongoDB (NoSQL database)
- Store/query/index JSON data
- Similar deployment concepts as Aurora
- Fully managed, highly available, replication across 3 AZ's
- Scales automatically
Neptune
- Managed graph database
- Example of a graph dataset is a social network
- Users have friends
- Posts have comments
- Comments have likes, etc
- Across 3 AZ, up to 15 read replicas
- Build and run applications working with highly connected datasets
- Can store billions of relations and keep the millisecond latency
- Great for knowledge graphs, fraud detection, recommendation engines, social networking
Timestream
- Fully managed, fast, scalable, serverless time series database
- Automatically scales both up and down based on capacity/compute
- 1000 times faster than relational DB, and 1/10th of the cost
- Built-in time series analytics functions
QLDB
- Quantum Ledger Database
- A ledger is a book recording financial transactions
- Fully managed, serverless, high availability, replication across 3 AZ
- Used to review history of all the changes made to your application data over time
- Immutable system: No entry can be removed/modified. It is cryptographically verifiable
- Difference with Amazon Managed Blockchain: No decentralisation component, in accordance with financial regulation rules
Managed Blockchain
- Makes it possible to have multiple parties execute transactions without the need for a trusted, central authority
- A managed service to:
- Join public blockchain networks
- Or create your own scalable private network
- Compatible with the frameworks Hyperledger Fabric & Ethereum
Glue
- Managed extract, transform, and load (ETL) service
- Useful to prepare/transform data for analytics
- Fully serverless service
- Glue Data Catalog
DMS
- Database Migration Service
- Quickly and securely migrate database to AWS
- Source database remains available during migration
- Supports
- Homogeneous Migrations (Same to same)
- Heterogeneous Migrations (Different DB)
Section 10: Other Compute Service: ECS, Lambda, Batch, Lightsail
What is Docker?
- Software development platform to deploy apps
- Apps get packaged into containers, these can be run on an OS
- Apps will run the same regardless of machine, OS, location etc
- Very predictable, easier to maintain and deploy
- Scale containers up/down very quickly
- Stored in Docker Repositories
- Public (With base docker images)
- Private (Amazon ECR - Elastic Container Registry)
Docker versus Virtual Machine
- Docker is 'sort of' virtualisation software
- Resources are shared with the host -> many containers on one server
ECS
- Elastic Container Service
- Launch docker containers on AWS
- You as the user must provision/maintain the infrastructure (EC2 instances)
- AWS will look after starting/stopping containers
- Has integrations with the Application Load Balancer
Fargate
- Launch Docker containers on AWS
- No need to provision EC2 instances
- Serverless offering
- AWS will run the containers based on your CPU/RAM requirements
ECR
- Elastic Container Registry
- Private docker registry
- Where you store your Docker images (so ECS or Fargate can use them)
Serverless
- Where developers no longer have to manage servers
- They simply just deploy code (functions)
- Initially, serverless was FaaS (function as a service)
- It doesn't actually mean there are no servers, just that the user doesn't have to manage, use, or see them
AWS Lambda
- Virtual Functions - no servers
- Limited by time - short executions
- Run on-demand
- Scaling is automated
Lambda Benefits
- Easy Pricing
- Pay per request and compute time
- 1,000,000 requests and 400,000 GB-seconds of compute time each month in the free tier
- After free, $0.20 per 1 million requests and $1.00 per 600,000 GB-seconds
- Integrated with the whole AWS suite
- Event driven: Only invoked when needed (Reactive)
- Easy monitoring through CloudWatch
- Wide programming language support
- Lambda container image
- The container image must implement the Lambda Runtime API
- ECS/Fargate is preferred for running arbitrary Docker images
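The pricing figures above can be turned into a rough monthly estimate. The sketch below is a simplified calculator, not an official one: the function name is hypothetical, the prices are the ones quoted in these notes, and it assumes every invocation has the same duration and memory size.

```python
FREE_REQUESTS = 1_000_000            # free tier, per month
FREE_GB_SECONDS = 400_000            # free tier, per month
PRICE_PER_MILLION_REQUESTS = 0.20    # USD
PRICE_PER_600K_GB_SECONDS = 1.00     # USD


def lambda_monthly_cost(requests, avg_duration_seconds, memory_gb):
    """Estimated monthly bill in USD after the free tier is applied."""
    gb_seconds = requests * avg_duration_seconds * memory_gb
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = billable_gb_seconds / 600_000 * PRICE_PER_600K_GB_SECONDS
    return round(request_cost + compute_cost, 2)


# 3 million requests of 200 ms at 128 MB: compute stays inside the free tier.
print(lambda_monthly_cost(3_000_000, 0.2, 0.125))   # 0.4
# 10 million requests of 1 s at 0.1 GB: both requests and compute are billed.
print(lambda_monthly_cost(10_000_000, 1.0, 0.1))    # 2.8
```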
API Gateway
- Example: Building a serverless API
- Expose API gateway to client, so they can communicate
- API Gateway will proxy requests to Lambda, which can then execute and perform CRUD operations on a database
- Serverless and scalable
- Supports RESTful APIs and WebSocket APIs
- Security, user authentication, API throttling, API keys, and monitoring are all supported
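A minimal sketch of the Lambda side of such a serverless API is shown below. It assumes the REST proxy integration event shape (`httpMethod`, `pathParameters`, `body`) and a response made of `statusCode` and a string `body`; the in-memory `ITEMS` dictionary is a hypothetical stand-in for a real database table, and the route with an `id` path parameter is invented for this example.

```python
import json

ITEMS = {}  # stand-in for a database table; contents are lost between cold starts


def handler(event, context):
    """Handle create/update, read and delete for a single item."""
    method = event["httpMethod"]
    item_id = event["pathParameters"]["id"]

    if method == "PUT":
        ITEMS[item_id] = json.loads(event["body"])
        return {"statusCode": 200, "body": json.dumps(ITEMS[item_id])}
    if method == "GET":
        if item_id not in ITEMS:
            return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
        return {"statusCode": 200, "body": json.dumps(ITEMS[item_id])}
    if method == "DELETE":
        ITEMS.pop(item_id, None)
        return {"statusCode": 204, "body": ""}
    return {"statusCode": 405, "body": json.dumps({"error": "method not allowed"})}


# Local usage example with a hand-built event; no AWS account is involved.
put_event = {"httpMethod": "PUT", "pathParameters": {"id": "42"}, "body": '{"name": "notes"}'}
print(handler(put_event, None))
```

The function can be exercised locally like this because it is just a Python function; API Gateway's role is to build the event from the incoming HTTP request.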
AWS Batch
- Fully managed batch processing at any scale
- Can run 100,000's of computing batch jobs
- A 'batch' job is a job with a start and end (as opposed to continuous)
- Batch will dynamically launch EC2 instances or Spot instances
- You submit/schedule a batch job, and AWS does the rest
- Batch jobs are defined as Docker Images and run on ECS
- Helpful for cost optimisations
Batch Versus Lambda
Lambda
- Time limit
- Limited runtime
- Limited temporary disk space
- Serverless
Batch
- No time limit
- Any runtime if it's packaged as a Docker image
- Rely on EBS/instance store for disk space
- Relies on EC2
Lightsail
- Virtual servers, storage, databases and networking
- Low and predictable pricing
- Simpler alternative to all the previously covered services
- Great for people with little cloud experience
- Monitoring/Notifications of Lightsail resources
Section 11: Deployments & Managing Infrastructure at Scale
CloudFormation
- A declarative way of outlining AWS infrastructure
- CloudFormation creates the infrastructure in the way you specified, with the exact configuration
Benefits:
- Infrastructure as code
- No manually created resources
- Changes to infrastructure are reviewed through code
- Cost
- Resources are tagged with an identifier
- Can easily estimate costs of resources
- Productivity
- Ability to destroy/create on the fly
- Automated generation of diagrams for your templates
- Declarative programming
- Infrastructure as code
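To make "declarative" concrete, the sketch below builds a very small template as a Python dictionary and prints it as JSON. The logical name `NotesBucket` and the description are hypothetical; the template only declares what should exist (one versioned S3 bucket) and says nothing about the steps needed to create it.

```python
import json

# A minimal template: it declares the desired resources, not how to build them.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Hypothetical example: a single versioned S3 bucket",
    "Resources": {
        "NotesBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"},
            },
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Because the template is just text, it can be stored in version control and reviewed like any other code change.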
CDK (Cloud Development Kit) Overview
- Define your cloud infrastructure using a familiar language
- The code is 'compiled' into a CloudFormation template
- Therefore, you can deploy infrastructure and application runtime code together
- Great for Docker containers in ECS/EKS and Lambda functions
Beanstalk
- Developer centric view of deploying applications on AWS
- Uses all the same components we have seen before
- Still maintain control over the config
- Beanstalk is a PaaS (Platform as a Service)
- Managed Service
- Instance config/OS
- Deployment strategy
- Capacity provisioning
- Load balancing/auto scaling
- Application health and monitoring
- Only the application is the responsibility of the developer
- Support for many platforms
- Health agents push metrics to CloudWatch
CodeDeploy
- When we want to deploy our application automatically
- Works with EC2 instances
- Works with On-Premises servers
- Hybrid service
- Servers/instances must be provisioned/configured ahead of time
CodeCommit
- Code needs to be stored somewhere (repository)
- Source-control service to host git-based repositories
CodeBuild
- Code building service in the cloud
- Compiles source code, runs tests, and produces packages (ready to be used by CodeDeploy)
- Fully managed, serverless, secure, scalable, pay as you go
CodePipeline
- Orchestrate different steps to have code automatically pushed to production
- code -> build -> test -> provision -> deploy
- Basis for CICD (Continuous integration & continuous delivery)
- Commit -> build -> deploy -> Beanstalk
- Fully managed, and widely compatible
CodeArtifact
- Software packages depend on others to be built
- Storing and retrieving these dependencies is called artifact management
- Traditionally you need to setup your own management system
- Secure, scalable, cost-effective artifact management for development
- Works with common dependency management tools
- Developers and CodeBuild can retrieve dependencies straight from CodeArtifact
CodeStar (Replaced by CodeCatalyst)
- Unified UI to easily manage software deployment activities in one place
- Quick way to get started with CodeCommit, Pipeline, Build, deploy etc
- Can edit the code 'in-the-cloud' using AWS Cloud9
Cloud9
- Cloud IDE (integrated development environment)
- Cloud IDE can be used in your web browser, can work from anywhere with internet access, without setup needed
Systems Manager (SSM)
- Helps manage EC2 and On-Premises systems at scale
- Hybrid AWS service
- Operational insights about the state of your infrastructure
- Suite of 10+ products
- Most important:
- Patching automation for enhanced compliance
- Run commands across an entire fleet of servers
- Store parameter configuration with the SSM parameter store
- Works for Linux, Windows, MacOS, and Raspberry Pi OS
How SSM Works
- Install SSM agent onto the system we control
- Installed by default on Amazon Linux AMIs and some Ubuntu AMIs
- If an instance can't be controlled by SSM, likely an issue with the SSM agent
Systems Manager - SSM Session Manager
- Start a secure shell on EC2 and on-premises servers
- No SSH access, bastion hosts, or SSH keys needed
- No port 22 needed (better security)
- Support for Linux, MacOS, and Windows
- Send session log data to S3 or CloudWatch Logs
Systems Manager Parameter Store
- Secure storage for configuration and secrets
- API keys, passwords, configs etc
- Serverless, scalable, durable
- Control permissions via IAM
- Version tracking/encryption (optional)
Section 12: Leveraging the AWS Global Infrastructure
Why Make a Global Application
- Application deployed in multiple geographies
- This could be different regions and/or edge locations
- Decreased Latency:
- The time it takes for a network packet to reach a server
- Deploying application closer to users, will provide decreased latency
- Disaster Recovery
- If an AWS region goes down, you have backups
- Can failover to another region, and your application will continue working
- Attack Protection: Global infrastructure is harder to attack
Route 53
- Managed Domain Name System (DNS)
- DNS is a collection of rules/records which help clients understand how to reach a server through URLs
Routing Policy
- Simple routing policy with no health checks
- Weighted routing policy with health checks, which can distribute the load across instances according to assigned weights
- Latency routing will determine location of connecting client, and redirect to server in the closest region where the application is deployed
- Failover routing, for disaster recovery. If the primary instance fails, traffic is routed to the failover instance
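Two of these policies are easy to model in a few lines. The sketch below is a toy simulation, not how Route 53 is implemented: the function names, endpoint names, and weights are hypothetical, and health is passed in as a simple boolean rather than being measured.

```python
import random


def weighted_route(endpoints, rng=random):
    """Pick one healthy endpoint at random, in proportion to its weight.

    endpoints is a list of (name, weight, is_healthy) tuples.
    """
    healthy = [(name, weight) for name, weight, is_healthy in endpoints if is_healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoints")
    names, weights = zip(*healthy)
    return rng.choices(names, weights=weights, k=1)[0]


def failover_route(primary, secondary, primary_healthy):
    """Send traffic to the primary unless it is unhealthy."""
    return primary if primary_healthy else secondary


rng = random.Random(42)  # seeded so the run is repeatable
picks = [weighted_route([("eu", 70, True), ("us", 30, True)], rng) for _ in range(1000)]
print(picks.count("eu"), picks.count("us"))
print(failover_route("primary", "standby", primary_healthy=False))
```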
CloudFront
- Content Delivery Network (CDN)
- Improves read performance, content is cached at edge locations
- Improves user experience
- 216 Points of Presence globally (edge locations)
- DDoS protection, integration with Shield, AWS Web Application Firewall
CloudFront - Origins
S3 Bucket
- Distributing files and caching them at the edge
- Enhanced security with CloudFront Origin Access Control (OAC)
- OAC is replacing Origin Access Identity (OAI)
- Can be used as an ingress (to upload files to S3)
Custom Origin (HTTP)
- Application load balancer
- EC2 instance
- S3 website (must first enable the bucket as a static S3 website)
- Any http backend you want
CloudFront vs S3 Cross Region Replication
CloudFront
- Global edge network
- Files are cached for a TTL (time to live)
- Great for static content which must be available everywhere
S3 Cross Region Replication
- Must be setup specifically for each region
- Updated in near real-time
- Read only
- Great for dynamic content which needs to be available at low-latency, in only a select few regions
S3 Transfer Acceleration
- Increase transfer speed by transferring file to a closer edge location, which can forward the data to the S3 bucket in the target region
- Between client and bucket
AWS Global Accelerator
- Improve global application availability and performance using the AWS global network
- Leverage the network to optimise the route to your application (up to 60% improvement)
- 2 Anycast IPs are created for your application, and traffic is sent through edge locations
- Edge location sends to your application
Global Accelerator vs CloudFront
Both
- AWS global network
- Edge locations
CloudFront
- Improves performance for cacheable content
- Content is served at the edge
Global Accelerator
- No caching
- Improves performance for a wide range of applications over TCP or UDP
- Good for HTTP use cases which require static IP addresses
- Good for HTTP use cases that require deterministic, fast regional failover
AWS Outposts
- Hybrid Cloud: On-premise infrastructure alongside a cloud infrastructure
- Therefore, two ways of dealing with IT systems
- One for the AWS cloud
- One for on-premises infrastructure
- Outposts are server racks, which offer the same infrastructure, services, APIs etc
- AWS will setup and manage these outposts
- You become responsible for the physical security
Benefits:
- Low latency
- Local data processing
- Data residency
- Easier data migration from on-premise to the cloud
- Fully managed service
- Some services from the ecosystem work on the outpost
Wavelength
- Wavelength Zones are infrastructure deployments embedded within the telecommunications providers' data centers at the edge of 5G networks
- Brings AWS services to the edge of 5G networks
- Ultra-low latency through 5G
- Traffic does not leave the Communication Service Provider's network
- No additional charges/service agreements
Local Zones
- Places AWS compute, storage, database, and other selected AWS services closer to end users to run latency-sensitive applications
- Extend your VPC to more locations (extension of a region)
- Example:
- Region: N.Virginia
- Local Zone: Boston, Chicago, Dallas, Houston, etc.
Global Applications Architecture
Multi-Region, Active-Passive
- One active EC2 instance, which users can read from and write to
- Other passive EC2 instances (in a different region), which users can only read from
Multi-Region, Active-Active
- Multiple 'active' EC2 instances in different regions
- Users can read from and write to all of them
Section 13: Cloud Integrations
SQS - Simple Queue Service
- Many to many: multiple producers send messages to the queue, and multiple consumers poll the queue for messages, each one receiving a different message
Standard Queue:
- Oldest AWS offering
- Fully managed service, used to decouple applications
- Scales from 1-10,000's of messages per second
- Default retention of messages: 4 days, maximum 14 days
- No limit to number of messages in queue
- Messages are deleted after being read by consumers
- Low latency
- Consumers share the work to read messages and scale horizontally
Amazon SQS - FIFO Queue
- FIFO: First in first out (message ordering in queue)
- Messages are processed in order by the consumer
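The queue model can be sketched with an in-memory structure: producers append messages, and each consumer that polls takes a different message off the front. The `SimpleQueue` class below is a hypothetical toy, not the SQS API, and it deliberately models strict first-in-first-out delivery only; retention periods and other service behaviour are left out.

```python
from collections import deque


class SimpleQueue:
    """Toy in-memory queue: producers send, consumers poll."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # New messages join the back of the queue.
        self._messages.append(message)

    def receive(self):
        # Return the oldest message and delete it, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


queue = SimpleQueue()
for body in ["order-1", "order-2", "order-3"]:  # could come from several producers
    queue.send(body)

consumer_a = queue.receive()  # each consumer gets a different message
consumer_b = queue.receive()
print(consumer_a, consumer_b)  # order-1 order-2
```

Adding more consumers simply means more callers of `receive()`, which is how the work is shared and scaled horizontally.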
Kinesis
- Real-time big data streaming
- Managed service to collect, process, and analyze real-time streaming data at any scale
SNS
- Sending one message to many receivers
- Pub/sub
- Simple notification service
- Event publishers only send messages to one SNS topic
- Each subscriber gets all the messages
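In contrast to a queue, a pub/sub topic delivers every message to every subscriber. The sketch below models that with a hypothetical `Topic` class and plain Python lists acting as the subscribers' inboxes; it is an illustration of the pattern rather than the SNS API.

```python
class Topic:
    """Toy pub/sub topic: every subscriber receives every published message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        # The publisher talks to the topic only; the topic fans the message out.
        for callback in self._subscribers:
            callback(message)


email_inbox, audit_log = [], []
topic = Topic()
topic.subscribe(email_inbox.append)
topic.subscribe(audit_log.append)

topic.publish("order-1 shipped")
print(email_inbox, audit_log)
```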
MQ
- SQS/SNS are cloud native services. Proprietary AWS protocols
- When migrating to the cloud, instead of re-engineering application to use SQS/SNS, we can use Amazon MQ
- Managed message broker service for RabbitMQ and ActiveMQ
- Doesn't scale as much as SQS/SNS
Section 14: Cloud Monitoring
CloudWatch Metrics
- Provides metrics for every service in AWS
- Metric is a variable to monitor (like CPU-Utilisation)
- Metrics have timestamps
Important Metrics
- EC2 Instances: CPU utilisation, status checks, network (not RAM)
- Every 5 minutes by default
- Pay for detailed monitoring (every 1 min)
- EBS Volumes: Disk read/writes
- S3 Buckets: BucketSizeBytes, NumberOfObjects, AllRequests
- Billing: Total Estimated Charge
- Service Limits: How much you've been using a service API
- Custom Metrics: Push your own metrics
CloudWatch Alarms
- Used to trigger notifications for any metric
- Alarms actions:
- Auto Scaling: Increase or decrease EC2 instances desired count
- EC2 Actions: Stop, terminate, reboot or recover an instance
- SNS Notifications: Send a notification to an SNS topic
- Various options
- Can choose the period on which to evaluate an alarm
- Example: Create billing alarm on the CloudWatch Billing Metric
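The idea of evaluating a metric over a chosen period can be sketched as a small function. This is a simplified model with hypothetical names: it raises the alarm only when every one of the most recent datapoints is above the threshold, which is one possible evaluation rule and not a full description of how CloudWatch evaluates alarms.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return the alarm state for the most recent datapoints of a metric."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Alarm only if the metric breached the threshold in every recent period.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


cpu_utilisation = [35.0, 62.0, 91.0, 94.0, 97.0]  # one datapoint per period
print(alarm_state(cpu_utilisation, threshold=90.0, evaluation_periods=3))  # ALARM
```

In a real setup the "ALARM" result would be wired to one of the actions listed above, such as an SNS notification.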
CloudWatch Logs
- Collects logs from:
- Elastic beanstalk
- ECS
- Lambda
- CloudTrail
- CloudWatch Log Agents: On EC2 machines or on-premises servers
- Route53
- Enables real-time monitoring of logs
- Adjustable CloudWatch Logs retention
- By default, no logs from EC2 Instance will go to CloudWatch
- Need to run a CloudWatch agent on an EC2 instance to push the relevant log file
- Make sure IAM permissions are correct
- The agent can be setup on-premises as well
Event Bridge
- Used to be CloudWatch Events
- Schedule: Cron jobs (scheduled scripts)
- Example: Every hour, trigger a script on a Lambda function
- Event Pattern: Event rules to react to a service doing something
- IAM Root user sign in -> SNS topic with email notification
- Schema Registry: Model event schema
- You can archive events (all/filter) sent to an event bus
- Ability to replay archived events
CloudTrail
- Governance, compliance and audit for your AWS account
- Enabled by default
- Get a history of events/API calls made within your AWS account
- Console
- SDK
- CLI
- AWS Services
- Can put logs from CloudTrail into CloudWatch Logs or S3
- Trail can be applied to All Regions (default) or a single region
- If a resource is deleted in AWS, investigate in CloudTrail first!
X-Ray
- Debugging in prod (lol!)
- Test locally
- Add log statements everywhere
- Re-deploy in production
- Log formats differ across applications and log analysis is hard
- Debugging: One big monolith 'easy', distributed services 'hard'
- No common views of entire architecture
Advantages:
- Troubleshooting performance
- Understand dependencies in a microservice architecture
- Pinpoint service issues
- Review request behaviour
- Find errors and exceptions
- Identify impacted users
CodeGuru
- Machine-learning service for automated code reviews, and application performance recommendations
- Provides two functionalities:
- CodeGuru Reviewer: Automated reviews for static code analysis (when you push to code commit for example), in dev
- CodeGuru Profiler: Visibility/recommendations about application performance during runtime (production)
Health Dashboard
Service History
- Shows the health of all services in all regions
- Shows historical information for each day
- Has an RSS feed you can subscribe to
- Previously called AWS Service Health Dashboard
Your Account
- Previously called Personal Health Dashboard
- Provides alerts/remediation guidance when AWS is experiencing events which may impact you
- Dashboard displays relevant and timely information to help manage said events
- Can aggregate data from an entire AWS organisation
- Global Service
Section 15: VPC & Networking
Virtual Private Cloud
IPv4 Addresses
- Charged at $0.005 per hour
- Free tier of 750 hours per month
- Public IPv4 - can be used on the Internet
- EC2 instance gets a new public IP address every time you stop then start it
- Private IPv4 - Can be used on private networks (LAN) such as internal AWS networking
- Private will be same for entire EC2 lifetime
- Elastic IP - Allows you to attach a fixed public IPv4 address to EC2 instance
IPv6 in AWS
- Internet protocol version 6
- Every IP address is public in AWS
- Example: 2001:db8:333:4444:cccc:dddd:eeee:ffff
VPC & Subnets Primer
- Virtual Private Cloud: Private network in which you can deploy your resources
- Subnets: Allow you to partition your network inside your VPC (AZ resource)
- Public: Subnet accessible from the internet
- Private: Not accessible from the internet
- To define access to the internet and between subnets, we use Route Tables
Internet Gateway & NAT Gateways
- Internet gateways helps our VPC instances connect with the internet
- Public subnets have a route to the internet gateway
- NAT Gateways (AWS-managed) & NAT Instances (self-managed), allow your instances in your Private Subnets to access the internet whilst remaining private
Network ACL & Security Groups
NACL (Network ACL)
- A firewall which controls traffic from and to subnet
- Can have ALLOW and DENY rules
- Are attached at the Subnet level
- Rules only include IP addresses
- Operates at the subnet level
- Is stateless
Security Groups
- A firewall that controls traffic to and from an EC2 instance
- Can have only ALLOW rules
- Rules include IP addresses and other security groups
- Operates at the instance level
- Is stateful
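The difference in rule types can be modelled with Python's standard `ipaddress` module. The sketch below is a toy with hypothetical function names and example addresses. It assumes that NACL rules are checked in ascending rule-number order with the first match deciding, and that unmatched traffic is denied; that ordering is an assumption of this sketch, not something stated in these notes, and statefulness is not modelled at all.

```python
import ipaddress


def security_group_allows(allowed_cidrs, source_ip):
    """Security group: ALLOW rules only, so any matching rule lets traffic in."""
    ip = ipaddress.ip_address(source_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


def nacl_allows(rules, source_ip):
    """NACL: ALLOW and DENY rules, given as (rule_number, cidr, action) tuples."""
    ip = ipaddress.ip_address(source_ip)
    for _number, cidr, action in sorted(rules):  # lowest rule number first (assumed)
        if ip in ipaddress.ip_network(cidr):
            return action == "ALLOW"
    return False  # nothing matched: treated as denied in this sketch


nacl_rules = [
    (100, "203.0.113.7/32", "DENY"),   # block one specific address
    (200, "0.0.0.0/0", "ALLOW"),       # allow everyone else
]
print(nacl_allows(nacl_rules, "203.0.113.7"))            # blocked by rule 100
print(nacl_allows(nacl_rules, "198.51.100.1"))           # allowed by rule 200
print(security_group_allows(["10.0.0.0/16"], "10.0.3.4"))
```

Blocking a single address while allowing everyone else is only expressible in the NACL here, because the security group has no DENY rules.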
VPC Flow Logs
- Capture information about IP traffic going into your interfaces:
- VPC Flow Logs
- Subnet Flow Logs
- Elastic Network Interface Flow Logs
- Helps to monitor & troubleshoot connectivity issues
- Captures network information from AWS managed interfaces too: Elastic load balancers, ElastiCache, RDS, Aurora, etc...
- VPC Flow logs data can go to S3, CloudWatch Logs, and Kinesis Data Firehose
VPC Peering
- Connect two VPCs privately, using AWS's network
- Make them behave as if they were in the same network
- Must not have overlapping CIDR (IP address range)
- VPC Peering connection is not transitive (each new VPC must have this established manually before communication)
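Both constraints can be checked in a few lines with the standard `ipaddress` module. The sketch below uses hypothetical VPC names and CIDR ranges: `can_peer` tests for overlapping ranges, and `can_communicate` shows that a peering between A and B plus one between B and C does not connect A to C.

```python
import ipaddress


def can_peer(cidr_a, cidr_b):
    """Two VPCs can only be peered if their CIDR ranges do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Peering connections that have been set up explicitly.
peerings = {("A", "B"), ("B", "C")}


def can_communicate(vpc_x, vpc_y):
    """Non-transitive: only a direct peering counts, not a path through another VPC."""
    return (vpc_x, vpc_y) in peerings or (vpc_y, vpc_x) in peerings


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))   # True: ranges are separate
print(can_peer("10.0.0.0/16", "10.0.1.0/24"))   # False: second range sits inside the first
print(can_communicate("A", "C"))                # False: A-B and B-C do not give A-C
```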
VPC Endpoints
- Allow you to connect AWS services using a private network instead of the public www network
- Enhances security and lowers latency to AWS services
- VPC Endpoint Gateway: S3 & DynamoDB
- VPC Endpoint Interface: The rest
PrivateLink (VPC Endpoint Services Family)
- Secure and scalable way to expose a service to 1000's of VPC
- Does not require VPC peering, internet gateway, NAT, route tables
Site to Site VPN and Direct Connect
Site to Site VPN
- Connect an on-premise VPN to AWS
- The connection is automatically encrypted
- Goes over the public internet
- More:
- On-premises: Must use a Customer Gateway(CGW)
- AWS: Must use a Virtual Private Gateway (VGW)
Direct Connect (DX)
- Establish a physical connection between on-premises and AWS
- The connection is private, secure and fast
- Goes over a private network
- Takes at least a month to establish
Client VPN
- Connect from your computer using OpenVPN to your private network in AWS and on-premises
- Allows you to connect to your EC2 instances over a private IP (just as if you were in the private VPC network)
- Goes over public internet
Transit Gateway
- Useful for having transitive peering between thousands of VPC and on-premises
- Hub-and-spoke (star) connection
- One single gateway to provide this functionality
- Works with Direct Connect Gateway, VPN connections
Section 16: Security & Compliance
Shared Responsibility Model
AWS:
- Protecting infrastructure (hardware, software, facilities, and networking) that runs all the AWS services
- Managed services like S3, DynamoDB, RDS, etc.
Customer Responsibility - Security in the Cloud:
- For EC2 instances, the customer is responsible for management of the guest OS (including security patches and updates), firewall & network configuration, IAM
- Encrypting application data
Shared Controls:
- Patch management, Configuration management, awareness & training
DDoS
- Distributed Denial-of-Service
- Attacker 'spams' your application server with hundreds of bots
- The server becomes not accessible/responsive to legitimate users
Protections:
- AWS Shield Standard: Protects against DDoS attacks for your website and applications, for all customers at no additional cost
- Shield Advanced: 24/7 Premium DDoS protection
- AWS WAF: Filter specific requests based on rules
- CloudFront and Route53:
- Availability protection using global edge network
- Combined with AWS shield, provides attack mitigation at the edge
Shield
Standard
- Free service activated for each AWS customer
- Provides protection from attacks like SYN/UDP floods, reflection attacks and other layer 3/layer 4 attacks
Advanced
- Optional DDoS mitigation service ($3,000 per month, per organisation)
- Protect against more sophisticated attacks
- 24/7 access to AWS DDoS response team (DRT)
- Protect against higher fees during usage spikes due to DDoS
WAF - Web Application Firewall
- Protects your web apps from common web exploits (layer 7)
- Layer 7 is HTTP
- Deploy on Application Load Balancer, API gateway, CloudFront
- Define Web ACL (Access Control List)
- Rules can include IP addresses, HTTP headers, HTTP body, or URI strings
- Protects from common attacks: SQL injection and Cross-Site Scripting (XSS)
- Size constraints, geo-match (block countries)
- Rate-based rules - for DDoS protection
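The rate-based rule idea can be sketched locally as a sliding-window counter per client IP. This is only an illustration of the concept, not how AWS WAF is implemented; the class name, the limit, and the window length are all assumptions made for the example.

```python
from collections import defaultdict, deque


class RateBasedRule:
    """Block a client IP once it exceeds `limit` requests within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit: block this request (it is not recorded)
        hits.append(now)
        return True
```

With `RateBasedRule(limit=3, window_seconds=60)`, a fourth request from the same IP inside the window is rejected, and requests are accepted again once older hits age out. Other IPs are counted separately.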
Network Firewall
- Protect entire VPC
- Protection from layer 3 to layer 7
- Can inspect traffic in any direction:
- VPC to VPC traffic
- Outbound to internet
- Inbound from internet
- To/from direct connect & site-to-site VPN
Firewall Manager
- Manage security rules in all accounts of AWS organisation
- Security policy: Common set of security rules
- VPC security groups for EC2, Application Load Balancer, etc...
- WAF Rules
- Shield Advanced
- Network Firewall
- Rules applied to new and future accounts in organisation, to new resources as they are created
You are not allowed to perform a DDoS attack on your own or others' architecture
Data at Rest vs. Data in Transit
Rest:
- Stored or archived on a device
- Hard disk, RDS instance, S3 Glacier Deep archive etc
Transit:
- Data being moved from one location to another
- Transfer from on-premises to AWS, EC2 to DynamoDB etc.
- Data transferred on the network
We want to encrypt data in both states to protect it. We can use encryption keys.
KMS (Key Management Service)
- When you hear encryption for an AWS service, it is likely KMS
- AWS manages the encryption keys for us
- Encryption Opt-in:
- EBS volumes
- S3 Buckets
- Redshift database
- RDS database
- EFS drives
- Encryption Automatically Enabled:
- CloudTrail Logs
- S3 Glacier
- Storage Gateway
CloudHSM
- AWS provisions encryption hardware, user manages key
- Dedicated hardware (Hardware security module)
Types of KMS Keys
Customer Managed Key
- Created, managed, and used by the customer, can enable or disable
- Possibility of rotation policy
- Able to bring-your-own-key
AWS Managed Key
- Created, managed, and used on the customer's behalf by AWS
- Used by AWS services
AWS Owned Key
- Collection of CMKs that an AWS service owns and manages to use in multiple accounts
- AWS can use these to protect resources
CloudHSM Keys (Custom Keystore)
- Keys generated from your own CloudHSM device
- Cryptographic operations performed within the CloudHSM cluster
AWS Certificate Manager (ACM)
- Provision, manage, deploy SSL/TLS certificates
- Used to provide in-flight encryption for websites (HTTPS)
- Supports both public and private TLS certificates
- Automatic TLS certificate renewal
Secrets Manager
- Meant for storing secrets
- Capability to force rotation of secrets
- Automatic generation on rotation (through Lambda)
- Encrypted using KMS
- Mostly meant for RDS integration
AWS Artifact
- Portal to retrieve AWS compliance docs/AWS agreements
- Can be used to support internal audit or compliance
GuardDuty
- Intelligent Threat discovery to protect your AWS account
- Uses ML algorithms, anomaly detection and 3rd party data
- Wide spread of input data
- Can set up EventBridge rules to be notified in case of findings
- Can protect against CryptoCurrency attacks (has a dedicated 'finding' for it)
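An EventBridge rule selects events with a JSON pattern in which each field lists the values it accepts. The sketch below shows a pattern for GuardDuty findings together with a deliberately simplified local matcher (exact values only); the matcher is a stand-in written for illustration, not EventBridge's matching engine, and the sample events in the usage note are hand-written.

```python
GUARDDUTY_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
}


def matches(pattern: dict, event: dict) -> bool:
    """Return True if every pattern field lists the event's value (exact match only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding nested event object.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True
```

An event with `source` set to `aws.guardduty` and `detail-type` set to `GuardDuty Finding` matches; an event from another source does not. In a real setup the rule would forward matches to a target such as an SNS topic.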
Inspector
- Automated security assessments
- For EC2 instances
- Leveraging the AWS Systems Manager (SSM) agent
- Analyze against unintended network accessibility
- Analyze the running OS against known vulnerabilities
- For Container Images pushed to Amazon ECR
- Assessment of Container Images as they are pushed
- For Lambda Functions
- Identifies software vulnerabilities in function code and package dependencies
- Assessment of functions as they are deployed
- Reporting & integration with AWS Security Hub
- Send findings to Amazon EventBridge
Continuous scanning of infrastructure, only when needed
AWS Config
- Helps with auditing and recording compliance of AWS resources
- Can store configuration data into S3 (analysed by Athena)
- Questions that Config solves:
- Is there unrestricted SSH access to my security groups?
- Do my buckets have public access?
- How has my ALB config changed over time?
- You can receive alerts (SNS notifications) for any changes
- Per-region service
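The first question above ("is there unrestricted SSH access?") can be sketched as a small check over a security group description. The dictionary shape is assumed to follow the EC2 `DescribeSecurityGroups` response (`IpPermissions`, `FromPort`, `ToPort`, `IpRanges` with `CidrIp`); the function name is hypothetical and IPv6 ranges are ignored to keep the example short.

```python
def has_unrestricted_ssh(security_group: dict) -> bool:
    """Return True if any inbound rule opens port 22 to 0.0.0.0/0."""
    for rule in security_group.get("IpPermissions", []):
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            covers_ssh = True  # "-1" means all protocols and all ports
        elif protocol == "tcp":
            covers_ssh = rule.get("FromPort", 0) <= 22 <= rule.get("ToPort", 65535)
        else:
            covers_ssh = False
        open_to_world = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])
        )
        if covers_ssh and open_to_world:
            return True
    return False
```

A check like this is the kind of rule AWS Config evaluates continuously, recording which resources are compliant and which are not.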
Macie
- Managed data security/privacy services, uses ML and pattern matching to discover/protect your sensitive data
- Helps identify and alert you to sensitive data, personally identifiable information (PII)
Security Hub
- Central security tool, across several AWS accounts, automates security checks
- Integrated dashboards showing status, allows quick action
- Must first enable AWS Config Service
- Aggregates all alarms etc. into one location (from Config, Macie, GuardDuty, etc.)
Detective
- Sometimes security findings require a deeper analysis to isolate root cause
- Detective can do this for us (using ML and graphs)
- Automatically collect/process events from VPC flow logs, CloudTrail, GuardDuty, and create unified view
AWS Abuse
- Report suspected AWS resources used for abusive or illegal purposes
- Could be
- Spam
- Port scanning
- DoS/DDoS
- Intrusion attempts
- Distributing malware
- Copyright content hosting
Root User Privileges
- Lock AWS account root user access keys
- Change account settings
- Close AWS account
- Change/cancel AWS support plan
- Register as a seller in the Reserved Instance Marketplace
Access Analyser
- Find which resources are shared externally
- Define Zone of Trust = AWS Account or AWS organisation
- Access outside zone of trust => findings
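The zone-of-trust idea can be sketched as a set-membership check. The input here is heavily simplified (a mapping from resource name to the account IDs it grants access to) and every name and account ID is a made-up placeholder; the real service analyses resource policies rather than a prepared mapping.

```python
def external_access_findings(resource_access: dict, zone_of_trust: set) -> list:
    """List (resource, account) pairs where access is granted outside the zone of trust."""
    findings = []
    for resource, allowed_accounts in resource_access.items():
        for account in allowed_accounts:
            if account not in zone_of_trust:
                findings.append((resource, account))
    return findings


# Hypothetical data: two accounts form the zone of trust.
ZONE_OF_TRUST = {"111111111111", "222222222222"}
RESOURCE_ACCESS = {
    "s3://reports": ["111111111111", "999999999999"],
    "kms-key/app": ["222222222222"],
}
```

Here only the bucket shared with account `999999999999` produces a finding, because that account sits outside the zone of trust.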
Section 17: Machine Learning
Rekognition
- Find objects, people, text, scenes in images and videos using ML
- Facial analysis/search to do user verifications, people counting
- Create a DB of 'familiar faces'
- Use cases:
- Labeling
- Content moderation
- Text detection
- Face detection/analysis
- Face search and verification
- Celebrity recognition
Transcribe
- Converts speech to text
- Uses ASR (automatic speech recognition) to convert
- Automatically remove Personally Identifiable Information (PII) using redaction
- Supports automatic language identification for multi-lingual audio
Polly
- Turn text into lifelike speech using deep learning
- Create applications that talk
Translate
- Natural and accurate language translation
- Can localise content, like websites, for international users
Lex & Connect
Amazon Lex (same tech as Alexa)
- Automatic speech recognition (ASR) to text
- Natural language understanding to recognise intent
- For chatbots or call center bots
Amazon Connect
- Receive calls, create contact flows, cloud-based virtual contact center
- Can integrate with other CRM systems or AWS
- No upfront payments, 80% cheaper than traditional contact center solutions
Comprehend
- For natural language processing - NLP
- Fully managed/serverless
- Uses machine learning to find insights and relationships in text
- Detects the language of the text
- Extracts key phrases, places, people, brands or events
- Understands how positive or negative the text is
- Analyses text using tokenisation and parts of speech
- Automatically organises a collection of text files by topic
- Could be used for customer interactions (emails), to find what makes a positive/negative experience
- Or create/group articles by topics which are uncovered
SageMaker
- Managed service for developers/data scientists to build ML models
- Typically difficult to do all processes in one place, you also need to provision servers
- Example: ML to predict exam score
- Get as much data as possible (historical like IT experience, time spent on course, practice exams etc)
- Label all the data
- Build ML model, predict scores from historical data
- Train ML model
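The exam-score example above (gather data, label it, build a model, train it) can be illustrated with a tiny least-squares fit in plain Python. This does not use SageMaker at all; it is a local sketch of the workflow, and the data points are invented for the example.

```python
def train(hours: list, scores: list) -> tuple:
    """Fit score = slope * hours + intercept by ordinary least squares."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(scores) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
    variance = sum((x - mean_x) ** 2 for x in hours)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model: tuple, hours: float) -> float:
    """Apply the trained (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * hours + intercept


# Made-up labelled data: hours spent on the course -> exam score.
HOURS = [10, 20, 30, 40]
SCORES = [600, 650, 700, 750]
MODEL = train(HOURS, SCORES)
```

Calling `predict(MODEL, 50)` then estimates the score for a student who studied 50 hours. A managed service adds the parts this sketch leaves out: provisioning servers, handling large datasets, and hosting the trained model.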
Forecast
- Managed service, uses ML to deliver accurate forecasts
- Example: Predict future sales of a product
- 50% more accurate than simply looking at the data
Kendra
- Managed document search service, uses ML
- Extract answers from within a document (various file formats)
- Natural language search capabilities
- Learn from user interactions/feedback to promote preferred results (Incremental learning)
- Ability to manually fine-tune search results
Personalise
- Managed ML, to build apps with real-time personalised recommendations
- Same tech used by amazon.com
- Example: Personalised product recommendations on an e-commerce site
- Implement in days, not months
Textract
- Extracts text, handwriting, and data from any scanned documents using AI and ML
- Extract data from forms and tables
- Read and process any type of documents
- Use cases:
- Financial services
- Healthcare
- Public sector
Section 18: Account Management, Billing & Support
Organisations
- Manage multiple AWS Accounts
- Main account is the master account
- Cost Benefits:
- Consolidated billing (only pay from one place)
- Pricing benefits from aggregated usage (volume discount for EC2, S3 ... )
- Pooling of reserved EC2 instances for optimal savings
- API is available to automate AWS account creation (AWS sandbox account for example)
- Restrict account privileges using Service Control Policies (SCP)
Multi Account
- Create accounts per department, prod/test/dev, restrictions etc
- Multi account vs One Account Multi VPC
- Enable CloudTrail on all accounts, with logs sent to a central S3 account
- Same with CloudWatch
Service Control Policies (SCP)
- Whitelist/blacklist IAM actions
- Applied at organisational unit or account level
- Does not affect service-linked roles
- SCP must have an explicit Allow (doesn't allow anything by default)
- Does not apply to master account
- Use cases:
- Restrict access to certain services
- Enforce PCI compliance by explicitly disabling services
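The SCP rules above (explicit Allow required, explicit Deny wins, master account exempt) can be sketched as a small evaluator. This is a simplified model written for illustration: each entry in `scps` stands for the combined policy at one level (root, OU, account), `Action` is assumed to always be a list, and conditions and resources are ignored. The function and constant names are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(action: str, statement: dict) -> bool:
    """True if the action matches any wildcard pattern in the statement."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in statement["Action"])


def scp_permits(action: str, scps: list, is_management_account: bool = False) -> bool:
    """Simplified SCP check: no explicit Deny anywhere, explicit Allow at every level."""
    if is_management_account:
        return True  # SCPs do not apply to the management (master) account
    if not scps:
        return False  # nothing is allowed without an explicit Allow
    for scp in scps:
        statements = scp["Statement"]
        if any(s["Effect"] == "Deny" and _matches(action, s) for s in statements):
            return False
        if not any(s["Effect"] == "Allow" and _matches(action, s) for s in statements):
            return False
    return True


FULL_ACCESS = {"Statement": [{"Effect": "Allow", "Action": ["*"]}]}
DENY_DYNAMODB = {
    "Statement": [
        {"Effect": "Allow", "Action": ["*"]},
        {"Effect": "Deny", "Action": ["dynamodb:*"]},
    ]
}
```

With `FULL_ACCESS` at the root and `DENY_DYNAMODB` on an OU, `s3:GetObject` is permitted while `dynamodb:PutItem` is blocked. Note that an SCP only sets the ceiling: the IAM policies inside the account still have to grant the action.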
Consolidated Billing
Combined Usage:
- Share volume pricing
- Reserved Instances and Savings Plans Discounts
- One combined bill for all the different accounts
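A short worked example shows why sharing volume pricing saves money. The price tiers below are invented for the illustration (they are not real AWS prices), and amounts are kept in whole cents to avoid rounding issues.

```python
def tiered_cost_cents(units: int, tiers: list) -> int:
    """Cost of `units` under volume pricing; `tiers` is a list of (tier_size, cents_per_unit).

    A tier_size of None means the tier has no upper bound.
    """
    cost = 0
    remaining = units
    for tier_size, cents_per_unit in tiers:
        used = remaining if tier_size is None else min(remaining, tier_size)
        cost += used * cents_per_unit
        remaining -= used
        if remaining == 0:
            break
    return cost


# Hypothetical price list: first 100 units at 10 cents, everything above at 8 cents.
TIERS = [(100, 10), (None, 8)]

separate = tiered_cost_cents(80, TIERS) * 2   # two accounts billed on their own
combined = tiered_cost_cents(160, TIERS)      # the same usage billed together
```

Two accounts using 80 units each never leave the first tier when billed separately, whereas the combined 160 units push 60 units into the cheaper tier, so `combined` comes out lower than `separate`.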
Control Tower
- Easy way to set up and govern a secure/compliant multi-account AWS environment
- Benefits:
- Automate environment setup quickly
- Automate ongoing policy management
- Detect policy violations
- Monitor compliance
- Runs on top of AWS organisations
Resource Access Manager (RAM)
- Share resources owned by your account, with other AWS accounts
- With any account, or within your organisation
- Avoids resource duplication
Service Catalog
- AWS services are over-populated, new users will get overwhelmed
- Some need a self-service portal to launch a set of authorised products, pre-defined by admins
- Can use catalog for this. Essentially CloudFormation templates
Pricing Models
Pay as you go: Pay for what you use, remain agile, responsive, meet scale demands
Save When You Reserve: Minimise risk, predictably manage budgets, comply with long-term requirements. (Reservations for EC2 for example)
Pay Less By Using More: Volume-based discounts
Pay Less as AWS Grows: Economies of scale
Free Services/Tier
Compute Optimiser
- Reduce costs and improve performance by suggesting AWS resources for your workloads
- Helps to choose optimal configurations
- Uses ML to analyse resource configurations and their utilisation CloudWatch metrics
- Can lower costs up to 25%
Pricing Calculator
Enables you to estimate the cost of your solution architecture.
Tagging and Resource Groups
Used for organising resources:
- EC2: Instances, images, load balancers, security groups
- RDS, VPC resources, Route 53, IAM users, etc
- Resources created by CloudFormation are all tagged the same way
Free naming. Tags can be used to create Resource Groups
- Create, maintain and view a collection of resources which share common tags
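Grouping by tags amounts to a filter over resources. The sketch below models that with plain dictionaries; the resource IDs, tag keys, and the function name are all hypothetical.

```python
def resource_group(resources: list, required_tags: dict) -> list:
    """Return the IDs of resources that carry every tag in `required_tags`."""
    return [
        r["id"]
        for r in resources
        if all(r.get("tags", {}).get(k) == v for k, v in required_tags.items())
    ]


# Hypothetical inventory.
RESOURCES = [
    {"id": "i-web-1", "tags": {"Environment": "prod", "Team": "web"}},
    {"id": "i-web-2", "tags": {"Environment": "dev", "Team": "web"}},
    {"id": "db-orders", "tags": {"Environment": "prod", "Team": "data"}},
    {"id": "vol-untagged"},
]
```

Asking for `{"Environment": "prod"}` returns the production instance and database, while untagged resources never appear in any group, which is one reason consistent tagging matters.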
Cost and Usage Reports
- Dive deeper into your AWS costs and usage
- This report provides the most comprehensive set of data available
- Can be integrated with Athena or Redshift
Cost Explorer
- Visualise, understand, and manage AWS costs/usage over time
- Create custom reports which analyse cost and usage data
- Choose an optimal savings plan
- Forecast usage up to 12 months based on previous usage!!
Billing Alarms
- Data metric stored in CloudWatch us-east-1
- For overall worldwide AWS costs
- For actual cost, not projected
Budgets
- Create budgets to send alarms when costs exceed the budget
- 4 Types: Usage, Cost, Reservation, Savings Plans
- Up to 5 SNS notifications per budget
- 2 budgets are free, then $0.02 per day/budget
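The pricing rule quoted above is easy to turn into arithmetic. This sketch uses only the two figures from these notes (2 free budgets, $0.02 per budget per day afterwards) and works in whole cents; current AWS pricing may differ, so treat the constants as assumptions.

```python
FREE_BUDGETS = 2
CENTS_PER_BUDGET_PER_DAY = 2  # $0.02, as quoted in the notes above


def budgets_cost_cents(budget_count: int, days: int) -> int:
    """Charge for AWS Budgets over `days`, in cents, after the free allowance."""
    billable = max(0, budget_count - FREE_BUDGETS)
    return billable * CENTS_PER_BUDGET_PER_DAY * days
```

For example, 5 budgets over a 30-day month leave 3 billable budgets, so the charge is 3 x 2 x 30 = 180 cents ($1.80).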
Cost Anomaly Detection
- Continuously monitor costs and usage using ML to detect unusual spends
- Learns unique history, spending patterns etc
- Sends detection report with root-cause analysis
Service Quotas
- Notify you when you're close to a threshold
- Example: Lambda concurrent executions
- Create CloudWatch alarms on the service quotas console
Trusted Advisor
- High level account assessment
- Analyse and provide recommendations on 6 categories:
- Cost optimisation
- Performance
- Security
- Fault Tolerance
- Service Limits
- Operational Excellence
Business & Enterprise Support Plan
- Full set of checks
- Programmatic access using AWS support API
Basic Support Plan
- 24x7 access to customer service, documentation, whitepapers and support forums
- AWS Trusted Advisor
- AWS Personal Health Dashboard
Developer Support Plan
- Business hours email access to Cloud Support Associates
- Unlimited cases / 1 primary contact
- Responses under 24 hours
Business Support Plan
- Full set of checks for Trusted Advisor
- 24x7 phone, email and chat access to Cloud Support Engineers
- Unlimited cases/unlimited contacts
- Access to Infrastructure Event Management (additional fee)
- Faster responses
Enterprise On-Ramp Support Plan
- Access to pool of Technical Account Managers (TAM)
- Concierge Support Team
- Fast response
Enterprise Support Plan
- Designated Technical Account Manager (TAM)
- 15 minutes response if business critical system is down
Section 19: Advanced Identity
STS (Security Token Service)
- Create temporary, limited-privilege credentials to access AWS resource
- Short-term credentials, you define the expiration period
- Use Cases:
- Identity federation: Manage user identities in external systems, and provide them with STS tokens to access AWS resources
- IAM Roles for cross/same account access
- IAM Roles for EC2: For Instances to access AWS resources
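For the cross-account use case, the role being assumed carries a trust policy that names who may call `sts:AssumeRole` on it. The sketch below builds such a policy document as a Python dictionary; the helper name is hypothetical and `111122223333` is a placeholder account ID.

```python
import json


def cross_account_trust_policy(trusted_account_id: str) -> dict:
    """Trust policy letting principals in another account call sts:AssumeRole on this role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(cross_account_trust_policy("111122223333"), indent=2))
```

Once the role trusts the other account, a principal there asks STS to assume it and receives temporary credentials that expire after the configured period.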
Cognito
- Web/Mobile application user identity
- You don't want to create an IAM user, but instead a user in Cognito
- Mobile/web applications then have integrated login to Cognito for authentication
Microsoft Active Directory (AD)
- On any Windows server with AD domain services
- Database of objects: User accounts, Computers, Printers, File Shares, Security Groups
- Centralised security management, create account, assign permissions
- AWS Directory Services
IAM Identity Center (Successor to Single Sign-on)
- One login (SSO) for all:
- AWS Accounts in AWS organisations
- Business cloud applications
- EC2 Windows instances
- Identity Providers
- Built-in identity store
- Or 3rd party
Section 20: Other Services
WorkSpaces
- Managed Desktop as a Service (DaaS) to provision Windows/Linux desktops
- Great to eliminate management of on-premise VDI (Virtual Desktop Infrastructure)
- Pay as you go, secure, fast and scalable
- Multiple Regions - place the workspace close to the user for low latency
Appstream 2.0
- Desktop Application Streaming Service
- Deliver to any computer, without acquiring or provisioning infrastructure
- Application delivered from within a web browser
- One application at a time
IoT Core
- Connect IoT devices to AWS cloud
- Serverless, secure & scalable
- Applications can communicate with devices even when not connected (Pub/Sub)
- Integrates with multiple AWS services
AppSync
- Store/sync data across mobile and web applications in real-time
- Uses GraphQL
- Client Code can be automatically generated
- Integrations with DynamoDB/Lambda
- Real-time subscriptions
- Offline data sync
- Fine Grained Security
Amplify
- Set of tools/services to help you develop and deploy scalable full stack web/mobile applications
- Authentication, storage, API, CI/CD, analytics, etc...
Application Composer
- Visually design/build serverless applications quickly
- Deploy infrastructure code without needing to be an AWS expert
- Configure how resources interact with each other
- Import/export CloudFormation templates
Device Farm
- Test application against desktop browsers, real mobile devices and tablets (not emulators)
- Run tests concurrently on multiple devices
- Configure device settings
Backup
- Manage and automate backups across different AWS services
- On-demand/scheduled backups
- PITR (Point-in-time-recovery)
Disaster Recovery Strategies
- Backup and restore is the cheapest
- ^ All we need to know for exam
Elastic Disaster Recovery (DRS)
- Quickly and easily recover your physical, virtual, and cloud-based server into AWS
- Example: Protect your most critical databases, enterprise apps, and protect data from ransomware attacks
- Continuous block-level replication for your servers
DataSync
- Move data from on-premise to AWS
- Replication tasks can be scheduled hourly, daily, weekly
- The replication tasks are incremental after the first full load
Cloud Migration Strategies: 7 R's
Retire
- Turn off things you don't need
- Save costs
Retain
- Do nothing for now (don't migrate)
Relocate
- Move app from on-premises to Cloud version
- Move EC2 instance to different VPC or region
Rehost "Lift and Shift"
- Simple migrations by re-hosting on AWS
- Migrate machines to Cloud
- No cloud optimisations, applications migrated 'as is'
Replatform "Lift and Reshape"
- Not changing core architecture, but leverage Cloud optimisations
- Save time/money by moving to a fully managed service or serverless
Repurchase "Drop and Shop"
- Move to different product while migrating to cloud
- Often you move to SaaS platform
- Expensive in the short term, but quick to deploy
Refactor/Re-Architect
- Reimagine how application is architected
- Driven by need to add/change features
- Move from monolithic application to micro-services
Application Discovery Service
- Plan migration projects by getting info of on-premise data centers
- Server utilisation is important
Agentless Discovery (Agentless Discovery Connector)
- VM inventory, configuration, and performance history
Agent-Based Discovery (Application Discovery Agent)
- System configuration, system performance, running processes, and details of the network connections between systems
Application Migration Service (MGN)
- Lift-and-Shift (rehost) solution, simplifies migration
- Converts physical, virtual, and cloud-based servers to run natively on AWS
- Supports a wide range of platforms, operating systems, and databases
Migration Evaluator
- Build data-driven business case for migration to AWS
- Snapshot of on-premises foot-print, server dependencies
- Analyse current state, define target state, then develop migration plan
Migration Hub
- Central location to collect servers and applications inventory data for assessment
- AWS Migration Hub Orchestrator - Provides pre-built templates to save time and effort migrating enterprise apps
- Supports migrations status updates from Application Migration Service (MGN) and Database Migration Service (DMS)
Fault Injection Simulator
- Based on Chaos Engineering - purposely stressing an application in a controlled environment and observing how a system responds
- Uncover hidden bugs and performance bottlenecks
- Use pre-built templates to generate desired disruptions
Step Functions
- Build serverless visual workflow to orchestrate Lambda functions
- Features: Sequence, parallel, conditions, timeouts, error handling
- Possibility of implementing human approval feature
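Workflows are defined in the Amazon States Language, a JSON format. The sketch below writes a small definition as a Python dictionary covering several of the features listed above (sequence, a condition, a timeout, retry and error handling), plus a helper that checks every transition points at a defined state. The workflow, the state names, and the Lambda ARNs are all made-up placeholders, and the approval step is only a stand-in for a real human approval integration.

```python
import json

LAMBDA_PREFIX = "arn:aws:lambda:us-east-1:111122223333:function:"  # placeholder ARN prefix

STATE_MACHINE = {
    "Comment": "Hypothetical order workflow",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": LAMBDA_PREFIX + "validate-order",
            "TimeoutSeconds": 30,
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "Next": "IsLargeOrder",
        },
        "IsLargeOrder": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.total", "NumericGreaterThan": 1000, "Next": "ManualApproval"}
            ],
            "Default": "ProcessOrder",
        },
        "ManualApproval": {
            "Type": "Task",
            "Resource": LAMBDA_PREFIX + "request-approval",
            "Next": "ProcessOrder",
        },
        "ProcessOrder": {"Type": "Task", "Resource": LAMBDA_PREFIX + "process-order", "End": True},
        "OrderFailed": {"Type": "Fail", "Error": "OrderFailed", "Cause": "Validation failed"},
    },
}


def undefined_targets(definition: dict) -> set:
    """Return state names that are referenced by a transition but never defined."""
    referenced = {definition["StartAt"]}
    for state in definition["States"].values():
        for key in ("Next", "Default"):
            if key in state:
                referenced.add(state[key])
        for rule in state.get("Choices", []) + state.get("Catch", []):
            referenced.add(rule["Next"])
    return referenced - set(definition["States"])


DEFINITION_JSON = json.dumps(STATE_MACHINE, indent=2)  # the form a state machine is created from
```

Parallel branches are left out to keep the example short; they would be expressed with a `Parallel` state containing a list of `Branches`.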
Ground Station
- Control satellite communications, process data, and scale your satellite operations
- Global network of satellite ground stations near AWS regions
- Download satellite data to VPC, to be sent to S3 or EC2
Pinpoint
- 2-way marketing communications service (inbound/outbound)
- Supports email, SMS, push, voice, in-app messaging
- Ability to segment and personalise messages
- Can receive replies
Section 21: AWS Architecting, Ecosystem
Well Architected Framework General Guiding Principles
- Stop guessing capacity needs
- Test systems at production scale
- Automate, to make architectural experimentation easier
- Allow for evolutionary architectures
- Design based on changing requirements
- Drive architectures using data
Cloud Best Practices
Design Principles
- Scalability: Vertical and horizontal
- Disposable Resources: Servers should be disposable and easily configured
- Automation: Serverless, infrastructure as a Service, Auto scaling...
- Loose Coupling:
- Monoliths are applications that do more and more over time (scope creep?)
- Break it down into smaller more loosely coupled components
- Services, not Servers
- Don't use just EC2
- Use managed services, databases, serverless, etc
Well architected framework pillars
- Operational Excellence
- Security
- Reliability
- Performance Efficiency
- Cost Optimisation
- Sustainability
Operational Excellence
- The ability to run and monitor systems to deliver business value and to continually improve supporting processes and procedures
- Design Principles:
- Perform operations as code - Infrastructure as code
- Make frequent, but small and reversible changes
- Refine operations over time
- Use managed services
Security
- The ability to protect information, assets, and systems whilst delivering business value through risk assessments and mitigation strategies.
- Design Principles:
- Implement a strong identity foundation (IAM)
- Enable tracing (logs and metrics for systems)
- Apply security at all layers
- Automate security best practices
- Protect data in transit and at rest
- Keep people away from data
- Prepare for security events
Reliability
- Ability to recover from disruptions, acquire computing resources dynamically, and mitigate disruptions
- Design Principles:
- Test recovery procedures
- Automatically recover from failure
- Scale horizontally to increase aggregate system availability
- Stop guessing capacity - use auto scaling
- Manage change in automation
Performance Efficiency
- The ability to use computing resources efficiently to meet requirements, and maintain efficiency as demand changes
- Design Principles:
- Go global quickly
- Use serverless architecture
- Experiment more often
- Mechanical sympathy - be aware of all AWS services
Cost Optimisation
- Run systems, but at the lowest price point
- Design Principles:
- Adopt a consumption model - pay only for what you use
- Measure efficiency
- Analyse and attribute expenditure - Use tags
- Use managed and application level services to reduce cost of ownership
Sustainability
- Focus on minimising the environmental impacts of running cloud workloads
- Design Principles:
- Understand your impact
- Establish sustainability goals
- Maximise utilisation
- Use Managed services
- Reduce downstream impact of cloud workloads
AWS Well-Architected Tool
- Free tool. Review your architecture against the 6 pillars previously mentioned.
- How does it work?
- Select workload and answer questions
- Review your answers against the 6 pillars
- Obtain advice: Get videos and documentation, generate a report, see results in a dashboard
Customer Carbon Footprint Tool
- Track, measure, review and forecast the carbon emissions from your AWS usage
- Helps meet sustainability goals
Cloud Adoption Framework (CAF)
- A whitepaper (book)
- Helps build and execute a comprehensive plan for digital transformation with AWS
- Created by AWS professionals
- CAF identifies specific organisational capabilities that underpin successful cloud transformations
- CAF groups its capabilities in six perspectives.
- Perspectives (Need for Exam!):
- Business
- People
- Governance
- Platform
- Security
- Operations
CAF Perspectives and Foundational Capabilities - Business Capabilities
Business Perspective: Helps to ensure that your cloud investments accelerate your digital transformation ambitions and business outcomes.
People Perspective: Serves as a bridge between technology and business, accelerating the cloud journey to help organisations more rapidly evolve to a culture of continuous growth, learning, and where change becomes business-as-normal, with focus on culture, organisation structure, leadership, and workforce.
Governance Perspective: Helps you orchestrate your cloud initiatives while maximising organisational benefits and minimizing transformation-related risks.
Business:
- Strategy management
- Portfolio management
- Innovation Management
- Product management
- Strategic Partnership
- Data Monetisation
- Business Insight
- Data science
People:
- Culture Evolution
- Transformational Leadership
- Cloud Fluency
- Workforce Transformation
- Change Acceleration
- Organisation Design
- Organisational Alignment
Governance
- Program and Project Management
- Benefits Management
- Risk Management
- Cloud Financial Management
- Application Portfolio Management
- Data governance
- Data curation
CAF Perspectives and Foundational Capabilities - Technical Capabilities
Platform Perspective: Helps you build an enterprise-grade, scalable, hybrid cloud platform; modernise existing workloads; and implement new cloud-native solutions.
Security Perspective: Helps you achieve the confidentiality, integrity, and availability of your data and cloud networks.
Operations Perspective: Helps ensure that your cloud services are delivered at a level that meets the needs of your business.
Platform:
- Platform Architecture
- Data Architecture
- Platform Engineering
- Data engineering
- Provisioning and orchestration
- Modern application development
- Continuous integration and continuous delivery
Security:
- Security governance
- Security assurance
- Identity and access management
- Threat detection
- Vulnerability management
- Infrastructure protection
- Data protection
- Application security
- Incident response
Operations
- Observability
- Event management (AIOps)
- Incident and problem management
- Change and release management
- Performance and capacity management
- Configuration management
- Patch management
- Availability and continuity
- Application management
CAF - Transformation Domains
- Technology: Use cloud to migrate/modernise legacy infrastructure, data and analytics platforms...
- Process: Digitising, automating, and optimising business operations
- Organisation: Re-imagining your operating model
- Product: Re-imagining your business model by creating new value propositions (products & services) and revenue models
CAF - Transformation Phases
- Envision: Demonstrate how the Cloud will accelerate business outcomes
- Align: Identify capability gaps in 6 CAF perspectives, results in action plan
- Launch: Build/deliver pilot initiatives in production and demonstrate incremental business value
- Scale: Expand pilot initiatives to the desired scale while realising the desired business benefits
Right Sizing
- EC2 has many instance types, the most powerful isn't always the best choice
- Scaling up is easy, so start small and scale up when needed
- Right sizing is the process of matching instance types and sizes to workload performance and capacity requirements
- Do it before cloud migration, and then continuously after (once a month for example), as requirements change
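Right sizing can be sketched as "pick the smallest instance that covers the measured need plus some headroom". The catalogue below is illustrative only (a handful of t3 sizes listed as examples, which should be verified against the current instance tables), and the 20% headroom and the function name are assumptions for the example.

```python
# Illustrative catalogue: (name, vCPUs, memory in GiB), ordered from smallest to largest.
CATALOGUE = [
    ("t3.micro", 2, 1),
    ("t3.small", 2, 2),
    ("t3.medium", 2, 4),
    ("t3.large", 2, 8),
    ("t3.xlarge", 4, 16),
]


def right_size(required_vcpus: float, required_memory_gib: float, headroom: float = 1.2):
    """Return the smallest catalogue entry that covers the requirement plus headroom.

    Returns None when nothing in the catalogue is large enough.
    """
    for name, vcpus, memory_gib in CATALOGUE:
        if vcpus >= required_vcpus * headroom and memory_gib >= required_memory_gib * headroom:
            return name
    return None
```

A workload measured at 1 vCPU and 3 GiB lands on `t3.medium` rather than the largest type. Re-running the check with fresh utilisation numbers each month is one way to follow the "continuously after" advice above.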
AWS Ecosystem - Training
- AWS Digital (online) and classroom training (in-person or virtual)
- AWS Private Training (for organisation)
- Training and Certification for the U.S. Government
- Training and Certification for the Enterprise
- AWS Academy: Helps universities teach AWS
AWS IQ
- Quickly find professional help for AWS project
- Engage and Pay AWS certified 3rd party experts for on-demand work
- Video conferencing, contract management, collaboration etc
AWS re:Post
- Community forum
- Q&A service, offers crowd-sourced and expert-reviewed answers to your technical questions
- Free
- Is not intended to be used for questions which are time-sensitive or involve proprietary information
Knowledge Centre:
- Contains the most frequent and common questions and requests.
AWS Managed Services (AMS)
- Team of people who provide infrastructure and application support on AWS
- Security, reliability and availability is the focus
- Help organisations offload routine management tasks and focus on their business objectives
- Available 24/7/365