Graduate Program KB

EC2 Instance Storage

  • Elastic Block Store (EBS) Volume: A network drive you can attach to instances to allow them to persist data

    • Data persists even after termination
    • Bound to a specific AZ (volume must be snapshotted first before moving to another AZ)
    • Mounted to one instance at a time
    • Free tier: 30 GB of EBS storage per month (either General Purpose (SSD) or Magnetic)
    • Network drives communicate with the instance over a network, so there can be some latency
    • You set a provisioned capacity (size in GB and IOPS) and are billed for that provisioned capacity. The drive capacity can be increased over time
  • Delete on Termination attribute (EC2 --> Instances --> Storage):

    • Controls EBS behaviour when an EC2 instance terminates
      • Root EBS volume is deleted by default (attribute automatically enabled)
      • Other attached EBS volumes are not deleted by default (attribute automatically disabled)
    • Controllable through AWS console / CLI
    • Useful scenario: Preserve root volume when instance is terminated
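
The default behaviour described above can be sketched in a few lines of Python. This is a minimal model rather than AWS code: the `Volume` class and the `surviving_volumes` function are hypothetical names used for illustration only. The `delete_on_termination_override` helper builds keyword arguments in the `BlockDeviceMappings` shape that boto3's `modify_instance_attribute` call accepts; the device names are assumptions and depend on the AMI in use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    """Hypothetical model of an EBS volume attached to a single instance."""
    volume_id: str
    device_name: str
    is_root: bool
    # None means "fall back to the default for this kind of volume".
    delete_on_termination: Optional[bool] = None

    def effective_delete_on_termination(self) -> bool:
        if self.delete_on_termination is not None:
            return self.delete_on_termination
        # Default: enabled for the root volume, disabled for other volumes.
        return self.is_root


def surviving_volumes(volumes: List[Volume]) -> List[str]:
    """Return the IDs of the volumes that remain after the instance terminates."""
    return [v.volume_id for v in volumes if not v.effective_delete_on_termination()]


def delete_on_termination_override(device_name: str, enabled: bool) -> dict:
    """Build keyword arguments shaped for boto3's ec2.modify_instance_attribute."""
    return {
        "BlockDeviceMappings": [
            {"DeviceName": device_name, "Ebs": {"DeleteOnTermination": enabled}}
        ]
    }


volumes = [
    Volume("vol-root", "/dev/xvda", is_root=True),   # assumed device names
    Volume("vol-data", "/dev/sdf", is_root=False),
]
print(surviving_volumes(volumes))  # ['vol-data']

# Useful scenario from the notes: preserve the root volume as well.
volumes[0].delete_on_termination = False
print(surviving_volumes(volumes))  # ['vol-root', 'vol-data']
```

With a real boto3 client, the dictionary returned by `delete_on_termination_override` would be passed alongside an `InstanceId`; that call is left out here so the sketch runs without AWS credentials.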

EBS Snapshots

  • Create a backup of your EBS volume at a point in time (snapshot)

  • Detaching the volume before creating a snapshot is recommended, but not required

  • Snapshots can be copied across AZs or Regions

  • EBS Snapshot Features

    • EBS Snapshot Archive
      • Move a snapshot to an archive tier, 75% cheaper
      • Restoring a snapshot from the archive takes 24 to 72 hours
    • Recycle Bin for EBS Snapshots
      • Set up rules to retain deleted snapshots so you can recover them after accidental deletion
      • Retention can be set from 1 day to 1 year
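
To make the archive trade-off concrete, here is a rough sketch that compares monthly cost in the standard and archive tiers and checks the two limits mentioned above. The price per GB-month is a placeholder assumption, not a quoted AWS price; only the 75% discount, the 24 to 72 hour restore window, and the 1 day to 1 year retention range come from these notes. All function names are illustrative.

```python
STANDARD_PRICE_PER_GB_MONTH = 0.05  # ASSUMPTION: placeholder, check current AWS pricing
ARCHIVE_DISCOUNT = 0.75             # from the notes: archive tier is 75% cheaper
MAX_RESTORE_HOURS = 72              # from the notes: restore takes 24 to 72 hours
RETENTION_DAYS_RANGE = (1, 365)     # from the notes: 1 day to 1 year


def monthly_snapshot_cost(size_gb: float, archived: bool = False) -> float:
    """Estimate the monthly storage cost of a snapshot (simplified model)."""
    price = STANDARD_PRICE_PER_GB_MONTH
    if archived:
        price *= 1 - ARCHIVE_DISCOUNT
    return size_gb * price


def archive_is_acceptable(max_wait_hours: float) -> bool:
    """Archive only if you can tolerate the worst-case restore time."""
    return max_wait_hours >= MAX_RESTORE_HOURS


def retention_is_valid(days: int) -> bool:
    """Check a Recycle Bin retention period against the allowed range."""
    low, high = RETENTION_DAYS_RANGE
    return low <= days <= high


print(f"{monthly_snapshot_cost(500):.2f}")                 # 25.00
print(f"{monthly_snapshot_cost(500, archived=True):.2f}")  # 6.25
print(archive_is_acceptable(12))                           # False
print(retention_is_valid(30))                              # True
```

The model treats the snapshot size as the same in both tiers, which is a simplification; treat the numbers as a way to reason about the trade-off rather than as a billing estimate.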

AMI

  • Amazon Machine Image: A customisation of an EC2 instance (with your own software, configuration, OS, monitoring, etc.)

    • All the pre-packaged software results in a faster boot / configuration time
    • Built for a specific Region, but can be copied across Regions
    • EC2 instances can be launched from:
      • A public AMI provided by AWS
      • Your own AMI, which you create and maintain yourself
      • An AWS Marketplace AMI made by someone else
  • AMI Process

    • Start an EC2 instance and customise it
    • Stop the instance (to preserve data integrity)
    • Build an AMI (this also creates an EBS snapshot)
    • Launch new instances from the AMI (Application --> OS Images)
  • EC2 Image Builder

    • Automate the creation, maintenance and testing of EC2 AMIs
    • Can be run on a schedule, when packages are updated or depending on the defined specification
    • Free service, only pay for the underlying resources
  • EC2 Instance Store

    • High-performance hardware disk physically attached to the server hosting the EC2 instance
      • Better I/O performance than network drives
      • Instance store volumes lose their data if the instance is stopped or terminated (ephemeral)
      • Good for buffer / cache / scratch data / temporary content
      • Risk of data loss if hardware fails
      • Backups and replication are your responsibility
    • Alternative to network drives (EBS volumes) which have limited performance
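
The AMI Process steps listed earlier in this section (customise, stop, build, launch) can be sketched as a single function. This is a minimal sketch, assuming `ec2` behaves like a boto3 EC2 client (for example `boto3.client("ec2")`); the function name, the AMI name argument, and the default instance type `t3.micro` are illustrative assumptions, not part of the notes. The instance is expected to have been customised already (step 1).

```python
def bake_ami_and_launch(ec2, instance_id: str, ami_name: str,
                        instance_type: str = "t3.micro") -> dict:
    """Stop a customised instance, build an AMI from it, and launch from that AMI.

    `ec2` is assumed to expose the boto3 EC2 client methods used below.
    """
    # Stop the instance so the image is built from consistent data.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Build the AMI; this also creates the EBS snapshot backing it.
    image_id = ec2.create_image(InstanceId=instance_id, Name=ami_name)["ImageId"]
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # Launch one new instance from the freshly built AMI.
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,  # ASSUMPTION: illustrative default
        MinCount=1,
        MaxCount=1,
    )
    return {
        "image_id": image_id,
        "new_instance_id": response["Instances"][0]["InstanceId"],
    }
```

Passing the client in as an argument keeps the function easy to exercise with a stub object, so the sequence of calls can be checked without touching a real AWS account.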

EFS

  • Elastic File System: Managed NFS (network file system) that can be mounted on many Linux EC2 instances across multiple AZs

    • Highly available, scalable, expensive, pay per use, no capacity planning
    • Attaches via an EFS Mount Target
  • EFS Infrequent Access (EFS-IA): Cost-optimised storage class for files not accessed every day

    • Up to 92% lower cost compared to EFS Standard
    • EFS automatically moves your files to EFS-IA based on time last accessed, which is enabled and defined with a Lifecycle Policy
    • Transparent to the applications accessing EFS
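
The Lifecycle Policy decision described above can be illustrated with a small model. This is a sketch of the rule only, not how EFS is implemented: the 30-day threshold, the file names, and the `storage_class` function are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# ASSUMPTION: threshold picked for illustration; the real value is whatever
# the Lifecycle Policy on the file system specifies.
IA_AFTER_DAYS = 30


def storage_class(last_accessed: datetime, now: datetime,
                  ia_after_days: int = IA_AFTER_DAYS) -> str:
    """Return the storage class a file would sit in under the lifecycle rule."""
    idle_time = now - last_accessed
    if idle_time >= timedelta(days=ia_after_days):
        return "EFS-IA"
    return "EFS Standard"


now = datetime(2024, 6, 1)
files = {
    "report.csv": datetime(2024, 5, 30),   # accessed 2 days ago
    "archive.tar": datetime(2024, 3, 1),   # accessed 92 days ago
}
for name, last_accessed in files.items():
    print(name, "->", storage_class(last_accessed, now))
# report.csv -> EFS Standard
# archive.tar -> EFS-IA
```

Applications read both files through the same mount point and path, which is what "transparent" means here: the storage class changes the cost, not the way the file is accessed.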

Shared Responsibility Model for EC2 Storage

  • AWS responsibilities:
    • Infrastructure
    • Replication of data for EBS volumes & EFS drives
    • Replacing faulty hardware
    • Ensuring their employees can't access your data
  • User responsibilities:
    • Setting up backup / snapshot procedures
    • Setting up data encryption
    • Responsibility for any data on the drives
    • Understanding the risk of using EC2 Instance Store

Amazon FSx

  • Launch third-party high-performance file systems on AWS
    • FSx for Lustre: Fully managed, high-performance and scalable file storage for High Performance Computing (HPC)
      • Name derived from "Linux" + "cluster"
      • Useful for machine learning, analytics, video processing, financial modeling, etc.
      • Scales up to hundreds of GB/s and millions of IOPS
    • FSx for Windows File Server: Fully managed, highly reliable and scalable Windows native shared file system
      • Built on Windows File Server
      • Supports SMB protocol & Windows NTFS
      • Integrated with Microsoft Active Directory
      • Can be accessed from AWS or your on-premises infrastructure
    • FSx for NetApp ONTAP
  • Alternative to EFS and S3