Graduate Program KB

Section 11 - AWS Storage Extras

For an introduction to the Snow Family, refer to Section 06 - S3

  • For SAA, Snowball cannot import to Glacier directly
    • Must use S3 first in combination with an S3 lifecycle policy
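
The lifecycle step above can be sketched as a small helper that builds a rule moving imported objects to Glacier. This is a minimal sketch, not a definitive setup: the prefix `snowball-import/` and the one-day delay are assumptions chosen for illustration, and the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument.

```python
import json


def glacier_transition_rule(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that transitions objects under
    `prefix` to the GLACIER storage class `days` days after creation."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": f"to-glacier-after-{days}d",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Hypothetical prefix for data that a Snowball job imported into S3.
config = glacier_transition_rule("snowball-import/", 1)
print(json.dumps(config, indent=2))
```

The resulting dictionary would be applied to the bucket that received the Snowball import; no AWS call is made here.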

Amazon FSx

  • Fully managed service to launch third-party high-performance file systems on AWS

FSx for Windows

  • Fully managed Windows file system share drive
  • Supports SMB protocol and Windows NTFS
  • Integrates with Microsoft Active Directory, ACLs and user quotas
  • Mountable to Linux EC2 instances
  • Supports Microsoft's Distributed File System Namespaces
  • Scales up to 10s of GB/s, millions of IOPS, 100s of PB of data
  • Storage options:
    • SSD for latency sensitive workloads
    • HDD for broad spectrum workloads
  • Can be accessed from on-premises infrastructure (VPN / Direct Connect) and configurable to be Multi-AZ for high availability
  • Data is backed up daily to S3

FSx for Lustre

  • Lustre is a type of parallel distributed file system for large-scale computing
  • High performance computing, useful for video processing, financial modelling, electronic design automation, etc.
  • Scales up to 100s of GB/s, millions of IOPS, sub-ms latencies
  • Storage options:
    • SSD: Low-latency, IOPS intensive workloads, small and random file operations
    • HDD: Throughput-intensive workloads, large and sequential file operations
  • Seamless integration with S3
  • Can be used from on-premises servers (VPN / Direct Connect)
  • Deployment options:
    • Scratch File System
      • Temporary storage
      • Data is not replicated
      • High burst rate (6x faster)
      • Useful for short-term processing to optimize costs
    • Persistent File System
      • Long-term storage
      • Data is replicated within the same AZ
      • Failed files are replaced within minutes
      • Useful for long-term processing and sensitive data
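
The choice between the two deployment options above can be written down as a tiny decision helper. This is only a sketch of the rule of thumb in these notes; the function name and its two flags are hypothetical, and the returned labels are the names used in this section rather than API values.

```python
def choose_lustre_deployment(long_term: bool, sensitive_data: bool) -> str:
    """Pick an FSx for Lustre deployment option from the notes' rule of thumb.

    Persistent: data is replicated within the same AZ, so it suits long-term
    processing and sensitive data.
    Scratch: data is not replicated, so it suits short-term processing where
    cost matters more than durability.
    """
    if long_term or sensitive_data:
        return "persistent"
    return "scratch"


# A short-lived batch job whose input can be reloaded from S3 if lost.
print(choose_lustre_deployment(long_term=False, sensitive_data=False))
```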

FSx for NetApp ONTAP

  • Managed NetApp ONTAP on AWS
  • Compatible with the NFS, SMB and iSCSI protocols
  • Move workloads running on ONTAP or NAS to AWS
  • Works with Linux, Windows, macOS, VMware Cloud, WorkSpaces & AppStream 2.0, EC2, ECS and EKS
  • Storage shrinks or grows automatically
  • Snapshots, replication, low-cost, compression and data de-duplication
  • Point-in-time instantaneous cloning which is useful for testing new workloads

FSx for OpenZFS

  • Managed OpenZFS file system on AWS
  • Compatible with NFS
  • Move workloads running on ZFS to AWS
  • Works with Linux, Windows, macOS, VMware Cloud, WorkSpaces & AppStream 2.0, EC2, ECS and EKS
  • Up to one million IOPS with less than 0.5ms latency
  • Snapshots, compression and low-cost
  • Point-in-time instantaneous cloning which is useful for testing new workloads

Hybrid Cloud

  • Part of the infrastructure runs in the cloud and part stays on-premises
  • Factors: Long cloud migrations, security requirements, compliance requirements and IT strategy
  • Use AWS Storage Gateway to expose S3 data on-premises

AWS Storage Gateway

  • Bridge between on-premises data and cloud data
  • Useful for disaster recovery, backup & restore, tiered storage, on-premises cache and low-latency file access
  • Types: S3 File Gateway, FSx File Gateway, Volume Gateway, Tape Gateway

S3 File Gateway

  • Configured S3 buckets are accessible using the NFS and SMB protocols
  • Caches most recently used data in file gateway
  • Supports S3 Standard, Standard-IA, One Zone-IA and Intelligent-Tiering
    • Transition to Glacier using a lifecycle policy
  • Use IAM roles for bucket access for each file gateway
  • SMB protocol has integration with Active Directory for user authentication

FSx File Gateway

  • Native access to FSx for Windows File Server
  • Local cache for frequently accessed data
  • Windows native compatibility
  • Useful for group file shares and home directories

Volume Gateway

  • Block storage using iSCSI protocol backed by S3
  • Backed up as EBS snapshots, which can help restore on-premises volumes
  • Cached volumes provide low-latency access to most recent data
  • Stored volumes keep the entire dataset on-premises, with scheduled backups to S3

Tape Gateway

  • Some companies use physical tapes for the backup process; Tape Gateway uses the same process but in the cloud
  • Virtual Tape Library is backed by S3 and Glacier
  • Data is backed up using existing tape-based processes and iSCSI interface
  • Works with leading backup software vendors
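
The four gateway types described above can be summarized as a lookup table keyed by access protocol. This is a study-aid sketch rather than an AWS API: the `GATEWAYS` table and `gateways_for` helper are hypothetical names, and listing SMB for FSx File Gateway is an addition based on its Windows-native access, not something stated in the bullets.

```python
# Protocols and backing stores per gateway type, as listed in these notes.
GATEWAYS = {
    "S3 File Gateway": {"protocols": {"NFS", "SMB"}, "backend": "S3"},
    # SMB is assumed here from the "Windows native compatibility" bullet.
    "FSx File Gateway": {"protocols": {"SMB"}, "backend": "FSx for Windows File Server"},
    "Volume Gateway": {"protocols": {"iSCSI"}, "backend": "S3 (EBS snapshots)"},
    "Tape Gateway": {"protocols": {"iSCSI"}, "backend": "S3 and Glacier"},
}


def gateways_for(protocol: str) -> list[str]:
    """Return the gateway types that expose the given protocol, sorted by name."""
    return sorted(
        name for name, info in GATEWAYS.items() if protocol in info["protocols"]
    )


print(gateways_for("SMB"))
print(gateways_for("iSCSI"))
```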

Storage Gateway - Hardware Appliance

  • Alternative to on-premises virtualization when using Storage Gateway, available for purchase from Amazon
  • Works with all gateway types and has the required CPU, memory, network and SSD cache resources
  • Useful for daily NFS backups in small data centres

AWS Transfer Family

  • Fully managed service for file transfers into and out of S3 or EFS using the FTP protocol
  • Supported protocols: AWS Transfer for FTP, FTPS and SFTP
  • Scalable, reliable, highly available (Multi-AZ)
  • Pay per provisioned endpoint per hour + data transfers in GB
  • Store and manage users' credentials
  • Integrate with existing authentication systems
  • Useful for sharing files, public datasets, CRM, ERP, etc.
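
The pricing model above (per provisioned endpoint per hour, plus data transferred in GB) is simple arithmetic. The sketch below assumes one endpoint running for a 30-day month; both rates are placeholder values chosen for illustration, not quoted AWS prices.

```python
def transfer_family_cost(
    endpoint_hours: float,
    gb_transferred: float,
    hourly_rate: float,
    per_gb_rate: float,
) -> float:
    """Estimate cost as endpoint-hours plus data-transfer charges."""
    return endpoint_hours * hourly_rate + gb_transferred * per_gb_rate


# Placeholder rates (assumptions): 0.30 per endpoint-hour, 0.04 per GB.
monthly = transfer_family_cost(
    endpoint_hours=24 * 30,
    gb_transferred=500,
    hourly_rate=0.30,
    per_gb_rate=0.04,
)
print(f"Estimated monthly cost: {monthly:.2f}")
```

Because the endpoint is billed while provisioned, the hourly term tends to dominate for low-volume transfers.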

AWS DataSync

  • Can move large amounts of data between:
    • On-premises to cloud (requires agent)
    • AWS to AWS (different storage services)
  • Can synchronize to S3, EFS or FSx
  • Replication tasks can be scheduled hourly, daily or weekly
  • File permissions and metadata are preserved
  • A single agent task can use up to 10 Gbps; a bandwidth limit can be set up
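
To get a feel for the bandwidth figures above, the sketch below estimates how long a transfer takes at a given link speed and converts a limit from Mbps to bytes per second. It assumes decimal units (1 TB = 10^12 bytes), ignores protocol overhead, and assumes the bandwidth limit is expressed in bytes per second; the function names are hypothetical.

```python
def mbps_to_bytes_per_second(mbps: float) -> int:
    """Convert a bandwidth limit from megabits/s to bytes/s."""
    return int(mbps * 1_000_000 / 8)


def transfer_hours(dataset_tb: float, bandwidth_gbps: float) -> float:
    """Ideal transfer time in hours for `dataset_tb` terabytes
    over a link sustaining `bandwidth_gbps` gigabits per second."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    bits = dataset_tb * 1e12 * 8
    seconds = bits / (bandwidth_gbps * 1e9)
    return seconds / 3600


# 10 TB at the full 10 Gbps versus a 1 Gbps cap.
print(f"{transfer_hours(10, 10):.2f} h at 10 Gbps")
print(f"{transfer_hours(10, 1):.2f} h at 1 Gbps")
print(mbps_to_bytes_per_second(100), "bytes/s for a 100 Mbps limit")
```

Capping bandwidth protects other traffic on the on-premises link at the cost of a proportionally longer transfer.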