Graduate Program KB

Section 24 - More Solutions Architecture

S3 Event Notifications with Amazon EventBridge

  • Use advanced filtering options with JSON rules
  • Send events to multiple locations (AWS services)
  • Use EventBridge capabilities (archive, replay events, reliable delivery)
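
As a rough sketch of what a JSON rule can look like, the Python below builds an event pattern for S3 "Object Created" events and checks sample events against it with a small local matcher. The bucket name, key prefix, and size threshold are made-up values. The `matches` helper is a simplified illustration that only handles exact values plus the `prefix`, `suffix`, and `numeric` operators; it is not EventBridge's own matching engine.

```python
import json

# Hypothetical pattern: objects larger than 1 MiB created under uploads/ in one bucket.
PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {
        "bucket": {"name": ["example-bucket"]},
        "object": {
            "key": [{"prefix": "uploads/"}],
            "size": [{"numeric": [">", 1048576]}],
        },
    },
}

_OPS = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "=": lambda a, b: a == b,
}


def _value_matches(candidate, value):
    """Check one allowed candidate (literal or operator dict) against a value."""
    if isinstance(candidate, dict):
        if "prefix" in candidate:
            return isinstance(value, str) and value.startswith(candidate["prefix"])
        if "suffix" in candidate:
            return isinstance(value, str) and value.endswith(candidate["suffix"])
        if "numeric" in candidate:
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                return False
            spec = candidate["numeric"]
            # spec is a flat list of operator/operand pairs, e.g. [">", 0, "<=", 5]
            return all(_OPS[spec[i]](value, spec[i + 1]) for i in range(0, len(spec), 2))
        return False  # operator not covered by this sketch
    return candidate == value


def matches(pattern, event):
    """Return True if every field named in the pattern matches the event."""
    for field, expected in pattern.items():
        if field not in event:
            return False
        actual = event[field]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif not any(_value_matches(c, actual) for c in expected):
            return False
    return True


def sample_event(key, size):
    """Build a trimmed-down S3 event with only the fields the pattern reads."""
    return {
        "source": "aws.s3",
        "detail-type": "Object Created",
        "detail": {
            "bucket": {"name": "example-bucket"},
            "object": {"key": key, "size": size},
        },
    }


# The JSON string is the form in which a rule's event pattern is supplied.
pattern_json = json.dumps(PATTERN)
```

With boto3 installed and credentials configured, `pattern_json` could be passed as the `EventPattern` argument of the EventBridge client's `put_rule` call. The bucket must also have EventBridge notifications turned on for any events to arrive.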

High Performance Computing

  • The cloud is well suited to HPC: you can create a large number of resources quickly, scale them as needed, and pay only for what you use
  • Typical workloads: genomics, computational chemistry, financial risk modeling, weather prediction, ML, deep learning, autonomous driving, etc.

Data Management & Transfer

  • Direct Connect for moving data to the cloud at Gbps rates over a private, dedicated network connection
  • Snowball and Snowmobile for moving PB of data to the cloud
  • DataSync for moving large amounts of data between on-premises storage and S3, EFS, or FSx for Windows File Server
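
To get a feel for when a network link stops being practical and a physical transfer device makes more sense, the arithmetic can be sketched as below. The function name and the `utilization` parameter are illustrative assumptions; real transfers also lose time to protocol overhead and contention, which this estimate ignores.

```python
def transfer_days(data_bytes, link_gbps, utilization=1.0):
    """Days needed to move data_bytes over a link of link_gbps gigabits per second."""
    if link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link speed must be positive and utilization in (0, 1]")
    bits = data_bytes * 8
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 86_400  # seconds per day


PB = 10**15  # decimal petabyte, in bytes

one_pb_over_10g = transfer_days(PB, 10)  # roughly 9.3 days on a fully used 10 Gbps link
one_pb_over_1g = transfer_days(PB, 1)    # roughly 93 days on a fully used 1 Gbps link
```

At these scales a 1 Gbps link needs about three months per petabyte, which is why physical devices are listed for PB-sized migrations.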

Compute and Networking

  • EC2 instances can be CPU or GPU optimized; use Spot Instances / Spot Fleets for cost savings, combined with Auto Scaling
  • EC2 placement groups with the cluster strategy pack instances close together for low-latency, high-throughput networking
  • EC2 enhanced networking (SR-IOV) provides higher bandwidth, higher packets per second and lower latency
    • Elastic Network Adapter up to 100 Gbps or Intel 82599 VF (legacy) for up to 10 Gbps
  • Elastic Fabric Adapter is an improved ENA for HPC (only for Linux)
    • Great for inter-node communications and tightly coupled workloads
    • Leverages the Message Passing Interface (MPI) standard
    • Bypasses the underlying Linux OS to provide low-latency, reliable transport
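
A minimal sketch of how the pieces above fit together: the helper below only assembles the keyword arguments one might pass to boto3's EC2 `run_instances` call to launch instances with an EFA interface inside a cluster placement group. The helper name, the default instance type, and all IDs in the example call are placeholders chosen for illustration; nothing here contacts AWS.

```python
def efa_launch_params(ami_id, subnet_id, security_group_id, placement_group,
                      instance_type="c5n.18xlarge", count=2):
    """Assemble run_instances keyword arguments for EFA-enabled instances.

    All arguments are caller-supplied; the defaults are illustrative only.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,  # must be a type that supports EFA
        "MinCount": count,
        "MaxCount": count,
        # The placement group is assumed to exist with the "cluster" strategy.
        "Placement": {"GroupName": placement_group},
        "NetworkInterfaces": [
            {
                "DeviceIndex": 0,
                "InterfaceType": "efa",  # request an EFA rather than a plain ENA
                "SubnetId": subnet_id,
                "Groups": [security_group_id],
            }
        ],
    }


# Placeholder identifiers, not real resources.
params = efa_launch_params(
    ami_id="ami-EXAMPLE",
    subnet_id="subnet-EXAMPLE",
    security_group_id="sg-EXAMPLE",
    placement_group="hpc-cluster-pg",
)
```

With boto3 available, the placement group could be created first via `create_placement_group(GroupName="hpc-cluster-pg", Strategy="cluster")`, and the dictionary then unpacked into `run_instances(**params)`. The security group would also need to allow traffic between its own members for EFA traffic to flow.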

Storage

  • Instance-attached storage
    • EBS scales up to 256,000 IOPS with io2 Block Express
    • Instance store linked to an EC2 instance scales to millions of IOPS with low latency
  • Network storage
    • S3 for large objects (blobs); it is an object store, not a file system
    • EFS scales performance with total file system size, or you can provision throughput independently of size
    • FSx for Lustre is a distributed file system optimized for HPC; it scales to millions of IOPS and can be backed by S3

Automation and Orchestration

  • AWS Batch
    • Supports multi-node parallel jobs, which let a single job span multiple EC2 instances
    • Can easily schedule jobs and launch EC2 instances accordingly
  • AWS ParallelCluster
    • An open source cluster management tool for deploying HPC on AWS
    • Configurable with text files
    • Can automate creation of VPC, subnet, cluster type and instance types
    • Can enable EFA on the cluster to improve network performance
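
To make the multi-node parallel idea concrete, here is a hedged sketch that assembles the arguments for AWS Batch's `register_job_definition` call with `type="multinode"`. The helper name, job name, image URI, command, and resource sizes are hypothetical; the code only builds a dictionary and does not call AWS.

```python
def multinode_job_definition(name, image, num_nodes, command, vcpus=4, memory_mib=8192):
    """Assemble register_job_definition arguments for a multi-node parallel job."""
    if num_nodes < 2:
        raise ValueError("a multi-node parallel job needs at least 2 nodes")
    return {
        "jobDefinitionName": name,
        "type": "multinode",
        "nodeProperties": {
            "numNodes": num_nodes,
            "mainNode": 0,  # index of the node that coordinates the job
            "nodeRangeProperties": [
                {
                    "targetNodes": "0:",  # apply these settings to every node
                    "container": {
                        "image": image,
                        "command": list(command),
                        "resourceRequirements": [
                            {"type": "VCPU", "value": str(vcpus)},
                            {"type": "MEMORY", "value": str(memory_mib)},
                        ],
                    },
                }
            ],
        },
    }


# Placeholder values for illustration only.
job_def = multinode_job_definition(
    name="example-mpi-job",
    image="example.invalid/mpi-app:latest",
    num_nodes=4,
    command=["mpirun", "./solver"],
)
```

With boto3 available, the result could be unpacked into the Batch client's `register_job_definition(**job_def)`; Batch then launches the requested number of nodes for each submitted job.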