Azure Blob Storage
- Optimised for storing massive amounts of unstructured data
- Unstructured data is data that doesn't adhere to a particular data model or definition (ex. text or binary data)
- Use cases:
- Serving images or documents directly to a browser
- Storing files for distributed access
- Streaming video and audio
- Writing to log files
- Storing data for backup and restore, disaster recovery and archiving
- Storing data for analysis by an on-premises or Azure-hosted service
- Applications can access objects via HTTP/HTTPS through Azure Storage REST API, Azure PowerShell, Azure CLI or Azure Storage client library
- An Azure Storage account is the top-level container for all your Azure Blob storage
- Provides a unique namespace for your Azure Storage data with global accessibility
Types of Storage Accounts
- Two performance levels of storage accounts:
- Standard: Recommended for most scenarios
- Premium: Higher performance by using SSDs. Creating a premium account allows you to choose between three account types (block blobs, page blobs, file shares)
Type of storage account | Supported storage services | Redundancy options | Usage |
---|---|---|---|
Standard general-purpose v2 | Blob Storage (including Data Lake Storage), Queue Storage, Table Storage, and Azure Files | Locally redundant storage (LRS) / geo-redundant storage (GRS) / read-access geo-redundant storage (RA-GRS) / zone-redundant storage (ZRS) / geo-zone-redundant storage (GZRS) / read-access geo-zone-redundant storage (RA-GZRS) | Standard storage account type for blobs, file shares, queues, and tables. Recommended for most scenarios using Azure Storage. If you want support for network file system (NFS) in Azure Files, use the premium file shares account type. |
Premium block blobs | Blob Storage (including Data Lake Storage) | LRS and ZRS | Premium storage account type for block blobs and append blobs. Recommended for scenarios with high transaction rates or that use smaller objects or require consistently low storage latency. |
Premium file shares | Azure Files | LRS and ZRS | Premium storage account type for file shares only. Recommended for enterprise or high-performance scale applications. |
Premium page blobs | Page blobs only | LRS and ZRS | Premium storage account type for page blobs only. |
Access Tiers for Block Blob Data
- Different tiers are available, chosen according to how often the block blob data is accessed
- Each access tier is optimised for a particular pattern of data usage
- Hot:
- Optimised for frequent access of objects
- Highest storage cost
- Lowest access cost
- Default tier
- Cool:
- Optimised for storing large amounts of infrequently accessed data
- Stored for a minimum of 30 days
- Lower storage costs
- Higher access costs
- Cold:
- Optimised for storing infrequently accessed data
- Stored for a minimum of 90 days
- Lower storage costs and higher access costs compared to the Cool tier
- Archive:
- Available only for individual block blobs
- Optimised for data that can tolerate several hours of retrieval latency and remains in this tier for a minimum of 180 days
- Most cost-effective option for storing data
- More expensive to access data than Hot and Cool tiers
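The minimum storage durations listed above can be captured in a small lookup. This is a minimal sketch for planning purposes only: the dictionary and the function name `days_until_minimum_met` are illustrative and not part of any Azure SDK.

```python
# Minimum storage durations in days, taken from the tier notes above.
# The Hot tier has no minimum, so it is modelled as 0.
MIN_DAYS_IN_TIER = {"hot": 0, "cool": 30, "cold": 90, "archive": 180}


def days_until_minimum_met(tier: str, days_stored: int) -> int:
    """Return how many more days a blob must stay in `tier` to reach its minimum."""
    minimum = MIN_DAYS_IN_TIER[tier.lower()]
    return max(0, minimum - days_stored)


print(days_until_minimum_met("cool", 10))      # 20
print(days_until_minimum_met("archive", 200))  # 0
```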
Storage Resource Types
- Blob storage offers three types of resources: the storage account, a container in the storage account, and a blob in a container
- Storage accounts:
- Provides a unique namespace for your data
- Every stored object has an address including the unique account name
- Ex. of a default endpoint for Blob storage: https://mystorageaccount.blob.core.windows.net
- Containers:
- Organises a set of blobs, similar to a directory in a file system
- A storage account can include an unlimited number of containers, and a container can store an unlimited number of blobs
- Container name must be a valid DNS name, forming part of the URI to address the container / blobs
- Rules to follow when naming a container:
- Between 3 and 63 characters long
- Start with a letter or number, containing only lowercase letters, numbers and the dash (-) character
- Two or more consecutive dash characters aren't permitted
- Ex. URI of a container: https://myaccount.blob.core.windows.net/mycontainer
- Blobs:
- Block blobs: Store text and binary data. They are made up of blocks of data that can be managed individually, and can store up to about 190.7 TiB
- Append blobs: Made up of blocks like block blobs, optimised for append operations. Useful for logging data from VMs
- Page blobs: Store random access files up to 8 TiB. Store virtual hard drive files and serve as disks for Azure VMs
- Ex. of a blob URI: https://myaccount.blob.core.windows.net/mycontainer/myblob
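The container naming rules and the URI layout above can be checked locally before any request is sent. The sketch below uses only the Python standard library. The helper names `is_valid_container_name` and `blob_uri` are hypothetical, and the pattern also rejects a trailing dash, on the assumption that every dash must sit between letters or numbers for the name to be a valid DNS label.

```python
import re
from urllib.parse import quote

# Lowercase letters, digits and single dashes; must start with a letter or digit.
# Each dash must be followed by a letter or digit, which rules out "--" and a trailing "-".
_CONTAINER_NAME = re.compile(r"^[a-z0-9](?:-?[a-z0-9])*$")


def is_valid_container_name(name: str) -> bool:
    """Apply the length and character rules from the notes above."""
    return 3 <= len(name) <= 63 and _CONTAINER_NAME.match(name) is not None


def blob_uri(account: str, container: str, blob: str) -> str:
    """Build the default-endpoint URI for a blob, percent-encoding the blob name."""
    if not is_valid_container_name(container):
        raise ValueError(f"invalid container name: {container!r}")
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob, safe='/')}"


print(is_valid_container_name("logs-2024"))      # True
print(is_valid_container_name("my--container"))  # False
print(blob_uri("myaccount", "mycontainer", "myblob"))
```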
Azure Blob Security Features
- Use service-side encryption to automatically encrypt your data when persisted
- Azure Storage client libraries for Blob Storage and Queue Storage also provide client-side encryption for customers who need to encrypt data on the client
- Azure Storage encryption for data at rest
- Data is encrypted and decrypted transparently using 256-bit Advanced Encryption Standard (AES) encryption
- Enabled for all storage accounts by default and can't be disabled (don't need to modify your code or applications to take advantage of Azure Storage encryption)
- Data is encrypted regardless of performance tier, access tier, blob types or deployment model
- All Azure Storage redundancy options support encryption and all data in both the primary and secondary regions is encrypted when geo-replication is enabled
- All Azure Storage resources are encrypted, including blobs, disks, files, queues and tables
- All object metadata is also encrypted
- Encryption key management
- Data is encrypted with Microsoft-managed keys by default
- Two options for managing with your own key:
- Specify a customer-managed key to use for encrypting and decrypting data in Blob Storage and Azure Files
- Keys must be stored in Azure Key Vault or Azure Key Vault Managed Hardware Security Module (HSM)
- Specify a customer-provided key on Blob Storage operations
- Clients can include an encryption key on a read/write request for granular control over how blob data is encrypted and decrypted
Key management parameter | Microsoft-managed keys | Customer-managed keys | Customer-provided keys |
---|---|---|---|
Encryption/decryption operations | Azure | Azure | Azure |
Azure Storage services supported | All | Blob Storage, Azure Files | Blob Storage |
Key storage | Microsoft key store | Azure Key Vault or Key Vault HSM | Customer's own key store |
Key rotation responsibility | Microsoft | Customer | Customer |
Key control | Microsoft | Customer | Customer |
Key scope | Account (default), container, or blob | Account (default), container, or blob | N/A |
- Client-side encryption
- Azure Blob Storage client libraries for .NET, Java and Python support encrypting data within client apps before uploading to Azure Storage, and decrypting data while downloading to the client
- Queue Storage client libraries for .NET and Python also support client-side encryption
- Both client libraries use AES for encryption
- Two versions of client-side encryption available for the libraries:
- Version 2 uses Galois/Counter Mode with AES
- Version 1 uses Cipher Block Chaining with AES
Azure Blob Storage Lifecycle
- Azure Blob Storage lifecycle management offers a rule-based policy to transition blob data to appropriate access tiers or to expire data at the end of its lifecycle
- The lifecycle management policy enables you to:
- Transition blobs from cool to hot immediately when accessed, optimising performance
- Transition current or previous blob versions or snapshots to a cooler storage tier if objects are not accessed frequently to optimise cost
- Delete current or previous blob versions or snapshots at the end of their lifecycles
- Apply rules to an entire storage account, containers or subset of blobs using name prefixes or blob index tags as filters
Lifecycle Management Policy
- A policy is a collection of rules in a JSON document
- Each rule definition within a policy includes a filter set and an action set
- Filter set limits rule actions to a certain set of objects within a container or to certain object names
- Action set applies the tier or delete actions to the filtered set of objects
- Ex. of a lifecycle management policy document:
{ "rules": [ { "name": "rule1", "enabled": true, "type": "Lifecycle", "definition": {...} }, { "name": "rule2", "type": "Lifecycle", "definition": {...} } ] }
Rules
- A policy must contain at least one rule and can define up to 100 rules
- The rules parameter is an array of rule objects
Parameter name | Parameter type | Notes | Required |
---|---|---|---|
name | String | A rule name can include up to 256 alphanumeric characters. Rule name is case-sensitive. It must be unique within a policy. | True |
enabled | Boolean | An optional boolean to allow a rule to be temporarily disabled. Default value is true. | False |
type | An enum value | The current valid type is Lifecycle. | True |
definition | An object that defines the lifecycle rule | Each definition is made up of a filter set and an action set. | True |
- Example:
- Tier blob to cool tier 30 days after last modification
- Tier blob to archive tier 90 days after last modification
- Delete blob 2,555 days (seven years) after last modification
- Delete previous blob versions 90 days after version creation
{ "rules": [ { "enabled": true, "name": "sample-rule", "type": "Lifecycle", "definition": { "actions": { "version": { "delete": { "daysAfterCreationGreaterThan": 90 } }, "baseBlob": { "tierToCool": { "daysAfterModificationGreaterThan": 30 }, "tierToArchive": { "daysAfterModificationGreaterThan": 90, "daysAfterLastTierChangeGreaterThan": 7 }, "delete": { "daysAfterModificationGreaterThan": 2555 } } }, "filters": { "blobTypes": [ "blockBlob" ], "prefixMatch": [ "sample-container/blob1" ] } } } ] }
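A policy document like the one above can also be generated rather than written by hand. The following is a minimal sketch that builds only the base-blob actions and enforces the rule-count and unique-name constraints from the notes. The helpers `make_rule` and `make_policy` are hypothetical names, not part of any Azure tooling.

```python
import json


def make_rule(name, prefix, cool_after, archive_after, delete_after):
    """Build one lifecycle rule for block blobs under a single prefix."""
    return {
        "enabled": True,
        "name": name,
        "type": "Lifecycle",
        "definition": {
            "actions": {
                "baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                    "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                    "delete": {"daysAfterModificationGreaterThan": delete_after},
                }
            },
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
        },
    }


def make_policy(rules):
    """Wrap rules in a policy, checking the 1..100 rule limit and unique names."""
    if not 1 <= len(rules) <= 100:
        raise ValueError("a policy needs between 1 and 100 rules")
    names = [rule["name"] for rule in rules]
    if len(set(names)) != len(names):
        raise ValueError("rule names must be unique within a policy")
    return {"rules": rules}


policy = make_policy([make_rule("sample-rule", "sample-container/blob1", 30, 90, 2555)])
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Writing `policy_json` to a file gives a document in the same shape as the sample, which can then be supplied through the portal, CLI, PowerShell or REST.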
Rule Filters
- Filters limit rule actions to a subset of blobs within the storage account
- If more than one filter is defined, a logical AND runs on all filters
Filter name | Description | Required |
---|---|---|
blobTypes | An array of predefined enum values. | True |
prefixMatch | An array of strings for prefixes to be matched. Each rule can define up to 10 prefixes. A prefix string must start with a container name. | False |
blobIndexMatch | An array of dictionary values consisting of blob index tag key and value conditions to be matched. Each rule can define up to 10 blob index tag conditions. | False |
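To make the logical AND between filters concrete, the sketch below evaluates a filter set against a simple in-memory description of a blob. It only illustrates the semantics described above and is not how the service is implemented. The blob dictionary shape is hypothetical, and two assumptions are made: a blob matches `prefixMatch` if it starts with any one of the listed prefixes, and each `blobIndexMatch` condition uses the `name`/`op`/`value` form with `==` as the operator.

```python
def matches_filters(blob: dict, filters: dict) -> bool:
    """Return True only if the blob satisfies every filter that is defined."""
    # blobTypes is required: the blob's type must be one of the listed values.
    if blob["type"] not in filters["blobTypes"]:
        return False

    # prefixMatch (optional): path is "<container>/<blob name>", compared case-sensitively.
    prefixes = filters.get("prefixMatch")
    if prefixes and not any(blob["path"].startswith(p) for p in prefixes):
        return False

    # blobIndexMatch (optional): every tag condition must hold.
    for cond in filters.get("blobIndexMatch", []):
        if cond.get("op", "==") != "==":
            raise ValueError(f"unsupported operator: {cond['op']}")
        if blob.get("tags", {}).get(cond["name"]) != cond["value"]:
            return False
    return True


filters = {
    "blobTypes": ["blockBlob"],
    "prefixMatch": ["sample-container/blob1"],
    "blobIndexMatch": [{"name": "Project", "op": "==", "value": "Contoso"}],
}
blob = {"type": "blockBlob", "path": "sample-container/blob1.txt", "tags": {"Project": "Contoso"}}
print(matches_filters(blob, filters))  # True
```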
Rule Actions
- Actions are applied to filtered blobs when the run condition is met
- At least one action needs to be defined for each rule on blobs or blob snapshots
Action | Current Version | Snapshot | Previous Versions |
---|---|---|---|
tierToCool | Supported for blockBlob | Supported | Supported |
tierToCold | Supported for blockBlob | Supported | Supported |
enableAutoTierToHotFromCool | Supported for blockBlob | Not Supported | Not Supported |
tierToArchive | Supported for blockBlob | Supported | Supported |
delete | Supported for blockBlob and appendBlob | Supported | Supported |
- Run conditions are based on age
- Base blobs use the last modified time to track age
- Snapshots use the snapshot creation time to track age
Action run condition | Condition value | Description |
---|---|---|
daysAfterModificationGreaterThan | Integer value indicating the age in days | The condition for base blob actions |
daysAfterCreationGreaterThan | Integer value indicating the age in days | The condition for blob snapshot actions |
daysAfterLastAccessTimeGreaterThan | Integer value indicating the age in days | The condition for a current version of a blob when access tracking is enabled |
daysAfterLastTierChangeGreaterThan | Integer value indicating the age in days after last blob tier change time | The minimum duration in days that a rehydrated blob is kept in hot, cool, or cold tiers before being returned to the archive tier. This condition applies only to tierToArchive actions. |
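The age-based run conditions can be illustrated with a small evaluator for a `tierToArchive` action. This is a sketch under stated assumptions rather than the service's actual logic: ages are counted in whole days, "greater than" is treated as strict, and both conditions must hold when both are present. The function name `due_for_archive` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def age_in_days(since: datetime, now: datetime) -> int:
    """Whole days elapsed between `since` and `now`."""
    return (now - since).days


def due_for_archive(last_modified, last_tier_change, condition, now):
    """Evaluate a tierToArchive condition for a base blob."""
    if age_in_days(last_modified, now) <= condition["daysAfterModificationGreaterThan"]:
        return False
    min_days = condition.get("daysAfterLastTierChangeGreaterThan")
    if min_days is not None and last_tier_change is not None:
        # Keep a recently re-tiered (for example rehydrated) blob out of archive for now.
        if age_in_days(last_tier_change, now) <= min_days:
            return False
    return True


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
condition = {"daysAfterModificationGreaterThan": 90, "daysAfterLastTierChangeGreaterThan": 7}
old_blob = now - timedelta(days=120)

print(due_for_archive(old_blob, now - timedelta(days=2), condition, now))   # False
print(due_for_archive(old_blob, now - timedelta(days=10), condition, now))  # True
```

Without `daysAfterLastTierChangeGreaterThan`, the first call would return True, because changing a tier does not reset the last modified time.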
Implement Blob storage lifecycle policies
- A lifecycle management policy must be read or written in full
- Partial updates aren't supported
- Add, edit or remove policies via:
- Azure Portal
- Navigate to storage account, under Data Management, select Lifecycle Management to view and edit the policy
- Azure PowerShell
- Azure CLI
- Write the policy to a JSON file then call az storage account management-policy create command to create the policy
az storage account management-policy create --account-name <storage-account> --policy @policy.json --resource-group <resource-group>
- REST APIs
Rehydrate Blob Data from the Archive Tier
- Blobs in the archive access tier are considered offline and can't be read or modified
- An archived blob must be rehydrated to an online tier, either hot or cool, before it can be read or modified
- Rehydration can take several hours. For optimal performance, rehydrate larger blobs, since rehydrating several small blobs might require extra time
- Two options for rehydration:
- Copy an archived blob to an online tier:
- Rehydrate the blob by copying it to a new blob in the hot or cool tier
- Use the Copy Blob or Copy Blob from URL operations
- Change a blob's access tier to an online tier:
- Rehydrate an archived blob to hot or cool by changing its tier using the Set Blob Tier operation
- Optionally, the x-ms-rehydrate-priority header can be set to manage the priority for the mentioned rehydration operations
- Standard priority: Request is processed in the order it was received, taking up to 15 hours
- High priority: Request is prioritised over standard priority requests, might complete in under one hour for objects under 10 GB in size
- NOTE: Changing a blob's tier doesn't affect its last modified time
- If there is a lifecycle management policy in effect for the storage account, then rehydrating a blob with Set Blob Tier can result in a scenario where the lifecycle policy moves the blob back to the archive tier after rehydration because the last modified time is beyond the threshold set for the policy
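For the Set Blob Tier route, the request amounts to a `PUT` on the blob URI with `comp=tier` and two headers. The sketch below only assembles the method, URL and headers and sends nothing. The function name is hypothetical, and the headers a real call also needs (`Authorization`, `x-ms-date`, `x-ms-version`) are deliberately left out.

```python
from urllib.parse import quote

ONLINE_TIERS = {"Hot", "Cool"}           # rehydration targets named in the notes above
REHYDRATE_PRIORITIES = {"Standard", "High"}


def build_set_blob_tier_request(account, container, blob, tier="Hot", priority="Standard"):
    """Return (method, url, headers) for rehydrating an archived blob in place."""
    if tier not in ONLINE_TIERS:
        raise ValueError(f"tier must be one of {sorted(ONLINE_TIERS)}")
    if priority not in REHYDRATE_PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(REHYDRATE_PRIORITIES)}")
    url = (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob, safe='/')}?comp=tier"
    )
    headers = {
        "x-ms-access-tier": tier,
        "x-ms-rehydrate-priority": priority,
        # Authorization, x-ms-date and x-ms-version are omitted in this sketch.
    }
    return "PUT", url, headers


method, url, headers = build_set_blob_tier_request(
    "myaccount", "mycontainer", "myblob", tier="Hot", priority="High"
)
print(method, url)
print(headers)
```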
Azure Blob storage client library
Class | Description |
---|---|
BlobClient | The BlobClient allows you to manipulate Azure Storage blobs. |
BlobClientOptions | Provides the client configuration options for connecting to Azure Blob Storage. |
BlobContainerClient | The BlobContainerClient allows you to manipulate Azure Storage containers and their blobs. |
BlobServiceClient | The BlobServiceClient allows you to manipulate Azure Storage service resources and blob containers. The storage account provides the top-level namespace for the Blob service. |
BlobUriBuilder | The BlobUriBuilder class provides a convenient way to modify the contents of a Uri instance to point to different Azure Storage resources like an account, container, or blob. |
Creating Objects
- Client object
- Interacts with three types of resources in the storage service: storage accounts, containers and blobs
- Pass a URI referencing the endpoint to the client constructor
- BlobServiceClient object:
- Allows your app to interact with resources at the storage account level
- Provides methods to retrieve and configure account properties
- Can also list, create and delete containers within the storage account
using System; using Azure.Identity; using Azure.Storage.Blobs; public BlobServiceClient GetBlobServiceClient(string accountName) { BlobServiceClient client = new( new Uri($"https://{accountName}.blob.core.windows.net"), new DefaultAzureCredential()); return client; }
- BlobContainerClient object:
- Provides methods to create, delete or configure a container
- Can also list, upload and delete blobs within the container
public BlobContainerClient GetBlobContainerClient( BlobServiceClient blobServiceClient, string containerName) { /* Create the container client using the service client object */ BlobContainerClient client = blobServiceClient.GetBlobContainerClient(containerName); return client; }
- BlobClient object:
- Interact with a specific blob resource
public BlobClient GetBlobClient( BlobServiceClient blobServiceClient, string containerName, string blobName) { BlobClient client = blobServiceClient.GetBlobContainerClient(containerName).GetBlobClient(blobName); return client; }
Manage Container Properties and Metadata by Using .NET
- Blob containers support system properties and user-defined metadata as well
- System properties exist on each Blob storage resource, some correspond to certain standard HTTP headers
- User-defined metadata consists of one or more name-value pairs specified for a Blob resource that you can use for your own purpose
- Retrieve container properties
- Call GetProperties or GetPropertiesAsync on the BlobContainerClient class
- Set and retrieve metadata
- Call SetMetadata or SetMetadataAsync on the BlobContainerClient class, passing it an IDictionary object containing name-value pairs
- Using REST:
- Metadata header format: x-ms-meta-name:string-value
- Total size of all metadata pairs can be up to 8 KB in size
- All pairs are valid HTTP headers, adhering to all restrictions governing HTTP headers
- Operations on metadata:
- Retrieving properties and metadata:
- Setting metadata headers:
- Standard HTTP properties for containers and blobs:
- Containers:
- ETag
- Last-Modified
- Blobs:
- ETag
- Last-Modified
- Content-Length
- Content-Type
- Content-MD5
- Content-Encoding
- Content-Language
- Cache-Control
- Origin
- Range
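The `x-ms-meta-name:string-value` header format and the 8 KB limit from the REST notes above can be sketched as a small helper that turns name-value pairs into headers. How the service counts the 8 KB is not stated in these notes, so the size check below (UTF-8 bytes of names plus values) is an assumption, and the function name `metadata_headers` is hypothetical.

```python
MAX_METADATA_BYTES = 8 * 1024  # total size limit for all metadata pairs


def metadata_headers(metadata: dict) -> dict:
    """Convert name-value pairs into x-ms-meta-* request headers."""
    # Assumption: size is measured as UTF-8 bytes of every name and value combined.
    total = sum(len(n.encode("utf-8")) + len(v.encode("utf-8")) for n, v in metadata.items())
    if total > MAX_METADATA_BYTES:
        raise ValueError(f"metadata is {total} bytes, limit is {MAX_METADATA_BYTES}")
    return {f"x-ms-meta-{name}": value for name, value in metadata.items()}


headers = metadata_headers({"docType": "textDocuments", "category": "guidance"})
print(headers)
# {'x-ms-meta-docType': 'textDocuments', 'x-ms-meta-category': 'guidance'}
```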