Azure OpenAI Service
- Enables you to develop AI solutions that benefit from the security, scalability, and integration with other services on the Azure cloud platform
Creating an Azure OpenAI resource
- Can create only two Azure OpenAI resources per region
- NOTE: Certain models are only available in specific regions
- Azure Portal
- Navigate to Home > Cognitive Services | Azure OpenAI
- Provide a subscription, resource group, region, a unique instance name, and select a pricing tier
- Azure CLI (example)
- MyOpenAIResource: Unique name for the resource
- OAIResourceGroup: Resource group name
- eastus: Region to deploy your resource
- subscriptionID: Replace with your subscriptionID
az cognitiveservices account create \
    -n MyOpenAIResource \
    -g OAIResourceGroup \
    -l eastus \
    --kind OpenAI \
    --sku s0 \
    --subscription subscriptionID
Using Azure AI Studio
- Accessible via Azure Portal after creating an Azure OpenAI resource
- Provides access to model management, deployment, experimentation, customisation and learning resources
- Deploy a model in the Deployments section
Types of Generative AI Models
- In Azure AI Studio, go to Models Catalog to view the available base models
- Types of base models:
- GPT-4: The latest generation of GPT (generative pre-trained transformer) models; can generate natural language and code completions
- GPT-3.5: Can also generate natural language and code completions; the GPT-35-turbo models in particular are optimised for chat-based interactions
- Embeddings models: Convert text into numeric vectors; useful in language analytics scenarios such as comparing text sources for similarity (see the sketch after this list)
- DALL-E models: Generate images from natural language prompts. Currently in preview; they don't need to be explicitly deployed
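- Python example (sketch): A rough illustration, not from these notes, of using an embeddings deployment to compare two text sources for similarity. It assumes the openai Python package (v1.x); the endpoint, API key, API version and deployment name (my-embedding-model) are placeholders
# Rough sketch: embed two texts with an Azure OpenAI embeddings deployment
# and compare them with cosine similarity. Endpoint, key, api_version and
# the deployment name "my-embedding-model" are placeholder assumptions.
import math
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://MyOpenAIResource.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

def embed(text):
    # Return the embedding vector for a piece of text
    response = client.embeddings.create(model="my-embedding-model", input=text)
    return response.data[0].embedding

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = embed("The cat sat on the mat")
v2 = embed("A cat is sitting on a rug")
print(cosine_similarity(v1, v2))  # closer to 1.0 means more similar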
Deploy Generative AI Models
- Once a model is deployed, you can chat with it or make API calls to receive responses to prompts
- Select a base model
- Deploy any number of models, as long as you stay within your quota
- Using AI Studio:
- In Deployments page, select a model
- Using CLI (example):
az cognitiveservices account deployment create \
    -g OAIResourceGroup \
    -n MyOpenAIResource \
    --deployment-name MyModel \
    --model-name gpt-35-turbo \
    --model-version "0301" \
    --model-format OpenAI \
    --sku-name "Standard" \
    --sku-capacity 1
- Using REST: Deployments can also be created via the REST API, specifying the base model in the request body (see the sketch below)
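- REST example (sketch): A rough sketch of creating a deployment through the Azure Resource Manager REST API from Python. The resource path, request body shape and api-version are assumptions based on the Microsoft.CognitiveServices/accounts/deployments resource type and may need adjusting; subscription, resource group and resource names are placeholders
# Rough sketch: create a model deployment via the ARM REST API.
# Path, body shape and api-version are assumptions; adjust for your environment.
import requests
from azure.identity import DefaultAzureCredential

# Acquire a bearer token for Azure Resource Manager
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    "https://management.azure.com/subscriptions/<subscriptionID>"
    "/resourceGroups/OAIResourceGroup"
    "/providers/Microsoft.CognitiveServices/accounts/MyOpenAIResource"
    "/deployments/MyModel?api-version=2023-05-01"
)

body = {
    "sku": {"name": "Standard", "capacity": 1},
    "properties": {
        "model": {"format": "OpenAI", "name": "gpt-35-turbo", "version": "0301"}
    },
}

response = requests.put(url, json=body, headers={"Authorization": f"Bearer {token}"})
print(response.status_code, response.json())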
Using Prompts
- A prompt is the text portion of a request that is sent to the deployed model's endpoint
- Responses are completions in the form of text, code, images, etc.
- Completion quality can be affected by:
- The way a prompt is engineered
- Model parameters
- Data the model is trained on
- Make calls via the REST API, Python, C#, or from AI Studio (see the Python sketch below)
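- Python example (sketch): A minimal illustration of sending a prompt to a deployed chat model, assuming the openai Python package (v1.x) and the MyModel deployment from the earlier CLI example; endpoint, key and API version are placeholders
# Minimal sketch: send a prompt to a deployed chat model and print the completion.
# Endpoint, key, api_version and the "MyModel" deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://MyOpenAIResource.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="MyModel",  # the deployment name, not the base model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise Azure OpenAI Service in one sentence."},
    ],
)

print(response.choices[0].message.content)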
AI Studio Playground
- An interface for experimenting with deployed models without needing to develop your own client application
- Playground parameters (see the sketch after this list for how they map to API parameters):
- Temperature: Controls randomness. Lower temperature means more repetitive and deterministic responses
- Max length (tokens): Limit on the number of tokens per model response
- Stop sequences: Make responses stop at a desired point (e.g. the end of a sentence or list)
- Top probabilities (Top P): Controls randomness in a similar way to temperature, but using a different method
- Frequency penalty: Reduce the chance of repeating a token proportionally to how often it has appeared in the text so far
- Presence penalty: Reduce the chance of repeating any token that has appeared at all in the text so far
- Pre-response text: Insert text after the user's input and before the model's response
- Post-response text: Insert text after the model's generated response to encourage further user input
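- Python example (sketch): Most playground settings correspond to request parameters on the chat completions API; a rough sketch (same placeholder endpoint, key and deployment as above) of passing them in code. Pre-response and post-response text have no single API parameter and are instead handled through the prompt/messages
# Rough sketch: playground settings expressed as chat completions parameters.
# Endpoint, key and deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://MyOpenAIResource.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="MyModel",
    messages=[{"role": "user", "content": "List three uses of embeddings models."}],
    temperature=0.7,        # Temperature: lower = more repetitive/deterministic
    max_tokens=200,         # Max length (tokens)
    stop=["\n\n"],          # Stop sequences
    top_p=0.95,             # Top probabilities (Top P)
    frequency_penalty=0.5,  # Frequency penalty
    presence_penalty=0.0,   # Presence penalty
)

print(response.choices[0].message.content)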