Azure OpenAI Service
- Enables you to develop AI solutions that benefit from the security, scalability, and integration with other services on the Azure cloud platform
Creating an Azure OpenAI resource
- Can create only two Azure OpenAI resources per region
- NOTE: Certain models are only available in specific regions
- Azure Portal
- Navigate to Home > Cognitive Services | Azure OpenAI
- Provide a subscription, resource group, region, a unique instance name, and select a pricing tier
- Azure CLI (example)
- MyOpenAIResource: Unique name for the resource
- OAIResourceGroup: Resource group name
- eastus: Region to deploy your resource
- subscriptionID: Replace with your subscriptionID
az cognitiveservices account create \
    -n MyOpenAIResource \
    -g OAIResourceGroup \
    -l eastus \
    --kind OpenAI \
    --sku s0 \
    --subscription subscriptionID
Using Azure AI Studio
- Accessible via Azure Portal after creating an Azure OpenAI resource
- Provides access to model management, deployment, experimentation, customisation and learning resources
- Deploy a model in the Deployments section
Types of Generative AI Models
- In Azure AI Studio, go to Models Catalog to view the available base models
- Types of base models:
- GPT-4: The latest generation of GPT (generative pre-trained transformer) models; can generate natural language and code completions
- GPT-3.5: Can also generate natural language and code completions; the GPT-35-turbo models in particular are optimised for chat-based interactions
- Embeddings models: Convert text into numeric vectors; useful in language analytics scenarios such as comparing text sources for similarity (see the sketch after this list)
- DALL-E models: Generate images from natural language prompts. Currently in preview; they don't need to be explicitly deployed
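- Python example (sketch): A rough illustration, not from these notes, of using an embeddings deployment to compare two text sources for similarity. It assumes the openai Python package (v1.x); the endpoint, API key, API version and deployment name (my-embedding-model) are placeholders
# Rough sketch: embed two texts with an Azure OpenAI embeddings deployment
# and compare them with cosine similarity. Endpoint, key, api_version and
# the deployment name "my-embedding-model" are placeholder assumptions.
import math
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://MyOpenAIResource.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

def embed(text):
    # Return the embedding vector for a piece of text
    response = client.embeddings.create(model="my-embedding-model", input=text)
    return response.data[0].embedding

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = embed("The cat sat on the mat")
v2 = embed("A cat is sitting on a rug")
print(cosine_similarity(v1, v2))  # closer to 1.0 means more similar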
Deploy Generative AI Models
- Once a model is deployed, you can chat with it or make API calls to receive responses to prompts
- Select a base model
- Deploy any number of models, as long as you stay within your quota
- Using AI Studio:
- In Deployments page, select a model
- Using CLI (example):
az cognitiveservices account deployment create \
    -g OAIResourceGroup \
    -n MyOpenAIResource \
    --deployment-name MyModel \
    --model-name gpt-35-turbo \
    --model-version "0301" \
    --model-format OpenAI \
    --sku-name "Standard" \
    --sku-capacity 1
- Using REST: Deployments can also be created via the REST API, specifying the base model in the request body (see the sketch below)
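- REST example (sketch): A rough sketch of creating a deployment through the Azure Resource Manager REST API from Python. The resource path, request body shape and api-version are assumptions based on the Microsoft.CognitiveServices/accounts/deployments resource type and may need adjusting; subscription, resource group and resource names are placeholders
# Rough sketch: create a model deployment via the ARM REST API.
# Path, body shape and api-version are assumptions; adjust for your environment.
import requests
from azure.identity import DefaultAzureCredential

# Acquire a bearer token for Azure Resource Manager
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    "https://management.azure.com/subscriptions/<subscriptionID>"
    "/resourceGroups/OAIResourceGroup"
    "/providers/Microsoft.CognitiveServices/accounts/MyOpenAIResource"
    "/deployments/MyModel?api-version=2023-05-01"
)

body = {
    "sku": {"name": "Standard", "capacity": 1},
    "properties": {
        "model": {"format": "OpenAI", "name": "gpt-35-turbo", "version": "0301"}
    },
}

response = requests.put(url, json=body, headers={"Authorization": f"Bearer {token}"})
print(response.status_code, response.json())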
Using Prompts
- A prompt is the text portion of a request that is sent to the deployed model's endpoint
- Responses are completions in the form of text, code, images, etc.
- Completion quality can be affected by:
- The way a prompt is engineered
- Model parameters
- Data the model is trained on
- Make calls via the REST API, Python, C#, or from AI Studio (see the Python sketch below)
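- Python example (sketch): A minimal illustration of sending a prompt to a deployed chat model, assuming the openai Python package (v1.x) and the MyModel deployment from the earlier CLI example; endpoint, key and API version are placeholders
# Minimal sketch: send a prompt to a deployed chat model and print the completion.
# Endpoint, key, api_version and the "MyModel" deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://MyOpenAIResource.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="MyModel",  # the deployment name, not the base model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise Azure OpenAI Service in one sentence."},
    ],
)

print(response.choices[0].message.content)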
AI Studio Playground
- An interface for experimenting with deployed models without needing to develop your own client application
- Playground parameters (see the sketch after this list for how they map to API parameters):
- Temperature: Controls randomness. Lower temperature means more repetitive and deterministic responses
- Max length (tokens): Limit on the number of tokens per model response
- Stop sequences: Make responses stop at a desired point (e.g. the end of a sentence or list)
- Top probabilities (Top P): Controls randomness in a similar way to temperature, but using a different method
- Frequency penalty: Reduce the chance of repeating a token proportionally to how often it has appeared in the text so far
- Presence penalty: Reduce the chance of repeating any token that has appeared at all in the text so far
- Pre-response text: Insert text after the user's input and before the model's response
- Post-response text: Insert text after the model's generated response to encourage further user input
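- Python example (sketch): Most playground settings correspond to request parameters on the chat completions API; a rough sketch (same placeholder endpoint, key and deployment as above) of passing them in code. Pre-response and post-response text have no single API parameter and are instead handled through the prompt/messages
# Rough sketch: playground settings expressed as chat completions parameters.
# Endpoint, key and deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://MyOpenAIResource.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="MyModel",
    messages=[{"role": "user", "content": "List three uses of embeddings models."}],
    temperature=0.7,        # Temperature: lower = more repetitive/deterministic
    max_tokens=200,         # Max length (tokens)
    stop=["\n\n"],          # Stop sequences
    top_p=0.95,             # Top probabilities (Top P)
    frequency_penalty=0.5,  # Frequency penalty
    presence_penalty=0.0,   # Presence penalty
)

print(response.choices[0].message.content)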