
Azure provides the OpenAI service to address the concerns of companies and government agencies that have strict security regulations but still want to leverage the power of AI.
Most likely you’ve used one of the many AI offerings out there. OpenAI’s ChatGPT, Google Bard, Google PaLM with MakerSuite, Perplexity AI, HuggingChat, and many more are at the center of the latest hype, and software companies are racing to integrate them into their products. The usual route is to buy a subscription and connect to the providers that expose their API over the internet, but as a DevSecOps engineer, here’s where the fun starts.
Many companies following good security practices block traffic to and from the internet, so the first step in all this is opening the firewall. Next, you must protect the credentials of the API user so that a compromise doesn’t expose what you are working on. Finally, you have to trust that OpenAI is not using your data to train its models and that it is keeping your company’s data safe.
It can take a great deal of time to plan, design, and deploy a secure infrastructure for using large language models, and unless you have a very specific use case, building your own may be overkill.
Here’s a breakdown of a few infrastructure highlights about this service.
3 Main Features
Privacy and Security
Your ChatGPT-like interface, called Azure AI Studio, runs in your private subscription. It can be linked to one of your VNets so that you can use internal routing, and you can also add private endpoints so that you don’t even have to expose it over the internet.

Even if you have to use it over the internet, you can lock it down to allow only your public IPs, and your developers will also need an API key for authentication, which can be rotated on a schedule (for example, monthly) via scripting.
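To make the authentication piece concrete, here is a minimal sketch of building a request against an Azure OpenAI deployment, with the rotated key carried in the `api-key` header rather than the URL. The resource name, deployment name, and key value below are placeholders, and the `api-version` shown is one of the published GA versions; substitute the values from your own subscription.

```python
# Sketch: building an authenticated request to an Azure OpenAI deployment.
# RESOURCE, DEPLOYMENT, and the key are placeholders -- use your own values.
import json
import urllib.request

RESOURCE = "my-company-openai"   # assumed Azure OpenAI resource name
DEPLOYMENT = "gpt-4-dev"         # assumed model deployment name
API_VERSION = "2023-05-15"       # a published GA REST api-version

def build_chat_request(api_key: str, messages: list) -> urllib.request.Request:
    """Return a ready-to-send POST request; the api-key header carries the
    rotated key, so no secret ever appears in the URL or logs of the URL."""
    url = (
        f"https://{RESOURCE}.openai.azure.com/openai/deployments/"
        f"{DEPLOYMENT}/chat/completions?api-version={API_VERSION}"
    )
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(
    "<key-from-key-vault>", [{"role": "user", "content": "ping"}]
)
print(req.full_url)
```

Because the key is passed in at call time, it can live in a secret store such as Key Vault and be swapped out on every rotation without touching the code.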

Pricing

Common Models
- GPT-4 Series: The GPT-4 models are like super-smart computers that can understand and generate human-like text. They can help with things like understanding what people are saying, writing stories or articles, and even translating languages.
Key Differences from GPT-3:
- Model Size: GPT-4 models tend to be larger in terms of parameters compared to GPT-3. Larger models often have more capacity to understand and generate complex text, potentially resulting in improved performance.
- Training Data: GPT-4 models might have been trained on a more extensive and diverse dataset, potentially covering a broader range of topics and languages. This expanded training data can enhance the model’s knowledge and understanding of different subjects.
- Improved Performance: GPT-4 models are likely to demonstrate enhanced performance across various natural language processing tasks. This improvement can include better language comprehension, generating more accurate and coherent text, and understanding context more effectively.
- Fine-tuning Capabilities: GPT-4 might introduce new features or techniques that allow for more efficient fine-tuning of the model. Fine-tuning refers to the process of training a pre-trained model on a specific dataset or task to make it more specialized for that particular use case.
- Contextual Understanding: GPT-4 models might have an improved ability to understand context in a more sophisticated manner. This could allow for a deeper understanding of long passages of text, leading to more accurate responses and better contextual awareness in conversation.
- GPT-3 Base Series: These models are also really smart and can do similar things to GPT-4. They can generate text for writing, help translate languages, complete sentences, and understand how people feel based on what they write.
- Codex Series: The Codex models are designed for programming tasks. They can understand and generate computer code. This helps programmers write code faster, get suggestions for completing code, and even understand and improve existing code.
- Embeddings Series: The Embeddings models are like special tools for understanding text. They can turn words and sentences into numbers that computers can understand. These numbers can be used to do things like classify text into different categories, find information that is similar to what you’re looking for, and even figure out how people feel based on what they write.
Getting Access to it!
Although the service is Generally Available (GA), it is only offered in East US and West Europe. You also have to submit an application so that Microsoft can review your company and use case and approve or deny your request. This is likely due to capacity and to let Microsoft gather information on how companies will be using the service.
The application is here: https://aka.ms/oai/access
Based on research and experience getting this approved for my clients, I always recommend picking only what you initially need and not getting too greedy. It would also be wise to speak with your Microsoft rep and take them out for a beer! For example, if you just need code generation, select only the Codex option.
Lately the service has been easier to get; hopefully soon we won’t need the form-and-approval dance.
