OpenWebUI & Ollama: Experience AI on Your Terms with Local Hosting
In an era where AI capabilities are locked behind subscriptions and cloud-based services, OpenWebUI and Ollama provide a powerful alternative that prioritizes privacy, security, and cost efficiency. These open-source tools are changing how organizations and individuals harness AI while maintaining complete control over the models and data they use.
Why use local LLMs? #1 Uncensored Models
One significant advantage of local deployment through Ollama is the ability to run a model of your choosing, including unrestricted LLMs. While cloud-based AI services often apply limitations and filters to their models to maintain content control and reduce liability, locally hosted models can be used without these restrictions. This provides several benefits:
- Complete control over model behavior and outputs
- Ability to fine-tune models for specific use cases without limitations
- Access to open-source models with different training approaches
- Freedom to experiment with model parameters and configurations
- No artificial constraints on content generation or topic exploration
This flexibility is particularly valuable for research, creative applications, and specialized industry use cases where standard content filters might interfere with legitimate work.
Here’s an amazing article from Eric Hartford on the topic: Uncensored Models

Why use local LLMs? #2 Privacy
When running AI models locally through Ollama and OpenWebUI, all data processing occurs on your own infrastructure. This means:
- Sensitive data never leaves your network perimeter
- No third-party access to your queries or responses
- Complete control over data retention and deletion policies
- Compliance with data sovereignty requirements
- Protection from cloud provider data breaches
Implementation
Requirements:
- Docker
- NVIDIA Container Toolkit (Optional but Recommended)
- GPU + NVIDIA CUDA Installation (Optional but Recommended)
Step 1: Install Ollama
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:latest
Step 2: Launch Open WebUI with the new features
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
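After both containers start, a quick check confirms Ollama and Open WebUI are reachable on their published ports. Below is a minimal sketch using Python's requests library, assuming the default ports from the commands above:

import requests

# Ollama answers on the port published above (11434); the root path returns a liveness message.
print(requests.get("http://localhost:11434/").text)          # expected: "Ollama is running"

# Open WebUI is published on port 3000; a 200 status code means the UI is up.
print(requests.get("http://localhost:3000/").status_code)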
Need help setting up Docker and the NVIDIA Container Toolkit?
Open WebUI and Ollama
OpenWebUI provides a sophisticated interface for interacting with locally hosted models while maintaining all the security benefits of local deployment. Key features include:
- Intuitive chat interface similar to popular cloud-based AI services
- Support for multiple concurrent model instances
- Built-in prompt templates and history management
- Customizable UI themes and layouts
- API integration capabilities for internal applications
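Because Ollama exposes a local REST API, the same models behind Open WebUI can be called from your own applications. Here is a minimal sketch, assuming the default port 11434 and a model such as llama3 that has already been pulled:

import requests

# Non-streaming generation request against the local Ollama API.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",               # any model you have pulled locally
        "prompt": "Explain why local LLM hosting helps with data sovereignty.",
        "stream": False,                 # return a single JSON object instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])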
Ollama simplifies the process of running AI models locally while providing robust security features:
- Easy model installation and version management
- Efficient resource utilization through optimized inference
- Support for custom model configurations
- Built-in model verification and integrity checking
- Container-friendly architecture for isolated deployments
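Model installation and version management can also be scripted against the same API. A small sketch that lists locally installed models and pulls a new one, assuming the default endpoint:

import requests

BASE = "http://localhost:11434"

# List the models currently installed on this host.
for m in requests.get(f"{BASE}/api/tags", timeout=30).json().get("models", []):
    print(m["name"], m.get("size"))

# Pull a model by name; Ollama streams progress lines while downloading.
with requests.post(f"{BASE}/api/pull", json={"name": "llama3"}, stream=True, timeout=None) as r:
    for line in r.iter_lines():
        if line:
            print(line.decode())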
Deploying Azure Functions with Azure DevOps: 3 Must-Dos! Code Security Included
Azure Functions is a serverless compute service that allows you to run your code in response to various events, without the need to manage any infrastructure. Azure DevOps, on the other hand, is a set of tools and services that help you build, test, and deploy your applications more efficiently. Combining these two powerful tools can streamline your Azure Functions deployment process and ensure a smooth, automated workflow.
In this blog post, we’ll explore three essential steps to consider when deploying Azure Functions using Azure DevOps.
1. Ensure Consistent Python Versions
When working with Azure Functions, it’s crucial to ensure that the Python version used in your build pipeline matches the Python version configured in your Azure Function. Mismatched versions can lead to unexpected runtime errors and deployment failures.
To ensure consistency, follow these steps:
- Determine the Python version required by your Azure Function. You can find this information in the requirements.txt file or the host.json file in your Azure Functions project.
- In your Azure DevOps pipeline, use the UsePythonVersion task to set the Python version to match the one required by your Azure Function.

  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.9'
      addToPath: true

- Verify the Python version in your pipeline by running python --version and ensuring it matches the version specified in the previous step.
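If you want an explicit guard rather than a manual check, a tiny script can fail the build early when the interpreter doesn't match. This is only a sketch; the expected version (3.9) mirrors the example above.

# check_python_version.py - fail fast if the pipeline's interpreter doesn't match
import sys

EXPECTED = (3, 9)  # keep in sync with versionSpec in the pipeline

if sys.version_info[:2] != EXPECTED:
    raise SystemExit(
        f"Expected Python {EXPECTED[0]}.{EXPECTED[1]}, "
        f"got {sys.version_info.major}.{sys.version_info.minor}"
    )
print("Python version OK:", sys.version)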
2. Manage Environment Variables Securely
Azure Functions often require access to various environment variables, such as database connection strings, API keys, or other sensitive information. When deploying your Azure Functions using Azure DevOps, it’s essential to handle these environment variables securely.
Here’s how you can approach this:
- Store your environment variables as secret variables in an Azure DevOps variable group or as Azure Key Vault secrets.
- In your Azure DevOps pipeline, use the appropriate task to retrieve and set the environment variables. For example, you can use the AzureKeyVault task to fetch secrets from Azure Key Vault.

  - task: AzureKeyVault@1
    inputs:
      azureSubscription: 'Your_Azure_Subscription_Connection'
      KeyVaultName: 'your-keyvault-name'
      SecretsFilter: '*'
      RunAsPreJob: false
- Ensure that your pipeline has the necessary permissions to access the Azure Key Vault or Service Connections.
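At runtime, Azure Functions surfaces application settings as environment variables, so your function code reads the secret like any other setting. A minimal sketch, using a hypothetical setting name SQL_CONNECTION_STRING:

import os

# "SQL_CONNECTION_STRING" is a hypothetical application setting name used for illustration.
connection_string = os.environ.get("SQL_CONNECTION_STRING")
if connection_string is None:
    raise RuntimeError("SQL_CONNECTION_STRING is not configured for this Function App")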
3. Implement Continuous Integration and Continuous Deployment (CI/CD)
To streamline the deployment process, it’s recommended to set up a CI/CD pipeline in Azure DevOps. This will automatically build, test, and deploy your Azure Functions whenever changes are made to your codebase.
Here’s how you can set up a CI/CD pipeline:
- Create an Azure DevOps Pipeline and configure it to trigger on specific events, such as a push to your repository or a pull request.
- In the pipeline, include steps to build, test, and package your Azure Functions project.
- Add a deployment task to the pipeline to deploy your packaged Azure Functions to the target Azure environment.
# CI/CD pipeline
trigger:
- main

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '3.9'
    addToPath: true

- script: |
    pip install -r requirements.txt
  displayName: 'Install dependencies'

- task: AzureFunctionApp@1
  inputs:
    azureSubscription: 'Your_Azure_Subscription_Connection'
    appName: 'your-function-app-name'
    appType: 'functionApp'
    package: '$(System.DefaultWorkingDirectory)'
    deployToSlotOrASE: true
    resourceGroupName: 'your-resource-group-name'
    slotName: 'production'
By following these three essential steps, you can ensure a smooth and reliable deployment of your Azure Functions using Azure DevOps, maintaining consistency, security, and automation throughout the process.
Bonus: Embrace DevSecOps with Code Security Checks
As part of your Azure DevOps pipeline, it’s crucial to incorporate security checks to ensure the integrity and safety of your code. This is where the principles of DevSecOps come into play, where security is integrated throughout the software development lifecycle.
Here’s how you can implement code security checks in your Azure DevOps pipeline:
- Use Bandit for Python Code Security: Bandit is a popular open-source tool that analyzes Python code for common security issues. You can integrate Bandit into your Azure DevOps pipeline to automatically scan your Azure Functions code for potential vulnerabilities.
- script: |
    pip install bandit
    bandit -r your-functions-directory -f json -o bandit_report.json
  displayName: 'Run Bandit Security Scan'

- task: PublishBuildArtifacts@1
  inputs:
    PathtoPublish: 'bandit_report.json'
    ArtifactName: 'bandit-report'
    publishLocation: 'Container'
- Leverage the Safety Tool for Dependency Scanning: Safety is another security tool that checks your Python dependencies for known vulnerabilities. Integrate this tool into your Azure DevOps pipeline to ensure that your Azure Functions are using secure dependencies.
- script: |
    pip install safety
    safety check --full-report
  displayName: 'Run Safety Dependency Scan'
- Review Security Scan Results: After running the Bandit and Safety scans, review the generated reports and address any identified security issues before deploying your Azure Functions. You can publish the reports as build artifacts in Azure DevOps for easy access and further investigation.
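To make that review step enforceable, the pipeline can parse the Bandit JSON report and fail the build when high-severity findings appear. A minimal sketch, assuming the bandit_report.json produced above:

import json
import sys

# Parse the Bandit JSON report produced earlier in the pipeline.
with open("bandit_report.json") as f:
    report = json.load(f)

high_findings = [
    r for r in report.get("results", [])
    if r.get("issue_severity") == "HIGH"
]

for finding in high_findings:
    print(f"{finding['filename']}:{finding['line_number']} - {finding['issue_text']}")

if high_findings:
    sys.exit(f"{len(high_findings)} high-severity Bandit finding(s); failing the build.")
print("No high-severity Bandit findings.")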
By incorporating these DevSecOps practices into your Azure DevOps pipeline, you can ensure that your Azure Functions are not only deployed efficiently but also secure and compliant with industry best practices.
The AI-Driven Evolution of Databases
The hype around Artificial Intelligence (AI) and Retrieval-Augmented Generation (RAG) is reshaping databases and how they are architected. Traditional database management systems (DBMS) are being redefined to harness the capabilities of AI, transforming how data is stored, retrieved, and utilized. In this article I share some of the shifts happening right now as databases evolve to work well with AI.
1. Vectorization and Embedding Integration
Traditional databases store data in structured formats, typically as rows and columns in tables. However, with the rise of AI, there is a need to store and query high-dimensional data such as vectors (embeddings), which represent complex data types like images, audio, and natural language.
- Embedding Vectors: When new data is inserted into the database, it can be vectorized using machine learning models, converting the data into embedding vectors. This allows for efficient similarity searches and comparisons. For example, inserting a new product description could automatically generate an embedding that captures its semantic meaning.
- Vector Databases: Specialized vector databases like Pinecone, Weaviate, FAISS (Facebook AI Similarity Search) and Azure AI Search are designed to handle and index vectorized data, enabling fast and accurate similarity searches and nearest neighbor queries.
A great example is PostgreSQL, which can be extended to handle high-dimensional vector data efficiently using the pgvector extension. This capability is particularly useful for applications involving machine learning, natural language processing, and other AI-driven tasks that rely on vector representations of data.
What is pgvector?
pgvector is an extension for PostgreSQL that enables the storage, indexing, and querying of vector data. Vectors are often used to represent data in a high-dimensional space, such as word embeddings in NLP, feature vectors in machine learning, and image embeddings in computer vision.
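As a quick illustration, here is a minimal sketch of pgvector in use from Python with the psycopg2 driver, assuming a local PostgreSQL instance with the extension available; the connection string, table, and column names are made up for the example.

import psycopg2

conn = psycopg2.connect("dbname=demo user=postgres password=postgres host=localhost")
cur = conn.cursor()

# Enable the extension and create a table with a 3-dimensional vector column
# (real embeddings typically have hundreds or thousands of dimensions).
cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
cur.execute("CREATE TABLE IF NOT EXISTS items (id bigserial PRIMARY KEY, embedding vector(3))")

# Insert a few embeddings as vector literals.
cur.execute("INSERT INTO items (embedding) VALUES ('[1,1,1]'), ('[2,2,2]'), ('[1,1,2]')")

# Nearest-neighbor search: <-> is pgvector's Euclidean (L2) distance operator.
cur.execute("SELECT id, embedding FROM items ORDER BY embedding <-> '[1,1,1.5]' LIMIT 2")
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()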
2. Enhanced Indexing Techniques
One of the main changes needed to support AI is that your index must now support approximate nearest neighbor (ANN) queries against vector data. A typical query is "find me the top N vectors that are most similar to this one." Each vector may have hundreds or thousands of dimensions, and similarity is based on overall distance across all of those dimensions. A regular B-tree or hash index is useless for this kind of query, so new index types are provided as part of pgvector on PostgreSQL, or you can use Pinecone, Milvus, and the many other solutions being developed as AI keeps demanding data; these are more specialized for such workloads.
Databases are adopting hybrid indexing techniques that combine traditional indexing methods (B-trees, hash indexes) with AI-driven indexes such as neural hashes and inverted indexes for text and multimedia data.
- AI-Driven Indexing: Machine learning algorithms can optimize index structures by predicting access patterns and preemptively loading relevant data into memory, reducing query response times.
What is an Approximate Nearest Neighbor (ANN) search? It's an algorithm that finds a data point in a data set that's very close to the given query point, but not necessarily the absolute closest one. An exact nearest-neighbor (NN) algorithm searches exhaustively through all the data to find the perfect match, whereas an ANN algorithm settles for a match that's close enough.
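The difference is easy to see with FAISS (mentioned above): a flat index scans every vector exhaustively, while an HNSW index answers the same query approximately but much faster at scale. A minimal sketch, assuming the faiss-cpu and numpy packages are installed:

import numpy as np
import faiss

d = 128                                                  # vector dimensionality
db = np.random.random((10_000, d)).astype("float32")     # the "stored" vectors
query = np.random.random((1, d)).astype("float32")

# Exact nearest-neighbor search: brute-force scan over all vectors.
flat = faiss.IndexFlatL2(d)
flat.add(db)
exact_dist, exact_ids = flat.search(query, 5)

# Approximate nearest-neighbor search: HNSW graph index.
hnsw = faiss.IndexHNSWFlat(d, 32)                        # 32 = graph neighbors per node
hnsw.add(db)
approx_dist, approx_ids = hnsw.search(query, 5)

print("exact top-5:", exact_ids[0])
print("approx top-5:", approx_ids[0])                    # usually overlaps heavily, not guaranteed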
3. Automated Data Management and Maintenance
AI-driven databases can automatically adjust configurations and optimize performance based on workload analysis. This includes automatic indexing, query optimization, and resource allocation.
- Adaptive Query Optimization: AI models predict the best execution plans for queries by learning from historical data, continuously improving query performance over time.
- Predictive Maintenance: Machine learning models can predict hardware failures and performance degradation, allowing for proactive maintenance and minimizing downtime.
Some examples:
- Azure SQL Database offers built-in AI features such as automatic tuning, which includes automatic indexing and query performance optimization. Azure DBs also provide insights with machine learning to analyze database performance and recommend optimizations.
- Google BigQuery incorporates machine learning to optimize query execution and manage resources efficiently and allows users to create and execute machine learning models directly within the database.
- Amazon Aurora utilizes machine learning to optimize queries, predict database performance issues, and automate database management tasks such as indexing and resource allocation. They also integrate machine learning capabilities directly into the database, allowing for real-time predictions and automated adjustments.
Wrap-Up
The landscape of database technology is rapidly evolving, driven by the need to handle more complex data types, improve performance, and integrate seamlessly with machine learning workflows. Innovations like vectorization during inserts, enhanced indexing techniques, and automated data management are at the forefront of this transformation. As these technologies continue to mature, databases will become even more powerful, enabling new levels of efficiency, intelligence, and security in data management.
Understanding GPU Evolution and Deployment in Datacenter or Cloud

In the current hype around Artificial Intelligence we find ourselves working a lot more with GPUs, which is very fun! GPU workloads can run on physical servers, VMs, or containers. Each approach has its own advantages and disadvantages.
Physical servers offer the highest performance and flexibility, but they are also the most expensive and complex to manage. VMs provide a good balance between performance, flexibility, resource utilization, and cost. Containers offer the best resource utilization and cost savings, but they may sacrifice some performance and isolation.
For cloud we often recommend containers, since there are services where you can spin up, run your jobs, and spin down. A VM running 24/7 on cloud services can be expensive. The performance trade-off is often addressed by distributing the load, or it simply matters less for development workloads.
The best deployment option for your GPU workloads will depend on your specific needs and requirements. If you need the highest possible performance and flexibility, physical servers are the way to go. If you are looking to improve resource utilization and cost savings, VMs or containers may be a better option.
Deployment | Description | Benefits | Disadvantages |
---|---|---|---|
Physical servers | Each GPU is directly dedicated to a single workload. | Highest performance and flexibility. | Highest cost and complexity. |
VMs | GPUs can be shared among multiple workloads. | Improved resource utilization and cost savings. | Reduced performance and flexibility. |
Containers | GPUs can be shared among multiple workloads running in the same container host. | Improved resource utilization and cost savings. | Reduced performance and isolation. |
Additional considerations:
- GPU virtualization: GPU virtualization technologies such as NVIDIA vGPU and AMD SR-IOV allow you to share a single physical GPU among multiple VMs or containers. This can improve resource utilization and reduce costs, but it may also reduce performance.
- Workload type: Some GPU workloads are more sensitive to performance than others. For example, machine learning and artificial intelligence workloads often require the highest possible performance. Other workloads, such as video encoding and decoding, may be more tolerant of some performance degradation.
- Licensing: Some GPU software applications are only licensed for use on physical servers. If you need to use this type of software, you will need to deploy your GPU workloads on physical servers.
Overall, the best way to choose the right deployment option for your GPU workloads is to carefully consider your specific needs and requirements.
Here is a table summarizing the key differences and benefits of each deployment option:
Characteristic | Physical server | VM | Container |
---|---|---|---|
Performance | Highest | High | Medium |
Flexibility | Highest | High | Medium |
Resource utilization | Medium | High | Highest |
Cost | Highest | Medium | Lowest |
Complexity | Highest | Medium | Lowest |
Isolation | Highest | High | Medium |
Containers for Data Scientists on top of Azure Container Apps
The Azure Data Science VMs are good for dev and testing, and even though you could use a virtual machine scale set, that is a heavy and costly solution.
When thinking about scaling, one good solution is to containerize the Anaconda / Python virtual environments and deploy them to Azure Kubernetes Service or better yet, Azure Container Apps, the new abstraction layer for Kubernetes that Azure provides.
Here is a quick way to create a container with Miniconda 3, Pandas, and Jupyter Notebooks to interface with the environment. I also show how to deploy this single test container to Azure Container Apps.
The result:
A Jupyter Notebook with Pandas Running on Azure Container Apps.

Container Build
If you know the libraries you need, it makes sense to start with the lightest base image, which is Miniconda3. You can also deploy the Anaconda3 container, but that one may include libraries you will never use, which creates unnecessary vulnerabilities to remediate.
Miniconda 3: https://hub.docker.com/r/continuumio/miniconda3
Anaconda 3: https://hub.docker.com/r/continuumio/anaconda3
Below is a simple Dockerfile to build a container with the pandas, openai, and tensorflow libraries.

FROM continuumio/miniconda3

RUN conda install jupyter -y --quiet && \
    mkdir -p /opt/notebooks

WORKDIR /opt/notebooks

RUN pip install pandas openai tensorflow

CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
Build and Push the Container
Now that you have the container built, push it to your registry and deploy it on Azure Container Apps. I use Azure DevOps to get the job done.

Here’s the pipeline task:
- task: Docker@2
  inputs:
    containerRegistry: 'dockerRepo'
    repository: 'm05tr0/jupycondaoai'
    command: 'buildAndPush'
    Dockerfile: 'dockerfile'
    tags: |
      $(Build.BuildId)
      latest
Deploy to Azure Container Apps
Deploying to Azure Container Apps was painless once I understood the Azure DevOps task, since I can include my ingress configuration in the same step as the container. The only extra step was configuring DNS in my environment. The DevOps task is well documented, but here are links to the official docs.
Architecture / DNS: https://learn.microsoft.com/en-us/azure/container-apps/networking?tabs=azure-cli
Azure Container Apps Deploy Task : https://github.com/microsoft/azure-pipelines-tasks/blob/master/Tasks/AzureContainerAppsV1/README.md

A few things I'd like to point out: you don't have to provide a username and password for the container registry; the task gets a token from az login. The resource group has to be the one where the Azure Container Apps environment lives; if not, a new one will be created. The target port is where the container listens; see the container build above, where the Jupyter Notebook points to port 8888. If you are using a Container Apps environment with a private VNET, setting the ingress to external means the VNET can reach it, not outside traffic from the internet. Lastly, I disable telemetry to stop reporting.
- task: AzureContainerApps@1
  inputs:
    azureSubscription: 'IngDevOps(XXXXXXXXXXXXXXXXXXXX)'
    acrName: 'idocr'
    dockerfilePath: 'dockerfile'
    imageToBuild: 'idocr.azurecr.io/m05tr0/jupycondaoai'
    imageToDeploy: 'idocr.azurecr.io/m05tr0/jupycondaoai'
    containerAppName: 'datasci'
    resourceGroup: 'IDO-DataScience-Containers'
    containerAppEnvironment: 'idoazconapps'
    targetPort: '8888'
    location: 'East US'
    ingress: 'external'
    disableTelemetry: true
After deployment I had to get the token which was easy with the Log Stream feature under Monitoring. For a deployment of multiple Jupyter Notebooks it makes sense to use JupyterHub.

Azure OpenAI: Private and Secure "ChatGPT-like" Experience for Enterprises

Azure provides the Azure OpenAI service to address the concerns of companies and government agencies that have strong security regulations but want to leverage the power of AI as well.
Most likely you've used one of the many AI offerings out there. OpenAI's ChatGPT, Google Bard, Google PaLM with MakerSuite, Perplexity AI, HuggingChat, and many more have been at the center of the latest hype, and software companies are racing to integrate them into their products. The main route is to buy a subscription and connect to the APIs offered over the internet, but as a DevSecOps engineer, here's where the fun starts.
A lot of companies following good security practices block traffic to and from the internet, so the first task is opening the firewall. Next you must protect the credentials of the API user so they don't get compromised, since access to them would reveal what you are working on. Then you have to trust that OpenAI is not using your data to train their models and that they are keeping your company's data safe.
It could take a ton of time to plan, design and deploy a secured infrastructure for using large language models and unless you have a very specific use case it might be overkill to build your own.
Here’s a breakdown of a few infrastructure highlights about this service.
3 Main Features
Privacy and Security
Your ChatGPT-like interface, called Azure AI Studio, runs in your private subscription. It can be linked to one of your VNETs so that you can use internal routing, and you can also add private endpoints so that you don't even have to use it over the internet.

Even if you have to use it over the internet, you can lock it down to allow only your public IPs, and your developers will also need a token for authentication, which can be scripted to rotate every month.
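For reference, here is a minimal sketch of how an application would call a private Azure OpenAI deployment from Python with the openai package; the endpoint, deployment name, and API version are placeholders for illustration.

import os
from openai import AzureOpenAI

# Endpoint and deployment name are placeholders; the key should come from a secret store.
client = AzureOpenAI(
    azure_endpoint="https://your-resource-name.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="your-gpt4-deployment",   # the deployment name created in Azure AI Studio
    messages=[{"role": "user", "content": "Summarize our data retention policy in two sentences."}],
)
print(response.choices[0].message.content)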

Pricing

Common Models
- GPT-4 Series: The GPT-4 models are like super-smart computers that can understand and generate human-like text. They can help with things like understanding what people are saying, writing stories or articles, and even translating languages.
Key Differences from GPT-3:
- Model Size: GPT-4 models tend to be larger in terms of parameters compared to GPT-3. Larger models often have more capacity to understand and generate complex text, potentially resulting in improved performance.
- Training Data: GPT-4 models might have been trained on a more extensive and diverse dataset, potentially covering a broader range of topics and languages. This expanded training data can enhance the model’s knowledge and understanding of different subjects.
- Improved Performance: GPT-4 models are likely to demonstrate enhanced performance across various natural language processing tasks. This improvement can include better language comprehension, generating more accurate and coherent text, and understanding context more effectively.
- Fine-tuning Capabilities: GPT-4 might introduce new features or techniques that allow for more efficient fine-tuning of the model. Fine-tuning refers to the process of training a pre-trained model on a specific dataset or task to make it more specialized for that particular use case.
- Contextual Understanding: GPT-4 models might have an improved ability to understand context in a more sophisticated manner. This could allow for a deeper understanding of long passages of text, leading to more accurate responses and better contextual awareness in conversation.
- GPT-3 Base Series: These models are also really smart and can do similar things as GPT-4. They can generate text for writing, help translate languages, complete sentences, and understand how people feel based on what they write.
- Codex Series: The Codex models are designed for programming tasks. They can understand and generate computer code. This helps programmers write code faster, get suggestions for completing code, and even understand and improve existing code.
- Embeddings Series: The Embeddings models are like special tools for understanding text. They can turn words and sentences into numbers that computers can understand. These numbers can be used to do things like classify text into different categories, find information that is similar to what you’re looking for, and even figure out how people feel based on what they write.
Getting Access to it!
Although the service is Generally Available (GA), it is only available in East US and West Europe. You also have to submit an application so that Microsoft can review your company and use case and approve or deny your request. This could be due to capacity and to let Microsoft gather information on how companies will be using the service.
The application is here: https://aka.ms/oai/access
Based on research and experience getting this for my clients, I always recommend requesting only what you initially need and not getting too greedy. It would also be wise to speak with your MS rep and take them out for a beer! For example, if you just need code generation, then just select the Codex option.
Lately the service has been easier to get; hopefully soon we won't need the form-and-approval dance.

Easiest Way to Deploy Ubuntu 20.04 with NVIDIA Drivers and the Latest CUDA toolkit via Packer.

I am building an analytics system that deploys containers on top of the Azure NCasT4_v3-series virtual machines, which are powered by NVIDIA Tesla T4 GPUs and AMD EPYC 7V12 (Rome) CPUs. I am deploying the VM from an Azure DevOps pipeline using HashiCorp Packer, and after trying a few approaches I found a very easy way to deploy the VM, driver, and CUDA toolkit, which I will share in this article.