Private AI for All with Control and Customisation


Many clients have approached us seeking the hottest AI solutions to enhance their businesses. They often look for a versatile chatbot similar to ChatGPT, with features like content verification, image generation and modification, process integration, and data summarization. Common implementations rely on APIs from AI vendors such as OpenAI, Anthropic (Claude), and Poe. However, data privacy, cost, and control are significant concerns, particularly when clients are uncertain about the AI's capabilities and anticipate evolving business requirements.

A safe and accessible, locally hosted private GPT within your premises

This interest has led us to explore Private AI, where clients can host their AI models securely within their infrastructure. This ensures control over data privacy, security and cost.

From our discussions with clients, there has been strong interest in chat-based AI since the rise of ChatGPT. Its intuitive chat interface allows staff to issue simple text commands and have the AI execute the appropriate actions.

Our tech experts have thoroughly investigated Private AI and deployed a working GPT using Llama 3 (from Meta) without relying on external APIs. In this article, we will discuss how this is achieved and the custom features available with a private AI/GPT.

Below is our privately hosted ZoarGPT (https://llm.zoar.io), available only during HK office hours to manage load and cost. Please contact us for a dedicated demo.

If it is online, feel free to sign-up or use our test account "testaccount" with the password "testaccount".



Our hardware specification for hosting ZoarGPT

Hardware: A basic gaming stack with an Nvidia GeForce RTX 4060 (8 GB VRAM), an Intel i7 CPU (16 cores), and 32 GB RAM.

OS: Ubuntu

Modules: Ollama, OpenWebUI

LLM models tested: Llama 3 8B (Meta), Qwen 2 7B (Alibaba Cloud)

Analysis - How it Works

  • Ollama: A runtime that downloads and installs LLMs locally, runs them, and serves them through an API.
  • OpenWebUI: A self-hosted chat WebUI application that integrates with LLM APIs, such as Ollama.
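To make the integration concrete, here is a minimal Python sketch of how an application can talk to a local Ollama server. The endpoint and request fields follow Ollama's documented /api/generate interface (Ollama listens on port 11434 by default); the model name and prompt are illustrative.

```python
import json
import urllib.request

# Ollama serves its REST API on localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> bytes:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON reply instead of a token stream.
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()


def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server and `ollama pull llama3` beforehand):
# print(ask("llama3", "Summarize this week's leave requests in one sentence."))
```

OpenWebUI is essentially a polished chat front end over this same API, so any internal tool can reuse the connection in the same way.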

Response Time Running LLM Locally

With our hardware specification, it takes a few seconds to generate a response, averaging ~80% load on the GPU. OpenWebUI also supports document summarization and (generative) image processing, which increase tokenization load and, consequently, response time.
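Response speed is easy to measure precisely: each Ollama API reply includes timing metrics, notably eval_count (tokens generated) and eval_duration (generation time in nanoseconds), per Ollama's documented response format. A small helper turns them into a tokens-per-second figure; the sample numbers below are illustrative, not our benchmark results.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from Ollama's response metrics.

    eval_count is the number of tokens generated; eval_duration_ns is the
    time spent generating them, in nanoseconds, as reported by Ollama.
    """
    return eval_count / (eval_duration_ns / 1e9)


# Illustrative: 240 tokens generated over 8 seconds -> 30 tokens/s.
print(tokens_per_second(240, 8_000_000_000))  # 30.0
```

Tracking this figure over time makes it clear when a heavier workload (long documents, images) or a larger model justifies a hardware upgrade.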


GPU utilization while ZoarGPT generates a reply.

Different LLM Models on Ollama

We tested Llama 3 (8B) and Qwen 2 (7B). The "B" indicates the number of parameters (in billions) in the model. Generally, more parameters lead to better AI performance, but at the cost of more processing power. Both models we downloaded are ~4 GB in size. In comparison, the Llama 3 model with 70B parameters is ~40 GB. For more information, model parameter distributions are available on the Ollama website.
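The relationship between parameter count and download size follows a simple rule of thumb: size is roughly parameters times bits per weight. Ollama's default model tags are 4-bit quantized, which is why an 8B model lands near 4 GB. The helper below is a back-of-the-envelope estimate (real files add some metadata overhead), not an exact formula.

```python
def approx_model_size_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Rough on-disk size of a quantized LLM, in GB.

    Each parameter is stored in bits_per_weight bits; Ollama's default
    downloads are 4-bit quantized. Real model files are slightly larger
    due to metadata and tokenizer data.
    """
    return params_billion * bits_per_weight / 8


print(approx_model_size_gb(8))   # 4.0  -> matches the ~4 GB Llama 3 8B download
print(approx_model_size_gb(70))  # 35.0 -> close to the ~40 GB Llama 3 70B download
```

The same arithmetic also gives a quick VRAM sanity check: a 4-bit 8B model fits comfortably on an 8 GB GPU, while 70B clearly does not.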

Differences in Models

Our tests indicate that different models excel at different tasks. For instance, Qwen 2, trained on Chinese data and other languages, offers better multilingual support. Below are test queries sent to both models, along with their replies.

Qwen 2 generates content in Chinese; Llama 3 fails to do so.

Both models generate similar content for a coding task.

Application and Conclusion

A private GPT allows you to deploy customized, scalable solutions. For example, you can implement a small, simple GPT for tasks like checking and booking annual leave within the company; a model with a smaller parameter count can handle such basic requests efficiently. If you have a diverse workforce, consider a model like Qwen for better multilingual support. The potential is immense.

Hosting a Private AI within your company infrastructure is feasible, with hardware costs fully controlled and scalable based on the number of users. Additionally, a locally hosted AI enables you to customize the LLM with specific prompts, training data, and operational boundaries. We will cover Private AI customizations in a future article. For more information, feel free to contact Zoar.

