Run a Local AI Chat Assistant Completely Offline (No Internet Needed)
Run an AI chat assistant entirely on your own machine, with no dependence on cloud services such as OpenAI or Claude. This guide walks you through setting one up with Open WebUI and Ollama. You need an internet connection once, to download the software and the models; after that everything runs offline, which keeps your data private and under your control.
What You’ll Set Up
- Open WebUI – A sleek web interface for LLMs
- Ollama – Local runner for LLMs such as llama3.2 and llama3.2-vision
- Docker – Containerized environment to isolate your setup
System Requirements
Hardware:
- CPU: 4+ core processor
- RAM: Minimum 8GB, ideally 16GB+
- Disk: 10GB+ free, plus space for each model you download
- GPU (optional): NVIDIA CUDA / AMD ROCm for faster inference
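A quick way to compare a machine against these minimums is a small shell check. The sketch below assumes Linux (it relies on `nproc`, `/proc/meminfo`, and `df`); the function names are illustrative, and the thresholds are simply the ones from the list above.

```shell
# Succeed if the given specs meet the minimums listed above:
# 4 CPU cores, 8 GB RAM, 10 GB free disk.
# Usage: meets_minimum <cores> <ram_gb> <disk_gb>
meets_minimum() {
  [ "$1" -ge 4 ] && [ "$2" -ge 8 ] && [ "$3" -ge 10 ]
}

# Gather the values on Linux and report the result.
check_this_machine() {
  cores=$(nproc)
  # MemTotal is in kB; round to the nearest GB, because an "8 GB"
  # machine usually reports slightly less than 8.
  ram_gb=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024 + 0.5) }' /proc/meminfo)
  # Free space on the filesystem that holds the home directory, in GB.
  disk_gb=$(df -Pk "$HOME" | awk 'NR == 2 { print int($4 / 1024 / 1024) }')
  echo "cores=$cores ram_gb=$ram_gb disk_gb=$disk_gb"
  if meets_minimum "$cores" "$ram_gb" "$disk_gb"; then
    echo "OK: meets the minimum requirements"
  else
    echo "WARNING: below the minimum requirements"
  fi
}
```

Call `check_this_machine` to print the result. Docker and Ollama may store their data outside the home directory, so adjust the path passed to `df` if needed.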
Software:
- OS: Linux (Ubuntu 20.04+), macOS (M1/M2+), or Windows 10/11 (WSL2)
- Docker: https://www.docker.com/
- Ollama: https://ollama.com/
- Browser: Chrome, Firefox, Safari, Edge
Setup Instructions
1. Install Dependencies
Install both Docker and Ollama, then verify Ollama is working:
curl http://localhost:11434/api/version
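The version endpoint replies with a small JSON object that contains a "version" field. If Ollama was started a moment ago it may need a few seconds before it answers, so a retry loop can be handy. The following is a minimal sketch; `extract_version` and `wait_for_ollama` are illustrative names, not part of Ollama.

```shell
# Pull the value of the "version" field out of the JSON on stdin.
extract_version() {
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll the Ollama API until it answers or the attempts run out.
# Usage: wait_for_ollama [attempts] [base_url]
wait_for_ollama() {
  attempts=${1:-10}
  base_url=${2:-http://localhost:11434}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if body=$(curl -fsS "$base_url/api/version" 2>/dev/null); then
      echo "Ollama is up, version $(printf '%s' "$body" | extract_version)"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "Ollama did not respond at $base_url" >&2
  return 1
}
```

Running `wait_for_ollama` with no arguments tries the default address up to ten times, one second apart.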
2. Run Open WebUI via Docker:
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
Access it at: http://localhost:3000
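In the command above, `-p 3000:8080` publishes the container's port 8080 on host port 3000, `--add-host` lets the container reach Ollama on the host, and `-v` keeps your data in a named volume. The first start can take a little while, so if the page does not load it helps to check the container before assuming something is broken. A minimal sketch, with illustrative function names:

```shell
# Succeed if the named container exists and is currently running.
container_running() {
  [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Report on the container and the web UI started by the command above.
# Usage: check_webui [container_name] [url]
check_webui() {
  name=${1:-open-webui}
  url=${2:-http://localhost:3000}
  if ! container_running "$name"; then
    echo "Container $name is not running; try: docker logs $name" >&2
    return 1
  fi
  if curl -fsS -o /dev/null "$url"; then
    echo "Open WebUI is reachable at $url"
  else
    echo "Container is up, but $url is not answering yet" >&2
    return 1
  fi
}
```

Call `check_webui` after `docker run`; if it reports that the URL is not answering yet, wait a few seconds and try again.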
Create Admin Account
Once Open WebUI loads, create the first account by entering your name, email, and password. This first account becomes the admin.
Download and Use a Model
You’ll need to pull a model before chatting:
ollama pull llama3.2
Or pull it from the WebUI:
- Click “Select a model”
- Search and pull “llama3.2” or “llama3.2-vision”
Once downloaded, select the model to activate your assistant.
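To avoid pulling a model that is already on disk, you can look for it in the output of `ollama list` first. The sketch below assumes the usual table layout of that command (a header row, then one model per line with the name in the first column); `has_model` and `ensure_model` are illustrative helpers.

```shell
# Read `ollama list` output on stdin; succeed if the model is listed.
# A bare name such as "llama3.2" also matches "llama3.2:latest".
has_model() {
  awk -v want="$1" '
    NR > 1 && ($1 == want || $1 == (want ":latest")) { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Pull the model only if it is not installed yet.
ensure_model() {
  if ollama list | has_model "$1"; then
    echo "$1 is already installed"
  else
    ollama pull "$1"
  fi
}
```

For example, `ensure_model llama3.2` downloads the model on the first run and only prints a message on later runs.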
Choosing the Right Model
| Model | Size | Best Use | Notes |
|---|---|---|---|
| llama3.2 | ~2GB | Chat, code, writing | Text-only |
| llama3.2-vision | ~8GB | Image + Text | Needs more RAM/VRAM |
Interact with the Chatbot
Start chatting! Ask questions, solve problems, or write content—all locally.
Example prompts:
- “Explain quantum computing in simple terms.”
- “What is the capital of Indonesia?”
- “Summarize this paragraph.”
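The same prompts can be sent without the browser, through Ollama's local HTTP API. The sketch below posts to the `/api/generate` endpoint with streaming turned off, so the reply arrives as one JSON object whose `response` field holds the answer. The helper names are illustrative, and the payload builder is deliberately naive.

```shell
# Build the JSON body for Ollama's /api/generate endpoint.
# Caveat: this version does no JSON escaping, so the prompt must not
# contain double quotes, backslashes, or newlines.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Send one prompt to the local Ollama server and print the raw JSON reply.
# Usage: ask <model> <prompt>
ask() {
  curl -fsS http://localhost:11434/api/generate \
    -H 'Content-Type: application/json' \
    -d "$(generate_payload "$1" "$2")"
}
```

For example: `ask llama3.2 "What is the capital of Indonesia?"`. For prompts with quotes or line breaks, build the JSON with a proper tool instead of `printf`.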
Troubleshooting Tips
- Docker container not starting?
  - Check port usage: lsof -i :3000
  - Restart Docker: systemctl restart docker
- Ollama not connecting?
  - Verify: curl http://localhost:11434/api/version
  - Test inside Docker: docker exec open-webui curl http://host.docker.internal:11434/api/version
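The individual checks above can be bundled into one pass that prints a PASS or FAIL line for each. This is a rough sketch: `run_check` and `diagnose` are illustrative names, and the container name and ports are the ones used earlier in this guide.

```shell
# Run a command quietly and print PASS or FAIL next to a label.
# Usage: run_check <label> <command> [args...]
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
  fi
}

# Run the troubleshooting checks from this section in one go.
diagnose() {
  run_check "port 3000 is in use (expected once Open WebUI is running)" \
    lsof -i :3000
  run_check "Ollama answers on the host" \
    curl -fsS http://localhost:11434/api/version
  run_check "Ollama answers from inside the container" \
    docker exec open-webui curl -fsS http://host.docker.internal:11434/api/version
}
```

If the host check passes but the in-container check fails on Linux, one common cause is Ollama listening only on 127.0.0.1; the Ollama FAQ describes the `OLLAMA_HOST` setting for that case.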
Advanced Features
Use Your Own Documents (RAG)
Add PDF, DOCX, or Markdown files to your local knowledge base and configure:
{
  "rag_enabled": true,
  "chunk_size": 500,
  "chunk_overlap": 50,
  "document_lang": "en"
}
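The two chunk settings control how a document is cut up before it is indexed: each chunk holds up to `chunk_size` units of text, and consecutive chunks share `chunk_overlap` units so that a sentence split at a boundary still appears whole in one of them. The sketch below illustrates the idea on plain characters. That unit is an assumption made for the illustration; the splitter Open WebUI actually uses may count tokens and respect paragraph boundaries.

```shell
# Split each line of stdin into chunks of `size` characters, where every
# chunk repeats the last `overlap` characters of the previous one.
# Prints one chunk per line. Requires overlap < size.
# Usage: chunk_text <size> <overlap>
chunk_text() {
  awk -v size="$1" -v overlap="$2" '
    {
      step = size - overlap
      if (step <= 0) exit 1            # would never advance
      for (i = 1; i <= length($0); i += step) {
        print substr($0, i, size)
        if (i + size > length($0)) break   # this chunk reached the end
      }
    }
  '
}
```

With a size of 4 and an overlap of 1, the text `abcdefghij` becomes `abcd`, `defg`, `ghij`. With the values from the config above (500 and 50), each new chunk starts 450 characters after the previous one.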
Web Search with SearXNG
Run this command:
docker run -d --name searxng -p 8080:8080 \
  -v searxng-data:/etc/searxng searxng/searxng
Then enable web search in the Open WebUI settings and point it at this SearXNG instance.
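Before wiring SearXNG into Open WebUI, it can save time to confirm that the instance answers search requests in JSON. The sketch below assumes the container from the command above is listening on port 8080 and that the JSON output format has been enabled in SearXNG's settings.yml (if it is not, the request is expected to be rejected). The function names are illustrative.

```shell
# Build a SearXNG JSON search URL. Spaces become "+"; other special
# characters are not encoded in this simple sketch.
# Usage: searxng_url <base_url> <query>
searxng_url() {
  query=$(printf '%s' "$2" | tr ' ' '+')
  printf '%s/search?q=%s&format=json' "$1" "$query"
}

# Succeed if SearXNG returns a successful reply for a test query.
# Usage: check_searxng [base_url]
check_searxng() {
  curl -fsS "$(searxng_url "${1:-http://localhost:8080}" "open webui")" \
    >/dev/null
}
```

Note that `localhost` inside the Open WebUI container refers to the container itself, so the address you enter in the Open WebUI settings will likely need to be `http://host.docker.internal:8080` rather than `http://localhost:8080`.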
Role-Based Access Control
Assign user roles:
| Role | Permissions |
|---|---|
| Admin | Full system access |
| Power User | Model & RAG mgmt |
| Basic User | Chat only |
Multimodal Input Support
Use llama3.2-vision for image-based queries.
ollama pull llama3.2-vision
Upload an image and prompt:
“What’s happening in this photo?”
The model analyzes the image and replies with a text description.
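Image queries also work outside the browser: Ollama's `/api/generate` endpoint accepts an `images` array of base64-encoded files alongside the prompt. The sketch below builds such a request body. The helper names and the file name `photo.jpg` are illustrative, and the prompt is inserted without JSON escaping, so it must not contain double quotes, backslashes, or newlines.

```shell
# Base64-encode a file as a single line (works with GNU and BSD base64).
encode_image() {
  base64 < "$1" | tr -d '\n'
}

# Build a /api/generate body that attaches one image to the prompt.
# Usage: vision_payload <prompt> <image_file>
vision_payload() {
  printf '{"model":"llama3.2-vision","prompt":"%s","images":["%s"],"stream":false}' \
    "$1" "$(encode_image "$2")"
}
```

To send it, pipe the body to curl so that large images do not hit command-line length limits: `vision_payload "Describe this photo" photo.jpg | curl -fsS http://localhost:11434/api/generate -H 'Content-Type: application/json' -d @-`.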
Conclusion
With Open WebUI + Ollama, you can:
- Use LLMs offline
- Ensure privacy and data control
- Explore custom AI applications
This setup gives you a complete AI assistant that keeps working without a cloud connection, which suits developers, researchers, and privacy-conscious users.
