Open WebUI And Ollama On Proxmox
Run Open WebUI and Ollama in Proxmox LXC containers, with the combined all-in-one path first and the dedicated Ollama server pattern as the alternative when the lab grows up a little.
Published December 6, 2024 · Updated January 24, 2025
This page owns one very specific Proxmox path: getting a usable local AI stack online with Open WebUI as the human-facing layer and Ollama as the inference engine.
That can mean one combined container when you want the shortest path to a working system. It can also mean splitting Ollama into its own inference node once the lab starts to care more about GPU discipline than convenience.
If the lab wants to finish that split and move Open WebUI into its own lightweight frontend container, continue to Open WebUI Standalone Frontend On Proxmox.
If you want the general fast path first, start with Proxmox Helper Scripts. This page is the deeper runbook for the exact Open WebUI and Ollama shape I run on Proxmox.
The host-side GPU work still belongs in GPU Passthrough On Proxmox. The broader maintenance rhythm still belongs in Update And Maintenance. This page is the workload runbook that sits on top of those decisions.
Combined Container: Open WebUI Plus Ollama
The first deployment pattern is the simplest one to live with.
You create a single LXC, let the community script install Open WebUI, accept the Ollama prompt, and end up with one box that handles both the frontend and the local model runtime. That matches Open WebUI's own split between bundled-Ollama deployments and deployments that point at a separate Ollama server later.[1]
Create LXC Container
Container Sizing
| Setting | Value | Rationale |
|---|---|---|
| Container ID | 100 | First AI container |
| Hostname | openwebui-ollama | Reflects combined deployment role |
| Disk Size | 150 GB | Model storage (~4–30 GB per model) + app + Python deps |
| CPU Cores | 12 | Tokenization, prompt processing, CPU-offloaded layers |
| RAM | 20000 MiB | VRAM spill for large models + Ollama + OpenWebUI overhead |
| OS | Debian 13 | Default (recommended) |
| Bridge | vmbr0 | Same network as other containers |
| IPv4 | 192.168.50.40/24 | Static IP for Cloudflare tunnel / bookmarks |
| Gateway | 192.168.50.1 | Router |
| GPU Passthrough | Yes | Required — Ollama performs GPU inference in this container |
| Nesting | Enabled | Required for systemd in Debian 13 |
Pre-configure App Defaults (Skip the Wizard)
The Community-Scripts framework exposes a 28-step Advanced install path, but it also supports per-app defaults files at /usr/local/community-scripts/defaults/<app>.vars. If openwebui.vars exists, the script adds an App Defaults menu entry so you can build from that file instead of stepping through the full wizard.[2][3]
For one-off runs, you can also pass var_* values directly on the command line. The file-based path is better when you want the container shape to stay repeatable across rebuilds.[2][3]
Create the App Defaults File
On the Proxmox VE host shell:
# Create the directory (if it doesn't exist)
mkdir -p /usr/local/community-scripts/defaults
# Write the vars file
cat > /usr/local/community-scripts/defaults/openwebui.vars << 'EOF'
# OpenWebUI + Ollama (Combined Deployment) - App Defaults
# GPU enabled, Ollama will be installed alongside OpenWebUI
var_cpu=12
var_ram=20000
var_disk=150
var_unprivileged=1
var_gpu=yes
var_brg=vmbr0
var_net=192.168.50.40/24
var_gateway=192.168.50.1
var_hostname=openwebui-ollama
var_os=debian
var_version=13
var_ssh=yes
var_nesting=1
var_protection=yes
var_tags=ai;inference
var_timezone=Australia/Melbourne
var_container_storage=local-zfs
var_template_storage=local
EOF
Available Variables Reference
| Variable | Description | Example |
|---|---|---|
| var_cpu | CPU cores | 12 |
| var_ram | RAM in MiB | 20000 |
| var_disk | Disk in GB | 150 |
| var_unprivileged | 1 = unprivileged, 0 = privileged | 1 |
| var_gpu | GPU passthrough | yes / no |
| var_brg | Network bridge | vmbr0 |
| var_net | Static IP (CIDR) | 192.168.50.45/24 |
| var_gateway | Default gateway | 192.168.50.1 |
| var_hostname | Container hostname | openwebui-ollama |
| var_os | OS template | debian |
| var_version | OS version | 13 |
| var_ssh | Enable SSH | yes / no |
| var_nesting | Enable nesting (for systemd) | 1 |
| var_protection | Prevent accidental deletion | yes |
| var_tags | Proxmox tags (semicolon-separated) | ai;inference |
| var_timezone | Timezone | Australia/Melbourne |
| var_container_storage | Storage pool for container | local-zfs |
| var_template_storage | Storage pool for templates | local |
| var_vlan | VLAN tag | 50 |
| var_mtu | MTU size | 1500 |
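A typo in the defaults file only shows up once the script is already building the container, so a quick lint beforehand can save a rebuild. The sketch below is a hypothetical helper (check_vars_file is my own name, not something Community-Scripts ships). It flags lines that are neither comments nor var_name=value pairs and checks for a handful of keys this page relies on. The key list is an assumption about what matters for this deployment, not a requirement of the framework.

```shell
# Hypothetical helper, not part of Community-Scripts: lint a .vars file.
# Prints problems to stdout and returns non-zero if any are found.
check_vars_file() {
    file=$1
    rc=0
    # Collect lines that are neither blank, comments, nor var_name=value.
    bad=$(grep -vE '^[[:space:]]*(#.*)?$|^var_[a-z0-9_]+=.+$' "$file" || true)
    if [ -n "$bad" ]; then
        echo "malformed: $bad"
        rc=1
    fi
    # Keys this page depends on (an assumption, adjust to taste).
    for key in var_cpu var_ram var_disk var_gpu var_net var_gateway; do
        if ! grep -q "^${key}=" "$file"; then
            echo "missing: $key"
            rc=1
        fi
    done
    return $rc
}
```

On the Proxmox host, check_vars_file /usr/local/community-scripts/defaults/openwebui.vars should print nothing and return success when the file is well formed.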
Run the Community Script
On the Proxmox VE host shell:
bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/openwebui.sh)"
With openwebui.vars in place, the current menu exposes Default Install, Advanced Install, User Defaults, App Defaults for Open WebUI, and Settings. Select App Defaults so the container uses the values you already wrote.[2][3]
When prompted "Would you like to add Ollama?" → type Y (accept).
A successful run typically ends with output similar to:
Using App Defaults for Open WebUI
LXC Container 100 was successfully created.
Detected NVIDIA GPU
Found 6 NVIDIA device(s) for passthrough
NVIDIA GPU passthrough configured (6 devices)
Installed Open WebUI
Would you like to add Ollama? <y/N> y
Created Service
Completed successfully!
http://192.168.50.40:8080
Alternative: One-Shot Environment Variables
For a single run without modifying any files, pass everything inline:
var_cpu=12 var_ram=20000 var_disk=150 var_gpu=yes var_hostname=openwebui-ollama \
var_net=192.168.50.40/24 var_gateway=192.168.50.1 var_nesting=1 \
bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/openwebui.sh)"
Environment variables have the highest priority: they override both .vars files and built-in defaults without modifying any files. You still need to select a menu option, but all values are pre-populated. That makes them a good fit for one-off runs when you do not want to persist a dedicated app defaults file.[2][3]
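The precedence described here (environment variable first, then the .vars file, then the built-in default) is easier to reason about with a small model of it. The function below is an illustration only: resolve_var is a hypothetical name, and this is not how the Community-Scripts framework is implemented internally. It simply mirrors the ordering stated above.

```shell
# Illustration only: NOT the Community-Scripts implementation.
# Resolve one setting using the precedence described above:
# environment variable, then .vars file, then a built-in default.
resolve_var() {
    name=$1 vars_file=$2 default=$3
    # 1. A value already present in the environment wins.
    eval "from_env=\${$name:-}"
    if [ -n "$from_env" ]; then
        echo "$from_env"
        return
    fi
    # 2. Otherwise use the last matching line in the .vars file, if any.
    if [ -f "$vars_file" ]; then
        from_file=$(sed -n "s/^${name}=//p" "$vars_file" | tail -n 1)
        if [ -n "$from_file" ]; then
            echo "$from_file"
            return
        fi
    fi
    # 3. Fall back to the built-in default.
    echo "$default"
}
```

With var_ram=20000 in the file, resolve_var var_ram openwebui.vars 4096 yields the file value, while prefixing the call with var_ram=30000 yields the inline value instead.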
What the Script Installs
The current community script installs Open WebUI as a uv tool, creates a systemd open-webui.service, and stores its persistent data under /root/.open-webui with optional environment overrides in /root/.env.[3]
- Python 3.12 (via uv)
- Open WebUI: web interface on port 8080
- Ollama: LLM inference server (if selected)
- systemd services: open-webui.service and ollama.service
- Config file: /root/.env
- Data directory: /root/.open-webui
Do NOT shut down the container yet — GPU passthrough must be configured first.
Configure GPU Passthrough
The source notes are right to treat this as the moment where convenience ends and correctness begins.
If the GPU wiring is wrong, the rest of the stack becomes a very elaborate CPU box.
For the host-side explanation and the full passthrough theory, use GPU Passthrough On Proxmox. The commands below are the workload-specific checks and fallback steps.
The current Community-Scripts build framework can detect Intel, AMD, and NVIDIA GPU device nodes and append dev[n] entries automatically when GPU passthrough is enabled, but you should still verify the resulting container config instead of assuming detection succeeded.[2]
If GPU Passthrough Didn't Auto-Configure
Check container config:
grep -E '^dev[0-9]+:' /etc/pve/lxc/100.conf
You should see entries like:
dev0: /dev/nvidia0
dev1: /dev/nvidiactl
dev2: /dev/nvidia-uvm
...
If missing, edit the container config to add GPU device passthrough using the Proxmox native syntax:
nano /etc/pve/lxc/100.conf
Add at the end:
# GPU device passthrough (Proxmox native syntax)
dev0: /dev/nvidia0
dev1: /dev/nvidiactl
dev2: /dev/nvidia-uvm
dev3: /dev/nvidia-uvm-tools
dev4: /dev/nvidia-caps/nvidia-cap1
dev5: /dev/nvidia-caps/nvidia-cap2
Save and exit, then restart:
pct stop 100
pct start 100
Important: Use pct stop / pct start (full cycle), not pct reboot. A reboot does not re-read updated LXC config.
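Eyeballing the config works for one container, but a scripted check is handy after rebuilds. The sketch below assumes the devN: /dev/... line format shown above, optionally followed by comma-separated options. missing_gpu_devs is a hypothetical helper name, and the device list you pass in should match whatever nodes actually exist on your host.

```shell
# Hypothetical helper: print each expected device node that has no
# matching devN entry in the given LXC config file.
missing_gpu_devs() {
    conf=$1
    shift
    for dev in "$@"; do
        # Match "dev0: /dev/nvidia0" exactly; options such as ",gid=44"
        # may follow, but a longer path like /dev/nvidia0x must not match.
        if ! grep -qE "^dev[0-9]+: ${dev}(,|\$)" "$conf"; then
            echo "$dev"
        fi
    done
}
```

Run on the host as missing_gpu_devs /etc/pve/lxc/100.conf /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm. Empty output means every listed node is already passed through.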
Install NVIDIA Driver in Container
The container shares the host kernel, so only userspace libraries are needed in the container itself. On the application side, Ollama's Linux docs still treat nvidia-smi as the verification point that the CUDA-capable driver stack is available.[4]
Push Driver from Host
# On the Proxmox host — push the .run file downloaded during host GPU setup
pct push 100 /tmp/NVIDIA-Linux-x86_64-580.126.09.run /tmp/NVIDIA-Linux-x86_64-580.126.09.run
Install Inside Container
# Enter the container
pct enter 100
# Make executable and install (compute-only, no display libraries)
chmod +x /tmp/NVIDIA-Linux-x86_64-580.126.09.run
/tmp/NVIDIA-Linux-x86_64-580.126.09.run \
--no-kernel-module \
--no-opengl-files
| Flag | Why |
|---|---|
| --no-kernel-module | Host provides the kernel module |
| --no-opengl-files | No display server in container |
Verify GPU Access
nvidia-smi
Expected: GPU name, driver version, memory, and temperature displayed.
exit # Return to host shell
Verify And Access
Confirm Ollama Uses GPU
pct exec 100 -- ollama run llama3.2
First run downloads the model. Once loaded, check GPU utilisation:
# On the host (GPU monitoring)
nvtop
You should see the Ollama process using GPU memory.
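For a check that does not depend on watching a monitor, ollama ps reports where each loaded model is running. The sketch below classifies that output. It assumes the PROCESSOR column contains text such as "100% GPU", "100% CPU", or a split like "48%/52% CPU/GPU"; the exact column layout can differ between Ollama versions, so treat the patterns as assumptions. classify_ollama_ps is a hypothetical helper name.

```shell
# Sketch: classify each loaded model from `ollama ps` output on stdin.
# The matched strings are assumptions about the PROCESSOR column text.
classify_ollama_ps() {
    # Drop the header row, then inspect the rest of each line.
    sed 1d | while read -r name rest; do
        case $rest in
            *"100% GPU"*) echo "$name: fully on GPU" ;;
            *"100% CPU"*) echo "$name: CPU only, check passthrough" ;;
            *"CPU/GPU"*)  echo "$name: split between CPU and GPU" ;;
            *)            echo "$name: unknown placement" ;;
        esac
    done
}
```

From the host, pct exec 100 -- ollama ps | classify_ollama_ps gives one line per loaded model. A "CPU only" result while a model is loaded usually means the passthrough or driver steps above need another look.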
Access Open WebUI
Open in browser:
http://<container-IP>:8080
To find the container IP:
pct exec 100 -- hostname -I
On first visit, create an admin account. The account is local to this container, and Open WebUI's first account becomes the administrator for that instance.[1]
Choose A Secure Exposure Path
Open WebUI should not stay on raw HTTP once other people, remote devices, or browser-integrated features start depending on it.
The detailed decision tree now lives in Secure Service Exposure On Proxmox. Use that subsection to choose one exposure model for the lab instead of letting every workload page grow its own slightly different HTTPS advice.
- Cloudflare Tunnel On Proxmox - the shortest secure path when you want outbound-only exposure, no port forwarding, and Cloudflare Access in front of sensitive services.
- Nginx Reverse Proxy LXC On Proxmox - the self-hosted reverse-proxy path when you want wildcard certificates and one front door for multiple services.
- Individual Certificates On Proxmox With acme.sh - the tighter certificate-isolation path when each service deserves its own certificate lifecycle.
If the lab already exposes several services, make the exposure decision once in that subsection and keep this page focused on the Open WebUI and Ollama workload itself.
Management Commands
Service Control (Inside Container)
# Check status
systemctl status open-webui
systemctl status ollama
# Restart services
systemctl restart ollama
systemctl restart open-webui
# View logs
journalctl -u ollama -f
journalctl -u open-webui -f
Container Control (From Host)
# Start / Stop / Reboot
pct start 100
pct stop 100
pct reboot 100
# Enter container shell
pct enter 100
# Quick command execution
pct exec 100 -- nvidia-smi
pct exec 100 -- ollama list
Pull Models
# From host
pct exec 100 -- ollama pull llama3.2
pct exec 100 -- ollama pull codellama
# List installed models
pct exec 100 -- ollama list
Update Open WebUI
Open WebUI upstream recommends backing up data before upgrades and pinning a release tag when stability matters. The Community-Scripts Open WebUI installer currently manages the app with uv, so use that same mechanism when you update the container.[3][5]
# From Proxmox host — take a snapshot first
pct snapshot 100 pre-openwebui-update --description "Before Open WebUI update"
# Enter the container
pct enter 100
# Stop the service
systemctl stop open-webui.service
# Match the community-scripts install method
~/.local/bin/uv tool install --force --python 3.12 --constraint <(echo "numba>=0.60") "open-webui[all]"
# Start the service again
systemctl start open-webui.service
# Verify version and health
curl -s http://localhost:8080/api/version
systemctl status open-webui.service
Access Open WebUI in your browser at http://192.168.50.40:8080 to confirm the update was successful.
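To confirm the version actually changed, it helps to capture it before and after the update. The sketch below extracts the version string from the /api/version response with sed, so it does not assume jq is installed in the container. It assumes the endpoint returns JSON with a "version" field; extract_version is a hypothetical helper name.

```shell
# Sketch: pull the "version" field out of JSON read from stdin.
# Assumes a response shaped like {"version":"x.y.z"}.
extract_version() {
    sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example use inside the container (not executed here):
#   before=$(curl -s http://localhost:8080/api/version | extract_version)
#   ...run the update...
#   after=$(curl -s http://localhost:8080/api/version | extract_version)
#   [ "$before" != "$after" ] && echo "updated: $before -> $after"
```

If the two values match after an update, check the service logs before assuming the install succeeded. The snapshot taken at the start is the rollback path.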
Troubleshooting
Ollama Command Not Found
If the helper script completes but ollama isn't available:
pct enter 100
# Install Ollama manually
curl -fsSL https://ollama.com/install.sh | sh
# Enable and start the service
systemctl enable --now ollama
# Verify
ollama -v
The helper script sometimes skips the Ollama installation if certain dependencies are missing or network issues occur during the install phase.
Alternative Deployment: Ollama As A Dedicated Inference Server
The source guide also kept an alternate shape for the moment when the combined container starts feeling too convenient for its own good.
If Open WebUI and Ollama deserve different lifecycles, different memory budgets, or cleaner GPU ownership, split them.
That split does not have to stop at the inference layer. If the next step is moving the browser UI out as well, use Open WebUI Standalone Frontend On Proxmox for the frontend-only container and migration flow.
Container Sizing (Standalone)
| Setting | Value | Rationale |
|---|---|---|
| Container ID | 100 (repurposed) or next available | Reuse existing GPU-configured container |
| Hostname | ollama | Reflects inference-only role |
| Disk Size | 150 GB | Model storage (~4–30 GB per model in ~/.ollama/models/) |
| CPU Cores | 8–12 | Tokenization, prompt processing, CPU-offloaded layers |
| RAM | 24000–32000 MiB | VRAM spill for large models + Ollama overhead |
| OS | Debian 13 | Consistent with existing setup |
| Bridge | vmbr0 | Same network as OpenWebUI |
| IPv4 | 192.168.50.40/24 | Standard Ollama IP (OpenWebUI keeps .30) |
| Gateway | 192.168.50.1 | Router |
| GPU Passthrough | Yes | The whole point — GPU inference |
| Nesting | Enabled | Required for systemd in Debian 13 |
Model VRAM Reference
| Model | Parameters | Quantisation | VRAM Required |
|---|---|---|---|
| Llama 3.2 | 3B | Q4_K_M | ~2 GB |
| Llama 3.1 | 8B | Q4_K_M | ~5 GB |
| Llama 3.1 | 8B | Q8_0 | ~9 GB |
| DeepSeek-R1 | 14B | Q4_K_M | ~9 GB |
| Qwen 2.5 | 32B | Q4_K_M | ~20 GB |
| Llama 3.1 | 70B | Q4_K_M | ~40 GB (spills to RAM) |
Rule of thumb: If the model fits in 24 GB VRAM, it runs at full GPU speed. If it exceeds VRAM, Ollama automatically offloads layers to CPU/RAM — this works but inference is slower for the CPU-offloaded layers.
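The table values track a simple piece of arithmetic: parameters (in billions) times bits per weight, divided by 8, gives an approximate weights-only size in GB. The sketch below applies that. The bits-per-weight figures (about 4.85 for Q4_K_M and 8.5 for Q8_0) are assumptions for illustration, and the estimate deliberately ignores KV cache and runtime overhead, so real usage will be higher.

```shell
# Rough weights-only estimate: params (billions) x bits per weight / 8 = GB.
# Bits-per-weight inputs are assumptions; KV cache and runtime overhead
# are NOT included.
estimate_weights_gb() {
    awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Succeeds when the weights-only estimate fits in the VRAM budget (GB).
fits_in_vram() {
    awk -v p="$1" -v b="$2" -v v="$3" 'BEGIN { exit !(p * b / 8 <= v) }'
}
```

For example, estimate_weights_gb 70 4.85 prints 42.4, which lines up with the ~40 GB row, and fits_in_vram 70 4.85 24 fails, which is the case where Ollama offloads layers to CPU and RAM.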
Option A: Repurpose Existing CT 100
pct enter 100
# Stop and disable OpenWebUI
systemctl disable --now open-webui.service
# Verify it's stopped
systemctl status open-webui
# Expected: inactive (dead), disabled
Clean Up OpenWebUI (Optional: Frees Disk Space)
# Remove OpenWebUI binary and Python environment
# Try uv first (may not be in PATH depending on script version)
uv tool uninstall open-webui 2>/dev/null || \
~/.local/bin/uv tool uninstall open-webui 2>/dev/null || \
echo "uv not found — removing files manually"
# Remove uv-managed tool installs and cache
rm -rf /root/.local/share/uv /root/.local/bin/open-webui
# Remove data (ONLY after migrating to the new OpenWebUI container!)
rm -rf /root/.open-webui
# Remove env file if no longer needed
# (Keep it if it contains Ollama-relevant settings)
rm /root/.env
Change the IP Address
From the Proxmox host:
# Stop the container
pct stop 100
# Update the network config
pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.50.40/24,gw=192.168.50.1
# Update the hostname
pct set 100 -hostname ollama
# Resize resources (optional — give more RAM for inference)
pct set 100 -memory 24000
# CPU cores can stay at 12 or adjust:
# pct set 100 -cores 12
# Start the container
pct start 100
Verify inside the container:
pct enter 100
hostname -I
# Expected: 192.168.50.40
Option B: Create A New Container
On the Proxmox VE host shell:
# Back up the existing OpenWebUI vars (if any)
cp /usr/local/community-scripts/defaults/openwebui.vars \
/usr/local/community-scripts/defaults/openwebui.vars.bak 2>/dev/null
# Create the directory (if it doesn't exist)
mkdir -p /usr/local/community-scripts/defaults
# Write Ollama-specific vars
cat > /usr/local/community-scripts/defaults/openwebui.vars << 'EOF'
# Ollama Standalone (GPU Inference Server) - App Defaults
# GPU enabled, Ollama will be installed, OpenWebUI disabled after install
var_cpu=12
var_ram=24000
var_disk=150
var_unprivileged=1
var_gpu=yes
var_brg=vmbr0
var_net=192.168.50.40/24
var_gateway=192.168.50.1
var_hostname=ollama
var_os=debian
var_version=13
var_ssh=yes
var_nesting=1
var_protection=yes
var_tags=ai;inference
var_timezone=Australia/Melbourne
var_container_storage=local-zfs
var_template_storage=local
EOF
Run the script again:
bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/openwebui.sh)"
Select App Defaults so the container uses the file you just wrote.[2][3]
When prompted "Would you like to add Ollama?" → type Y (accept).
Restore the earlier OpenWebUI vars if you backed them up:
mv /usr/local/community-scripts/defaults/openwebui.vars.bak \
/usr/local/community-scripts/defaults/openwebui.vars
Disable the OpenWebUI service after the install:
pct enter <CT-ID>
systemctl disable --now open-webui.service
Configure Network Listening (Standalone)
If Open WebUI will connect to Ollama over the network instead of through localhost inside the same container, add a systemd override for OLLAMA_HOST and any other runtime tuning you want to keep across restarts. Ollama's Linux docs use systemctl edit ollama for this kind of customization.[6]
pct enter 100
# pct enter already gives a root shell, so sudo is not needed
systemctl edit ollama.service
Add:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:58008"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_KV_CACHE_TYPE=q8_0"
Environment="OLLAMA_KEEP_ALIVE=5m"
Apply and verify:
systemctl daemon-reload
systemctl restart ollama
ss -tlnp | grep 58008
Expected:
LISTEN 0 4096 0.0.0.0:58008 0.0.0.0:* users:(("ollama",pid=...,fd=...))
If it shows 127.0.0.1 instead of 0.0.0.0, the override didn't apply. Check:
systemctl cat ollama.service
Variable Reference
| Variable | Default | Recommended | Effect |
|---|---|---|---|
| OLLAMA_HOST | 127.0.0.1:11434 | 0.0.0.0:58008 | Listen on all interfaces for remote access on a non-standard port |
| OLLAMA_FLASH_ATTENTION | 0 | 1 | Enables Flash Attention, which significantly reduces memory at large context sizes. No quality impact. |
| OLLAMA_KV_CACHE_TYPE | f16 | q8_0 | Quantises the KV cache to 8-bit, which halves KV cache memory with minimal quality loss. Allows larger context windows or bigger models. |
| OLLAMA_KEEP_ALIVE | 5m | 5m or -1 | How long models stay in VRAM after the last request. 5m = unload after 5 min idle (good when sharing the GPU with llama.cpp). -1 = never unload (good for dedicated Ollama use). |
| OLLAMA_NUM_PARALLEL | 1 | 1–4 | Max parallel requests per model. Higher values increase context memory proportionally. |
| OLLAMA_MAX_LOADED_MODELS | 3 × num_GPUs | 1 | Max models loaded concurrently. Set to 1 if VRAM is tight (single RTX 3090). |
| OLLAMA_ORIGINS | 127.0.0.1,0.0.0.0 | Add OpenWebUI origin if needed | CORS allowed origins (browser-level only, not a security boundary). |
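When the same override has to be applied on more than one container, a non-interactive alternative to the editor is to render the drop-in file and write it to the path that systemctl edit uses for overrides, /etc/systemd/system/ollama.service.d/override.conf. The sketch below only renders the content to stdout so it can be reviewed first. render_ollama_override is a hypothetical helper name, and the values shown in the usage comments are the ones from this page.

```shell
# Sketch: render a systemd drop-in for ollama.service from KEY=VALUE args.
# Output goes to stdout so it can be reviewed before being written.
render_ollama_override() {
    echo "[Service]"
    for pair in "$@"; do
        printf 'Environment="%s"\n' "$pair"
    done
}

# Example use inside the container as root (not executed here):
#   mkdir -p /etc/systemd/system/ollama.service.d
#   render_ollama_override OLLAMA_HOST=0.0.0.0:58008 OLLAMA_KEEP_ALIVE=5m \
#       > /etc/systemd/system/ollama.service.d/override.conf
#   systemctl daemon-reload && systemctl restart ollama
```

Keeping the rendering separate from the write makes it easy to diff the result against systemctl cat ollama.service before restarting the service.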
Related Topics
- GPU Passthrough On Proxmox — the host-side NVIDIA and LXC groundwork this stack depends on.
- Update And Maintenance — where the lab-wide update order and GPU driver version discipline live.
- Monitoring And Alerts — the place to watch GPU health and service behavior once this stack is in use.
- Container Network Throttling — handy when model downloads or package pulls should keep moving without flattening the rest of the network.
- Open WebUI Standalone Frontend On Proxmox — the follow-on page when inference stays remote and the UI deserves its own lightweight guest.
Footnotes
1. Open WebUI documents both the bundled-Ollama path and the "Ollama on a different server" path in its quick start guide, and notes that the first account created becomes the administrator for that instance: Open WebUI Quick Start.
2. The Community-Scripts build framework defines the install menu, the 28-step Advanced wizard, the App Defaults file path, and the GPU passthrough flow that detects devices and writes dev[n] entries into the LXC config: Community-Scripts build.func.
3. The current Open WebUI Community-Scripts entrypoint sets the built-in defaults, installs Open WebUI with uv, creates open-webui.service, stores data in /root/.open-webui, and updates Open WebUI with uv tool install --force ... open-webui[all]: Community-Scripts openwebui.sh.
4. Proxmox containers use the host kernel directly, and Ollama's Linux guide treats nvidia-smi as the verification step once NVIDIA drivers are available: Proxmox Container Toolkit, Ollama on Linux.
5. Open WebUI's update guide recommends backing up data before upgrades, checking release notes, and pinning a release tag when stability matters: Updating Open WebUI.
6. Ollama's Linux docs use systemctl edit ollama for service customization, document the curl -fsSL https://ollama.com/install.sh | sh install/update path, and show the systemd-managed service workflow: Ollama on Linux.