Open WebUI And Ollama On Proxmox

This page owns one very specific Proxmox path: getting a usable local AI stack online with Open WebUI as the human-facing layer and Ollama as the inference engine.

That can mean one combined container when you want the shortest path to a working system. It can also mean splitting Ollama into its own inference node once the lab starts to care more about GPU discipline than convenience.

If the lab wants to finish that split and move Open WebUI into its own lightweight frontend container, continue to Open WebUI Standalone Frontend On Proxmox.

If you want the general fast path first, start with Proxmox Helper Scripts. This page is the deeper runbook for the exact Open WebUI and Ollama shape I run on Proxmox.

The host-side GPU work still belongs in GPU Passthrough On Proxmox. The broader maintenance rhythm still belongs in Update And Maintenance. This page is the workload runbook that sits on top of those decisions.

Combined Container: Open WebUI Plus Ollama

The first deployment pattern is the simplest one to live with.

You create a single LXC, let the community script install Open WebUI, accept the Ollama prompt, and end up with one box that handles both the frontend and the local model runtime. That matches Open WebUI's own split between bundled-Ollama deployments and deployments that point at a separate Ollama server later.¹

Create LXC Container

Container Sizing

Setting	Value	Rationale
Container ID	100	First AI container
Hostname	`openwebui-ollama`	Reflects combined deployment role
Disk Size	150 GB	Model storage (~4–30 GB per model) + app + Python deps
CPU Cores	12	Tokenization, prompt processing, CPU-offloaded layers
RAM	20000 MiB	VRAM spill for large models + Ollama + OpenWebUI overhead
OS	Debian 13	Default (recommended)
Bridge	`vmbr0`	Same network as other containers
IPv4	192.168.50.40/24	Static IP for Cloudflare tunnel / bookmarks
Gateway	`192.168.50.1`	Router
GPU Passthrough	Yes	Required — Ollama performs GPU inference in this container
Nesting	Enabled	Required for systemd in Debian 13

Pre-configure App Defaults (Skip the Wizard)

The Community-Scripts framework exposes a 28-step Advanced install path, but it also supports per-app defaults files at /usr/local/community-scripts/defaults/<app>.vars. If openwebui.vars exists, the script adds an App Defaults menu entry so you can build from that file instead of stepping through the full wizard.²³

For one-off runs, you can also pass var_* values directly on the command line. The file-based path is better when you want the container shape to stay repeatable across rebuilds.²³

Create the App Defaults File

On the Proxmox VE host shell:

# Create the directory (if it doesn't exist)
mkdir -p /usr/local/community-scripts/defaults
 
# Write the vars file
cat > /usr/local/community-scripts/defaults/openwebui.vars << 'EOF'
# OpenWebUI + Ollama (Combined Deployment) - App Defaults
# GPU enabled, Ollama will be installed alongside OpenWebUI
var_cpu=12
var_ram=20000
var_disk=150
var_unprivileged=1
var_gpu=yes
var_brg=vmbr0
var_net=192.168.50.40/24
var_gateway=192.168.50.1
var_hostname=openwebui-ollama
var_os=debian
var_version=13
var_ssh=yes
var_nesting=1
var_protection=yes
var_tags=ai;inference
var_timezone=Australia/Melbourne
var_container_storage=local-zfs
var_template_storage=local
EOF
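Before running the script, it can help to confirm the file actually contains the keys you expect. The following is a minimal sketch: the function name `check_vars_file` and the list of keys it treats as required are my own choices for this lab, not something the Community-Scripts framework enforces.

```shell
# Hypothetical helper: report any expected var_* keys missing from a defaults file.
# The key list below is an assumption for this lab, not a framework requirement.
check_vars_file() {
  local file="$1" key missing=0
  for key in var_cpu var_ram var_disk var_gpu var_net var_gateway var_hostname; do
    if ! grep -q "^${key}=" "$file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  # Exit status 0 when every key is present, 1 otherwise.
  return "$missing"
}

# Usage on the Proxmox host:
#   check_vars_file /usr/local/community-scripts/defaults/openwebui.vars \
#     && echo "vars file looks complete"
```

The check only looks for the presence of each key at the start of a line; it does not validate the values.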

Available Variables Reference

Variable	Description	Example
`var_cpu`	CPU cores	`12`
`var_ram`	RAM in MiB	`20000`
`var_disk`	Disk in GB	`150`
`var_unprivileged`	`1` = unprivileged, `0` = privileged	`1`
`var_gpu`	GPU passthrough	`yes` / `no`
`var_brg`	Network bridge	`vmbr0`
`var_net`	Static IP (CIDR)	`192.168.50.45/24`
`var_gateway`	Default gateway	`192.168.50.1`
`var_hostname`	Container hostname	`openwebui-ollama`
`var_os`	OS template	`debian`
`var_version`	OS version	`13`
`var_ssh`	Enable SSH	`yes` / `no`
`var_nesting`	Enable nesting (for systemd)	`1`
`var_protection`	Prevent accidental deletion	`yes`
`var_tags`	Proxmox tags (semicolon-separated)	`ai;inference`
`var_timezone`	Timezone	`Australia/Melbourne`
`var_container_storage`	Storage pool for container	`local-zfs`
`var_template_storage`	Storage pool for templates	`local`
`var_vlan`	VLAN tag	`50`
`var_mtu`	MTU size	`1500`

Run the Community Script

On the Proxmox VE host shell:

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/openwebui.sh)"

With openwebui.vars in place, the current menu exposes Default Install, Advanced Install, User Defaults, App Defaults for Open WebUI, and Settings. Select App Defaults so the container uses the values you already wrote.²³

When prompted "Would you like to add Ollama?" → type Y (accept).

A successful run typically ends with output similar to:

  Using App Defaults for Open WebUI
  LXC Container 100 was successfully created.
 Detected NVIDIA GPU
 Found 6 NVIDIA device(s) for passthrough
  NVIDIA GPU passthrough configured (6 devices)
  Installed Open WebUI
      Would you like to add Ollama? <y/N> y
  Created Service
  Completed successfully!
    http://192.168.50.40:8080

Alternative: One-Shot Environment Variables

For a single run without modifying any files, pass everything inline:

var_cpu=12 var_ram=20000 var_disk=150 var_gpu=yes var_hostname=openwebui-ollama \
var_net=192.168.50.40/24 var_gateway=192.168.50.1 var_nesting=1 \
bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/openwebui.sh)"

Environment variables have the highest priority: they override both .vars files and built-in defaults, and nothing is written to disk. You still need to select a menu option, but all values arrive pre-populated. That makes them the better fit for one-off runs where a persistent app defaults file is not worth keeping.²³
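The precedence described above (environment, then .vars file, then built-in default) can be illustrated with a small sketch. This is not the framework's code: `resolve_setting` is a hypothetical function, and the built-in default is simply whatever you pass in.

```shell
# Illustration only: NOT the Community-Scripts implementation.
# Resolves one setting with the precedence described in the text:
# environment variable > .vars file > built-in default.
resolve_setting() {
  local name="$1" builtin_default="$2" vars_file="$3" from_file=""
  # 1. A non-empty value already present in the environment wins.
  if [ -n "${!name:-}" ]; then
    echo "${!name}"
    return
  fi
  # 2. Otherwise use the last matching line in the .vars file, if there is one.
  if [ -f "$vars_file" ]; then
    from_file="$(grep "^${name}=" "$vars_file" | tail -n 1 | cut -d= -f2-)"
  fi
  # 3. Fall back to the built-in default.
  echo "${from_file:-$builtin_default}"
}

# Usage (the default 4096 here is a made-up value for the example):
#   var_ram=30000 resolve_setting var_ram 4096 /usr/local/community-scripts/defaults/openwebui.vars
```

Read it as a mental model for debugging surprising values, not as a description of how the menu code is written.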

What the Script Installs

The current community script installs Open WebUI as a uv tool, creates a systemd open-webui.service, and stores its persistent data under /root/.open-webui with optional environment overrides in /root/.env.³

Python 3.12 (via uv)
Open WebUI — web interface on port 8080
Ollama — LLM inference server (if selected)
systemd services — open-webui.service and ollama.service
Config file: /root/.env
Data directory: /root/.open-webui

Do NOT shut down the container yet — GPU passthrough must be configured first.

Configure GPU Passthrough

The source notes are right to treat this as the moment where convenience ends and correctness begins.

If the GPU wiring is wrong, the rest of the stack becomes a very elaborate CPU box.

For the host-side explanation and the full passthrough theory, use GPU Passthrough On Proxmox. The commands below are the workload-specific checks and fallback steps.

The current Community-Scripts build framework can detect Intel, AMD, and NVIDIA GPU device nodes and append dev[n] entries automatically when GPU passthrough is enabled, but you should still verify the resulting container config instead of assuming detection succeeded.²

If GPU Passthrough Didn't Auto-Configure

Check container config:

grep -E '^dev[0-9]+:' /etc/pve/lxc/100.conf

You should see entries like:

dev0: /dev/nvidia0
dev1: /dev/nvidiactl
dev2: /dev/nvidia-uvm
...
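If you would rather have a yes/no answer than read the output by eye, the same check can be wrapped in a function. A minimal sketch, assuming the config path used above; `count_gpu_devs` is a hypothetical name.

```shell
# Hypothetical helper: count devN: passthrough entries in an LXC config file.
# Note: it counts every matching line, including any inside snapshot sections.
count_gpu_devs() {
  grep -cE '^dev[0-9]+:' "$1"
}

# Usage on the Proxmox host (CT 100):
#   [ "$(count_gpu_devs /etc/pve/lxc/100.conf)" -gt 0 ] \
#     || echo "no GPU device entries found in the container config"
```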

If missing, edit the container config to add GPU device passthrough using the Proxmox native syntax:

nano /etc/pve/lxc/100.conf

Add at the end:

# GPU device passthrough (Proxmox native syntax)
dev0: /dev/nvidia0
dev1: /dev/nvidiactl
dev2: /dev/nvidia-uvm
dev3: /dev/nvidia-uvm-tools
dev4: /dev/nvidia-caps/nvidia-cap1
dev5: /dev/nvidia-caps/nvidia-cap2

Save and exit, then restart:

pct stop 100
pct start 100

Important: Use pct stop / pct start (full cycle), not pct reboot. A reboot does not re-read updated LXC config.
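The full stop/start cycle can be combined with a status check so a failed start does not go unnoticed. This is a sketch under the assumption that `pct status` prints a `status: running` line for a running container; `cycle_container` is a hypothetical name.

```shell
# Hypothetical helper: stop, start, then confirm the container reports running.
cycle_container() {
  local ctid="$1"
  # Abort early if either the stop or the start fails.
  pct stop "$ctid" && pct start "$ctid" || return 1
  # Succeeds only if pct reports the container as running.
  pct status "$ctid" | grep -q 'status: running'
}

# Usage on the Proxmox host:
#   cycle_container 100 && echo "CT 100 is back up"
```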

Install NVIDIA Driver in Container

The container shares the host kernel, so only userspace libraries are needed in the container itself. On the application side, Ollama's Linux docs still treat nvidia-smi as the verification point that the CUDA-capable driver stack is available.⁴

Push Driver from Host

# On the Proxmox host — push the .run file downloaded during host GPU setup
pct push 100 /tmp/NVIDIA-Linux-x86_64-580.126.09.run /tmp/NVIDIA-Linux-x86_64-580.126.09.run

Install Inside Container

# Enter the container
pct enter 100
 
# Make executable and install (compute-only, no display libraries)
chmod +x /tmp/NVIDIA-Linux-x86_64-580.126.09.run
 
/tmp/NVIDIA-Linux-x86_64-580.126.09.run \
  --no-kernel-module \
  --no-opengl-files

Flag	Why
`--no-kernel-module`	Host provides the kernel module
`--no-opengl-files`	No display server in container

Verify GPU Access

nvidia-smi

Expected: GPU name, driver version, memory, and temperature displayed.
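For a scriptable version of the same check, `nvidia-smi` can print selected fields as CSV. A minimal sketch to run inside the container; `gpu_summary` is a hypothetical name, and the three query fields are just the ones that seemed most useful here.

```shell
# Hypothetical helper: print GPU name, driver version and total memory as CSV,
# or fail with a message if nvidia-smi is not installed in this environment.
gpu_summary() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: userspace driver is not installed" >&2
    return 1
  fi
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

# Usage inside the container:
#   gpu_summary
```

Running the same function on the host is a quick way to compare driver versions, since the container's userspace libraries come from the same .run file as the host install.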

exit  # Return to host shell

Verify And Access

Confirm Ollama Uses GPU

pct exec 100 -- ollama run llama3.2

First run downloads the model. Once loaded, check GPU utilisation:

# On the host (GPU monitoring)
nvtop

You should see the Ollama process using GPU memory.
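Ollama can also report this itself: `ollama ps` lists loaded models, and as far as I know recent releases include a PROCESSOR column showing the CPU/GPU split. The sketch below assumes that column contains the word `GPU` whenever any part of the model is on the GPU; `model_on_gpu` is a hypothetical name.

```shell
# Hypothetical helper: read `ollama ps` output on stdin and succeed only if
# at least one loaded model reports a GPU share.
# Assumption: the PROCESSOR column contains "GPU" for GPU-resident models.
model_on_gpu() {
  grep -q 'GPU'
}

# Usage from the Proxmox host, while a model is still loaded:
#   pct exec 100 -- ollama ps | model_on_gpu && echo "model is using the GPU"
```

A partial offload would also match, so read the actual percentage when the split matters.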

Access Open WebUI

Open in browser:

http://<container-IP>:8080

To find the container IP:

pct exec 100 -- hostname -I

On first visit, create an admin account. The account is local to the container, and Open WebUI's first account becomes the administrator for that instance.¹

Choose A Secure Exposure Path

Open WebUI should not stay on raw HTTP once other people, remote devices, or browser-integrated features start depending on it.

The detailed decision tree now lives in Secure Service Exposure On Proxmox. Use that subsection to choose one exposure model for the lab instead of letting every workload page grow its own slightly different HTTPS advice.

Cloudflare Tunnel On Proxmox - the shortest secure path when you want outbound-only exposure, no port forwarding, and Cloudflare Access in front of sensitive services.
Nginx Reverse Proxy LXC On Proxmox - the self-hosted reverse-proxy path when you want wildcard certificates and one front door for multiple services.
Individual Certificates On Proxmox With acme.sh - the tighter certificate-isolation path when each service deserves its own certificate lifecycle.

If the lab already exposes several services, make the exposure decision once in that subsection and keep this page focused on the Open WebUI and Ollama workload itself.

Management Commands

Service Control (Inside Container)

# Check status
systemctl status open-webui
systemctl status ollama
 
# Restart services
systemctl restart ollama
systemctl restart open-webui
 
# View logs
journalctl -u ollama -f
journalctl -u open-webui -f

Container Control (From Host)

# Start / Stop / Restart
pct start 100
pct stop 100
pct reboot 100
 
# Enter container shell
pct enter 100
 
# Quick command execution
pct exec 100 -- nvidia-smi
pct exec 100 -- ollama list

Pull Models

# From host
pct exec 100 -- ollama pull llama3.2
pct exec 100 -- ollama pull codellama
 
# List installed models
pct exec 100 -- ollama list

Update Open WebUI

Open WebUI upstream recommends backing up data before upgrades and pinning a release tag when stability matters. The Community-Scripts Open WebUI installer currently manages the app with uv, so use that same mechanism when you update the container.³⁵

# From Proxmox host — take a snapshot first
pct snapshot 100 pre-openwebui-update --description "Before Open WebUI update"
 
# Enter the container
pct enter 100
 
# Stop the service
systemctl stop open-webui.service
 
# Match the community-scripts install method
~/.local/bin/uv tool install --force --python 3.12 --constraint <(echo "numba>=0.60") open-webui[all]
 
# Start the service again
systemctl start open-webui.service
 
# Verify version and health
curl -s http://localhost:8080/api/version
systemctl status open-webui.service

Access Open WebUI in your browser at http://192.168.50.40:8080 to confirm the update was successful.
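Open WebUI can take a while to answer after a restart, so a single `curl` straight after `systemctl start` may fail even when the update went fine. A minimal polling sketch; `wait_for_http` is a hypothetical name, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i
  for ((i = 1; i <= attempts; i++)); do
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "up after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "no response from ${url} after ${attempts} attempts" >&2
  return 1
}

# Usage inside the container, after `systemctl start open-webui.service`:
#   wait_for_http http://localhost:8080/api/version
```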

Troubleshooting

Ollama Command Not Found

If the helper script completes but ollama isn't available:

pct enter 100
 
# Install Ollama manually
curl -fsSL https://ollama.com/install.sh | sh
 
# Start and verify
systemctl enable --now ollama
 
# Verify
ollama -v

The helper script sometimes skips Ollama installation if certain dependencies are missing or network issues occur during the install phase.

Alternative Deployment: Ollama As A Dedicated Inference Server

The source guide also kept an alternate shape for the moment when the combined container starts feeling too convenient for its own good.

If Open WebUI and Ollama deserve different lifecycles, different memory budgets, or cleaner GPU ownership, split them.

That split does not have to stop at the inference layer. If the next step is moving the browser UI out as well, use Open WebUI Standalone Frontend On Proxmox for the frontend-only container and migration flow.

Container Sizing (Standalone)

Setting	Value	Rationale
Container ID	100 (repurposed) or next available	Reuse existing GPU-configured container
Hostname	`ollama`	Reflects inference-only role
Disk Size	150 GB	Model storage (~4–30 GB per model in `~/.ollama/models/`)
CPU Cores	8–12	Tokenization, prompt processing, CPU-offloaded layers
RAM	24000–32000 MiB	VRAM spill for large models + Ollama overhead
OS	Debian 13	Consistent with existing setup
Bridge	`vmbr0`	Same network as OpenWebUI
IPv4	192.168.50.40/24	Standard Ollama IP (OpenWebUI keeps `.30`)
Gateway	`192.168.50.1`	Router
GPU Passthrough	Yes	The whole point — GPU inference
Nesting	Enabled	Required for systemd in Debian 13

Model VRAM Reference

Model	Parameters	Quantisation	VRAM Required
Llama 3.2	3B	Q4_K_M	~2 GB
Llama 3.1	8B	Q4_K_M	~5 GB
Llama 3.1	8B	Q8_0	~9 GB
DeepSeek-R1	14B	Q4_K_M	~9 GB
Qwen 2.5	32B	Q4_K_M	~20 GB
Llama 3.1	70B	Q4_K_M	~40 GB (spills to RAM)

Rule of thumb: If the model fits in 24 GB VRAM, it runs at full GPU speed. If it exceeds VRAM, Ollama automatically offloads layers to CPU/RAM — this works but inference is slower for the CPU-offloaded layers.
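The rule of thumb can be turned into rough arithmetic. The sketch below estimates the size of the weights alone as parameters × bits per weight ÷ 8, then compares a requirement against available VRAM. Both function names are hypothetical, and the 4.5 bits per weight in the usage line is an assumed round figure for a 4-bit quantisation; the table values above are higher because they also cover KV cache and runtime overhead.

```shell
# Hypothetical helper: rough size of the weights alone, in GB.
# Arguments: parameters in billions, average bits per weight.
weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Hypothetical helper: succeed if the required GB fit in the available VRAM GB.
fits_in_vram() {
  awk -v need="$1" -v vram="$2" 'BEGIN { exit !(need <= vram) }'
}

# Usage (4.5 bits per weight is an assumption, not a measured value):
#   weights_gb 8 4.5
#   fits_in_vram 20 24 && echo "fits on a 24 GB card"
#   fits_in_vram 40 24 || echo "will spill to system RAM"
```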

Option A: Repurpose Existing CT 100

pct enter 100
 
# Stop and disable OpenWebUI
systemctl disable --now open-webui.service
 
# Verify it's stopped
systemctl status open-webui
# Expected: inactive (dead), disabled

Clean Up OpenWebUI (Optional — Frees Disk Space)

# Remove OpenWebUI binary and Python environment
# Try uv first (may not be in PATH depending on script version)
uv tool uninstall open-webui 2>/dev/null || \
  ~/.local/bin/uv tool uninstall open-webui 2>/dev/null || \
  echo "uv not found — removing files manually"
 
# Remove uv-managed tool installs and cache
rm -rf /root/.local/share/uv /root/.local/bin/open-webui
 
# Remove data (ONLY after migrating to the new OpenWebUI container!)
rm -rf /root/.open-webui
 
# Remove env file if no longer needed
# (Keep it if it contains Ollama-relevant settings)
rm /root/.env

Change the IP Address

From the Proxmox host:

# Stop the container
pct stop 100
 
# Update the network config
pct set 100 -net0 name=eth0,bridge=vmbr0,ip=192.168.50.40/24,gw=192.168.50.1
 
# Update the hostname
pct set 100 -hostname ollama
 
# Resize resources (optional — give more RAM for inference)
pct set 100 -memory 24000
# CPU cores can stay at 12 or adjust:
# pct set 100 -cores 12
 
# Start the container
pct start 100

Verify inside the container:

pct enter 100
hostname -I
# Expected: 192.168.50.40

Option B: Create A New Container

On the Proxmox VE host shell:

# Back up the existing OpenWebUI vars (if any)
cp /usr/local/community-scripts/defaults/openwebui.vars \
   /usr/local/community-scripts/defaults/openwebui.vars.bak 2>/dev/null
 
# Create the directory (if it doesn't exist)
mkdir -p /usr/local/community-scripts/defaults
 
# Write Ollama-specific vars
cat > /usr/local/community-scripts/defaults/openwebui.vars << 'EOF'
# Ollama Standalone (GPU Inference Server) - App Defaults
# GPU enabled, Ollama will be installed, OpenWebUI disabled after install
var_cpu=12
var_ram=24000
var_disk=150
var_unprivileged=1
var_gpu=yes
var_brg=vmbr0
var_net=192.168.50.40/24
var_gateway=192.168.50.1
var_hostname=ollama
var_os=debian
var_version=13
var_ssh=yes
var_nesting=1
var_protection=yes
var_tags=ai;inference
var_timezone=Australia/Melbourne
var_container_storage=local-zfs
var_template_storage=local
EOF

Run the script again:

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/openwebui.sh)"

Select App Defaults so the container uses the file you just wrote.²³

When prompted "Would you like to add Ollama?" → type Y (accept).

Restore the earlier OpenWebUI vars if you backed them up:

mv /usr/local/community-scripts/defaults/openwebui.vars.bak \
   /usr/local/community-scripts/defaults/openwebui.vars

Disable the OpenWebUI service after the install:

pct enter <CT-ID>
 
systemctl disable --now open-webui.service

Configure Network Listening (Standalone)

If Open WebUI will connect to Ollama over the network instead of through localhost inside the same container, add a systemd override for OLLAMA_HOST and any other runtime tuning you want to keep across restarts. Ollama's Linux docs use systemctl edit ollama for this kind of customization.⁶

pct enter 100
 
systemctl edit ollama.service

Add:

[Service]
Environment="OLLAMA_HOST=0.0.0.0:58008"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_KV_CACHE_TYPE=q8_0"
Environment="OLLAMA_KEEP_ALIVE=5m"

Apply and verify:

systemctl daemon-reload
systemctl restart ollama
 
ss -tlnp | grep 58008

Expected:

LISTEN  0  4096  0.0.0.0:58008  0.0.0.0:*  users:(("ollama",pid=...,fd=...))

If it shows 127.0.0.1 instead of 0.0.0.0, the override didn't apply. Check:

systemctl cat ollama.service
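The socket check only proves Ollama is listening inside the container. To confirm it is reachable from the machine that will run Open WebUI, query the API over the network. A minimal sketch using Ollama's `/api/version` endpoint; `ollama_version_at` is a hypothetical name.

```shell
# Hypothetical helper: ask a remote Ollama instance for its version over HTTP.
# Arguments: host, port (defaults to Ollama's standard 11434).
ollama_version_at() {
  local host="$1" port="${2:-11434}"
  curl -fsS --max-time 5 "http://${host}:${port}/api/version"
}

# Usage from the Open WebUI host, with the address and port configured above:
#   ollama_version_at 192.168.50.40 58008
```

A timeout here with a healthy listener on the container usually points at a firewall or routing issue rather than at Ollama.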

Variable Reference

Variable	Default	Recommended	Effect
`OLLAMA_HOST`	`127.0.0.1:11434`	`0.0.0.0:58008`	Listen on all interfaces for remote access on non-standard port
`OLLAMA_FLASH_ATTENTION`	`0`	`1`	Enables Flash Attention — significantly reduces memory at large context sizes. No quality impact.
`OLLAMA_KV_CACHE_TYPE`	`f16`	`q8_0`	Quantises the KV cache to 8-bit — halves KV cache memory with minimal quality loss. Allows larger context windows or bigger models.
`OLLAMA_KEEP_ALIVE`	`5m`	`5m` or `-1`	How long models stay in VRAM after last request. `5m` = unload after 5 min idle (good when sharing GPU with llama.cpp). `-1` = never unload (good for dedicated Ollama use).
`OLLAMA_NUM_PARALLEL`	`1`	`1`–`4`	Max parallel requests per model. Higher values increase context memory proportionally.
`OLLAMA_MAX_LOADED_MODELS`	`3 × num_GPUs`	`1`	Max models loaded concurrently. Set to `1` if VRAM is tight (single RTX 3090).
`OLLAMA_ORIGINS`	`127.0.0.1,0.0.0.0`	Add OpenWebUI origin if needed	CORS allowed origins (browser-level only, not a security boundary).
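The keep-alive behaviour can also be set per request instead of service-wide: Ollama's API accepts a `keep_alive` field, and a generate request with no prompt simply loads the model. The sketch below assumes a numeric value in seconds (`-1` for "never unload"); `preload_model` is a hypothetical name.

```shell
# Hypothetical helper: load a model and set its keep-alive for this request.
# Arguments: base URL, model name, keep_alive in seconds (default -1 = never unload).
preload_model() {
  local base_url="$1" model="$2" keep_alive="${3:--1}"
  curl -fsS "${base_url}/api/generate" \
    -d "{\"model\": \"${model}\", \"keep_alive\": ${keep_alive}}"
}

# Usage against the standalone server configured above:
#   preload_model http://192.168.50.40:58008 llama3.2
```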

GPU Passthrough On Proxmox — the host-side NVIDIA and LXC groundwork this stack depends on.
Update And Maintenance — where the lab-wide update order and GPU driver version discipline live.
Monitoring And Alerts — the place to watch GPU health and service behavior once this stack is in use.
Container Network Throttling — handy when model downloads or package pulls should keep moving without flattening the rest of the network.
Open WebUI Standalone Frontend On Proxmox — the follow-on page when inference stays remote and the UI deserves its own lightweight guest.

¹ Open WebUI documents both the bundled-Ollama path and the "Ollama on a different server" path in its quick start guide, and notes that the first account created becomes the administrator for that instance: Open WebUI Quick Start.
² The Community-Scripts build framework defines the install menu, the 28-step Advanced wizard, the App Defaults file path, and the GPU passthrough flow that detects devices and writes dev[n] entries into the LXC config: Community-Scripts build.func.
³ The current Open WebUI Community-Scripts entrypoint sets the built-in defaults, installs Open WebUI with uv, creates open-webui.service, stores data in /root/.open-webui, and updates Open WebUI with uv tool install --force ... open-webui[all]: Community-Scripts openwebui.sh.
⁴ Proxmox containers use the host kernel directly, and Ollama's Linux guide treats nvidia-smi as the verification step once NVIDIA drivers are available: Proxmox Container Toolkit, Ollama on Linux.
⁵ Open WebUI's update guide recommends backing up data before upgrades, checking release notes, and pinning a release tag when stability matters: Updating Open WebUI.
⁶ Ollama's Linux docs use systemctl edit ollama for service customization, document the curl -fsSL https://ollama.com/install.sh | sh install/update path, and show the systemd-managed service workflow: Ollama on Linux.
