How to Deploy Onyx Open Source AI Chat with Docker Compose 2025

Onyx is gaining traction as a developer-friendly alternative to proprietary AI chat platforms. If you want to self-host a feature-rich LLM interface with RAG, web search, and agent capabilities without relying on third-party services, Onyx solves that problem. This guide walks you through deploying Onyx using Docker Compose on your local machine or remote server.

Why Deploy Onyx with Docker Compose?

Onyx supports multiple deployment modes—standard and lite—each with different resource requirements. Docker Compose is ideal for developers because it:

  • Handles multi-container orchestration without Kubernetes complexity
  • Runs on macOS, Linux, and Windows with Docker Desktop
  • Requires minimal configuration changes between environments
  • Works for both development and small-to-medium production deployments
  • Plays well with your existing LLM provider (OpenAI, Anthropic, Ollama, etc.)

Onyx's Lite mode uses under 1 GB of memory, making it ideal for local testing or modest server deployments.
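Before picking a mode, it helps to know how much memory your host actually has free. A quick sketch for Linux, reading `/proc/meminfo` (on macOS or Windows, check Docker Desktop's Resources panel instead); the thresholds mirror the figures above:

```shell
# Print available memory and which Onyx mode it comfortably supports (Linux).
# ~1 GB for Lite, 8 GB+ for Standard.
awk '/MemAvailable/ {
  mb = $2 / 1024
  if (mb < 1024)      printf "Tight even for Lite mode (%.0f MB available)\n", mb
  else if (mb < 8192) printf "Enough for Lite mode (%.0f MB available)\n", mb
  else                printf "Enough for Standard mode (%.0f MB available)\n", mb
}' /proc/meminfo
```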

Prerequisites

Before starting, ensure you have:

  • Docker (v20.10+) and Docker Compose (v2.0+) installed
  • 2-4 GB RAM for Lite mode, or 8 GB+ for Standard mode
  • A supported LLM provider:
    • Proprietary: OpenAI, Anthropic Claude, Google Gemini
    • Self-hosted: Ollama, LiteLLM, vLLM
    • Optional: Serper or Google PSE API key for web search
  • Git (to clone the repository)
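If you are scripting the setup, the version requirements above can be checked up front. A small sketch using `sort -V` for the comparison (the parsing of `docker --version` output is an assumption about its usual format):

```shell
# version_ge A B: succeed when version A >= version B (pure shell + sort -V)
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Guard the rest of a deploy script on the installed Docker version
dv=$(docker --version 2>/dev/null | sed -E 's/^Docker version ([0-9.]+).*/\1/')
if version_ge "${dv:-0}" "20.10"; then
  echo "Docker ${dv} meets the 20.10+ requirement"
else
  echo "Docker 20.10+ required (found: ${dv:-none})" >&2
fi
```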

Step 1: Clone the Onyx Repository

Start by cloning the official Onyx repository:

git clone https://github.com/onyx-dot-app/onyx.git
cd onyx

Verify the clone was successful:

ls -la | grep docker

You should see docker-compose.yml or a deployment/ folder containing compose files.

Step 2: Choose Your Deployment Mode

Onyx offers two Docker Compose configurations:

| Mode     | Memory | Best For                        | Features                            |
|----------|--------|---------------------------------|-------------------------------------|
| Lite     | <1 GB  | Local testing, lightweight chat | Chat UI, basic agents               |
| Standard | 8 GB+  | Production, advanced RAG        | Full RAG, deep research, all agents |

For local development or quick testing, use Lite mode. For production with indexing and advanced retrieval, use Standard.
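If you automate the choice, a script can pick the compose file from an environment variable. A sketch, noting that `docker-compose.lite.yml` is a placeholder name: check the repository's deployment/ folder for the actual file names.

```shell
# Select a compose file based on ONYX_LITE; defaults to Lite for safety.
# NOTE: docker-compose.lite.yml is a hypothetical name -- substitute the
# real file from the repository's deployment/ folder.
if [ "${ONYX_LITE:-true}" = "true" ]; then
  COMPOSE_FILE=docker-compose.lite.yml
else
  COMPOSE_FILE=docker-compose.yml
fi
echo "Using compose file: $COMPOSE_FILE"
```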

Step 3: Configure Environment Variables

Create a .env file in the project root:

cat > .env << 'EOF'
# LLM Provider Configuration
# Options: openai, anthropic, ollama, etc.
LLM_PROVIDER=openai
LLM_API_KEY=sk-your-openai-key-here
# Adjust based on your provider
LLM_MODEL=gpt-4o-mini

# Deployment Mode (set to false for standard mode)
ONYX_LITE=true

# Web Search (Optional): serper, google_pse, brave, or none
WEB_SEARCH_PROVIDER=serper
WEB_SEARCH_API_KEY=your-serper-key

# Security
POSTGRES_PASSWORD=secure_password_change_me
JWT_SECRET=your-secret-key-change-me

# Server Settings
PORT=8080
HOST=0.0.0.0
EOF

Important: Replace sk-your-openai-key-here with your actual API key. Comments sit on their own lines because some dotenv parsers treat a trailing # comment as part of the value. For Ollama, set LLM_PROVIDER=ollama; LLM_API_KEY can be left empty since a local Ollama server does not require authentication.
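Before launching, it is worth confirming that no placeholder slipped through. A small sketch; the placeholder patterns and required variable names match the .env template above:

```shell
# check_env FILE: fail if required variables are missing or placeholders remain
check_env() {
  [ -f "$1" ] || { echo "no $1 found"; return 1; }
  if grep -qE 'change_me|your-.*-key-here|your-serper-key' "$1"; then
    echo "placeholders remain in $1"
    return 1
  fi
  for var in LLM_PROVIDER POSTGRES_PASSWORD JWT_SECRET; do
    if ! grep -q "^${var}=" "$1"; then
      echo "missing ${var} in $1"
      return 1
    fi
  done
  echo "$1 looks complete"
}

check_env .env || echo "fix .env before starting the containers" >&2
```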

Step 4: Modify docker-compose.yml for Lite Mode

If using Lite mode, update the Docker Compose file to reduce resource allocation:

services:
  onyx-backend:
    image: onyxdotapp/onyx:latest
    container_name: onyx-backend
    environment:
      - ONYX_LITE=${ONYX_LITE}
      - LLM_PROVIDER=${LLM_PROVIDER}
      - LLM_API_KEY=${LLM_API_KEY}
      - LLM_MODEL=${LLM_MODEL}
    ports:
      - "${PORT}:8080"
    depends_on:
      - postgres
    networks:
      - onyx-network
    restart: unless-stopped

  postgres:
    image: postgres:15-alpine
    container_name: onyx-postgres
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=onyx
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - onyx-network
    restart: unless-stopped

volumes:
  postgres_data:

networks:
  onyx-network:
    driver: bridge
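
If you prefer a fully local stack, Ollama can run as a sibling service on the same network. A sketch of a docker-compose.override.yml, reusing the service and network names from the compose file above (LLM_BASE_URL is an assumed variable name; verify Onyx's exact Ollama settings against its docs):

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: onyx-ollama
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - onyx-network
    restart: unless-stopped

  onyx-backend:
    environment:
      # Point Onyx at the local Ollama API instead of a hosted provider
      # (LLM_BASE_URL is an assumed variable name; check the Onyx docs)
      - LLM_PROVIDER=ollama
      - LLM_BASE_URL=http://ollama:11434

volumes:
  ollama_data:
```

Docker Compose merges an override file automatically, so `docker-compose up -d` picks it up without extra flags.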

Step 5: Launch Onyx with Docker Compose

Build and start the containers:

docker-compose up -d

Monitor the startup process:

docker-compose logs -f onyx-backend

Wait for the message indicating the service is ready (typically 30-60 seconds):

INFO:     Application startup complete
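Rather than watching logs, a deploy script can poll the port until it answers. A sketch, assuming curl is installed and the port from .env:

```shell
# wait_for_url URL [TRIES] [DELAY]: poll until URL responds or give up
wait_for_url() {
  url=$1; tries=${2:-30}; delay=${3:-2}; i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: wait_for_url "http://localhost:8080" && echo "Onyx is up"
```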

Step 6: Access Onyx

Once running, access Onyx at:

http://localhost:8080

You should see the Onyx chat interface. If deploying to a remote server, replace localhost with your server's IP or domain name.

Step 7: Verify LLM Connection

In the Onyx UI, go to Settings → LLM Provider to confirm your model is connected. Try a simple prompt:

"What is the current date?"

If using web search, test it with:

"Search for the latest Docker updates 2025"

Troubleshooting Common Issues

Issue: "Connection refused" or service won't start

Solution: Check logs for errors:

docker-compose logs postgres
docker-compose logs onyx-backend

If PostgreSQL fails to initialize, reset its named volume and start fresh (this deletes any stored data):

docker-compose down -v
docker-compose up -d

If you have switched the compose file to a bind mount (e.g. ./postgres_data), also confirm the directory is writable by the container.

Issue: LLM API key not recognized

Solution: Verify the .env file is loaded:

docker-compose config | grep LLM_API_KEY

If empty, stop containers and restart:

docker-compose down
docker-compose up -d

Issue: Out of memory (OOM) errors

Solution: Ensure Lite mode is enabled in .env:

ONYX_LITE=true

For standard mode, increase Docker Desktop memory allocation to 8GB+.
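
Beyond enabling Lite mode, you can also cap the backend container's memory so a runaway process cannot take down the host. One way to express it in the compose file (the 1g value is illustrative):

```yaml
services:
  onyx-backend:
    deploy:
      resources:
        limits:
          memory: 1g  # illustrative cap; raise for Standard mode
```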

Optional: Add Custom Connectors and Actions

Onyx supports 50+ indexing connectors and MCP (Model Context Protocol) integrations. To add a connector:

  1. Mount a ./connectors/ directory into the backend container (add a volume entry to docker-compose.yml if one isn't already there)
  2. Place your connector config in that directory
  3. Restart the container:
docker-compose restart onyx-backend

Common connectors include Slack, GitHub, Notion, and Google Drive.

Next Steps

  • Review Onyx documentation for custom agent setup
  • Deploy to production using Docker Swarm or Kubernetes (guides available in docs)
  • Set up automated indexing for your knowledge base
  • Configure voice mode and code execution for advanced features

Summary

Deploying Onyx with Docker Compose takes under 10 minutes and gives you a self-hosted AI chat platform with RAG and agent capabilities. Use Lite mode for testing and Standard for production RAG workloads. The containerized approach ensures consistency across environments and simplifies scaling later.
