LocalAI: One API for All Your AI Models

LocalAI is an API gateway that sits in front of your local AI models and exposes an OpenAI-compatible REST API. If you run multiple LLM backends, LocalAI unifies them under a single endpoint. Write code against the OpenAI SDK, then switch to local models by changing the client's base URL.

The Problem It Solves

Normally, switching from OpenAI to a local model means rewriting the code that talks to the API. LocalAI avoids this: point your client's base_url at LocalAI and your existing code keeps working, as long as it requests a model name that LocalAI serves.

Installation

# Binary (simplest)
curl -s https://raw.githubusercontent.com/mudler/LocalAI/master/hack/localai.sh | bash

# Docker (recommended)
docker pull quay.io/go-skynet/local-ai:latest
docker run -d --name localai -p 8080:8080 -v "$(pwd)/models:/models" quay.io/go-skynet/local-ai:latest

API Usage

LocalAI implements the most commonly used OpenAI endpoints, including chat completions and embeddings:

# Chat completions
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Hello"}]}'

# Embeddings
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "nomic-embed-text", "input": "Hello world"}'
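The same two requests can be built from Python without an SDK. The following is a minimal sketch using only the standard library; `build_request` and `send` are hypothetical helper names, and the base URL and model names are simply the ones from the curl examples above.

```python
import json
import urllib.request

# Assumed: LocalAI listening on localhost:8080, as in the curl examples.
BASE_URL = "http://localhost:8080/v1"


def build_request(path, payload):
    """Build a JSON POST request for an OpenAI-style endpoint (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Same bodies as the curl commands above.
chat_req = build_request(
    "/chat/completions",
    {"model": "llama3", "messages": [{"role": "user", "content": "Hello"}]},
)
embed_req = build_request(
    "/embeddings",
    {"model": "nomic-embed-text", "input": "Hello world"},
)
```

With a server running and the models available, `send(chat_req)["choices"][0]["message"]["content"]` would return the reply text.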

Drop-In OpenAI Replacement

from openai import OpenAI

# The API key is a placeholder; only base_url has to change
client = OpenAI(api_key="not-needed", base_url="http://localhost:8080/v1")

# Everything else stays exactly the same
response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
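One way to keep that switch out of the code entirely is to pick the client settings from the environment. This is only a sketch: `client_kwargs` is a hypothetical helper and `LOCALAI_BASE_URL` is a variable name invented for illustration; `OPENAI_API_KEY` is the variable the OpenAI SDK conventionally reads.

```python
import os


def client_kwargs(environ=os.environ):
    """Return keyword arguments for OpenAI(...), chosen from environment variables.

    LOCALAI_BASE_URL is a hypothetical variable name used for illustration.
    """
    base_url = environ.get("LOCALAI_BASE_URL")
    if base_url:
        # Local mode: the key is a placeholder, as in the example above.
        return {"api_key": "not-needed", "base_url": base_url}
    # Hosted mode: use the real key and let the SDK keep its default base URL.
    return {"api_key": environ.get("OPENAI_API_KEY", "")}
```

The client would then be created with `OpenAI(**client_kwargs())`, so moving between hosted and local models becomes a deployment setting rather than a code change.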

Multiple Backends

Declare each model and the backend that serves it in config.yaml:

preload_models:
  - name: "llama3"
    backend: ollama
    ollama_base_url: http://127.0.0.1:11434
  - name: "mistral-7b"
    backend: llama.cpp
    model_file: mistral-7b-instruct.Q4_K_M.gguf
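From the client's side, the backends are indistinguishable. The sketch below assumes LocalAI selects the backend from the "model" field of the request, and uses the two model names from the config above; `chat_body` is a hypothetical helper.

```python
import json

# Model names taken from the config above; each may be served by a different backend.
MODELS = ["llama3", "mistral-7b"]


def chat_body(model, prompt):
    """Build a chat-completions request body for the given model name."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


# Same endpoint and same body shape for every backend; only "model" differs.
bodies = [chat_body(name, "Hello") for name in MODELS]
for body in bodies:
    print(json.dumps(body))
```

Each body would be posted to the same /v1/chat/completions endpoint.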

Systemd Service

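# Write the unit file as root (tee runs under sudo; a plain shell redirect would not)
sudo tee /etc/systemd/system/localai.service > /dev/null << 'EOF'
[Unit]
Description=LocalAI Service
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/localai --config-path /etc/localai/config.yaml
Restart=always

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd, then enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable --now localai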

Troubleshooting

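Connection refused: Check that the service is running with sudo systemctl status localai

Model not found: List the models the server knows about with curl http://localhost:8080/v1/models and verify that the name in your request matches one of them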
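Both checks can be combined into one small script. This is a rough sketch, assuming /v1/models returns an OpenAI-style list of the form {"data": [{"id": ...}]}; `model_ids` and `diagnose` are hypothetical helper names, and the default URL and model name come from the examples above.

```python
import json
import urllib.request


def model_ids(models_response):
    """Extract model ids from an OpenAI-style list response: {"data": [{"id": ...}]}."""
    return [entry["id"] for entry in models_response.get("data", [])]


def diagnose(base_url="http://localhost:8080/v1", wanted="llama3"):
    """Return a short diagnosis covering the two failure modes above."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
            available = model_ids(json.load(resp))
    except OSError as exc:
        # URLError (including "connection refused") and timeouts are OSError subclasses.
        return f"cannot reach {base_url}: {exc}"
    if wanted not in available:
        return f"model {wanted!r} not found; server lists: {available}"
    return f"model {wanted!r} is available"
```

Calling `print(diagnose())` on the machine running LocalAI reports which of the two problems, if any, applies.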