OmniServe — Production API Server

Turn any agent into a production-ready REST/SSE API with a single command.

Agent File Requirements

Your Python file must define one of the following:
# Option 1: Define an `agent` variable
from omnicoreagent import OmniCoreAgent

agent = OmniCoreAgent(
    name="MyAgent",
    system_instruction="You are a helpful assistant.",
    model_config={"provider": "gemini", "model": "gemini-2.0-flash"},
)
# Option 2: Define a `create_agent()` function
from omnicoreagent import OmniCoreAgent

def create_agent():
    """Factory function that returns an agent instance."""
    return OmniCoreAgent(
        name="MyAgent",
        system_instruction="You are a helpful assistant.",
        model_config={"provider": "gemini", "model": "gemini-2.0-flash"},
    )
OmniServe looks for an `agent` variable first, then a `create_agent()` function. Your file must export one of these.
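
Conceptually, the loader resolves your file roughly like the sketch below (the names here are illustrative, not OmniServe's actual internals):
import importlib.util

def load_agent(path: str):
    spec = importlib.util.spec_from_file_location("agent_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    if hasattr(module, "agent"):         # 1. a module-level `agent` wins
        return module.agent
    if hasattr(module, "create_agent"):  # 2. otherwise call the factory
        return module.create_agent()
    raise RuntimeError(f"{path} defines neither `agent` nor `create_agent()`")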

Quick Start

Step 1: Create your agent file (my_agent.py)

from omnicoreagent import OmniCoreAgent, ToolRegistry

tools = ToolRegistry()

@tools.register_tool("greet")
def greet(name: str) -> str:
    """Greet someone by name."""
    return f"Hello, {name}!"

@tools.register_tool("calculate")
def calculate(expression: str) -> dict:
    """Evaluate a math expression."""
    import math
    result = eval(expression, {"__builtins__": {}}, {"sqrt": math.sqrt, "pi": math.pi})
    return {"expression": expression, "result": result}

agent = OmniCoreAgent(
    name="MyAgent",
    system_instruction="You are a helpful assistant with access to greeting and calculation tools.",
    model_config={"provider": "gemini", "model": "gemini-2.0-flash"},
    local_tools=tools,
)

Step 2: Set environment variables

echo "LLM_API_KEY=your_api_key_here" > .env

Step 3: Run the server

omniserve run --agent my_agent.py

Step 4: Test the API

# Health check
curl http://localhost:8000/health

# Run a query (sync)
curl -X POST http://localhost:8000/run/sync \
  -H "Content-Type: application/json" \
  -d '{"query": "Greet Alice and calculate 2+2"}'

# Run a query (streaming SSE)
curl -X POST http://localhost:8000/run \
  -H "Content-Type: application/json" \
  -d '{"query": "What is sqrt(144)?"}'

# Open interactive docs
open http://localhost:8000/docs
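
You can exercise the same endpoints from Python as well. A minimal client sketch using the requests library (the exact shape of each SSE event may vary, so adapt the parsing to what your server emits):
import requests

# Blocking call: returns a single JSON document
print(requests.post(
    "http://localhost:8000/run/sync",
    json={"query": "Greet Alice"},
).json())

# Streaming call: Server-Sent Events arrive as `data: ...` lines
with requests.post(
    "http://localhost:8000/run",
    json={"query": "What is sqrt(144)?"},
    stream=True,
) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if line.startswith("data:"):
            print(line.removeprefix("data:").strip())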

CLI Commands

Command                         Description
omniserve run                   Run your agent file as an API server
omniserve quickstart            Zero-code server with defaults
omniserve config                View or generate configuration
omniserve generate-dockerfile   Generate a production Dockerfile

CLI Options: omniserve run

omniserve run \
  --agent my_agent.py \        # Path to agent file (required)
  --host 0.0.0.0 \             # Host to bind (default: 0.0.0.0)
  --port 8000 \                # Port to bind (default: 8000)
  --workers 1 \                # Worker processes (default: 1)
  --auth-token YOUR_TOKEN \    # Enable Bearer token auth
  --rate-limit 100 \           # Rate limit (requests per minute)
  --cors-origins "*" \         # Comma-separated CORS origins
  --no-docs \                  # Disable Swagger UI
  --reload                     # Enable hot reload (development)
Examples:
# Basic run
omniserve run --agent my_agent.py

# With authentication
omniserve run --agent my_agent.py --auth-token secret123

# With rate limiting
omniserve run --agent my_agent.py --rate-limit 100

# Production settings
omniserve run --agent my_agent.py \
  --port 8000 \
  --auth-token $AUTH_TOKEN \
  --rate-limit 100 \
  --cors-origins "https://myapp.com,https://api.myapp.com"

# Development with hot reload
omniserve run --agent my_agent.py --reload

CLI Options: omniserve quickstart

Start a server instantly without writing any code:
omniserve quickstart \
  --provider openai \          # LLM provider (openai, gemini, anthropic)
  --model gpt-4o \             # Model name
  --name QuickAgent \          # Agent name (default: QuickAgent)
  --instruction "You are..." \ # System instruction
  --port 8000                  # Port (default: 8000)
Examples:
# OpenAI
omniserve quickstart --provider openai --model gpt-4o

# Google Gemini
omniserve quickstart --provider gemini --model gemini-2.0-flash

# Anthropic Claude
omniserve quickstart --provider anthropic --model claude-3-5-sonnet-20241022

API Endpoints

Method  Endpoint     Auth  Description
POST    /run         Yes*  SSE streaming response
POST    /run/sync    Yes*  JSON response (blocking)
GET     /health      No    Health check
GET     /ready       No    Readiness check
GET     /prometheus  No    Prometheus metrics
GET     /tools       Yes*  List available tools
GET     /metrics     Yes*  Agent usage metrics
GET     /docs        No    Swagger UI
GET     /redoc       No    ReDoc UI

*Auth required only if --auth-token is set.
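
The unauthenticated probes make liveness checks easy to script; a small sketch using requests:
import requests

for probe in ("/health", "/ready"):
    r = requests.get(f"http://localhost:8000{probe}", timeout=5)
    print(probe, r.status_code)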

Request/Response Examples

# Sync request (with auth)
curl -X POST http://localhost:8000/run/sync \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"query": "What is 2+2?", "session_id": "user123"}'

# Response:
# {"response": "2+2 equals 4", "session_id": "user123", ...}

# Streaming SSE request
curl -X POST http://localhost:8000/run \
  -H "Content-Type: application/json" \
  -d '{"query": "Explain quantum computing"}'

# List tools
curl http://localhost:8000/tools \
  -H "Authorization: Bearer YOUR_TOKEN"

Environment Variables

All settings can be configured via environment variables with the OMNISERVE_ prefix. Environment variables always override code values.
Variable                        Default   Description
OMNISERVE_HOST                  0.0.0.0   Server host
OMNISERVE_PORT                  8000      Server port
OMNISERVE_WORKERS               1         Worker processes
OMNISERVE_API_PREFIX            ""        API path prefix (e.g., /api/v1)
OMNISERVE_ENABLE_DOCS           true      Swagger UI at /docs
OMNISERVE_ENABLE_REDOC          true      ReDoc at /redoc
OMNISERVE_CORS_ENABLED          true      Enable CORS
OMNISERVE_CORS_ORIGINS          *         Allowed origins (comma-separated)
OMNISERVE_CORS_CREDENTIALS      true      Allow credentials
OMNISERVE_AUTH_ENABLED          false     Enable Bearer token auth
OMNISERVE_AUTH_TOKEN            (unset)   Bearer token value
OMNISERVE_RATE_LIMIT_ENABLED    false     Enable rate limiting
OMNISERVE_RATE_LIMIT_REQUESTS   100       Requests per window
OMNISERVE_RATE_LIMIT_WINDOW     60        Window in seconds
OMNISERVE_REQUEST_LOGGING       true      Log requests
OMNISERVE_LOG_LEVEL             INFO      Log level (DEBUG/INFO/WARNING/ERROR)
OMNISERVE_REQUEST_TIMEOUT       300       Request timeout in seconds
Example .env file:
# Required
LLM_API_KEY=your_api_key_here

# OmniServe settings
OMNISERVE_PORT=8000
OMNISERVE_AUTH_ENABLED=true
OMNISERVE_AUTH_TOKEN=my-secret-token
OMNISERVE_RATE_LIMIT_ENABLED=true
OMNISERVE_RATE_LIMIT_REQUESTS=100
OMNISERVE_CORS_ORIGINS=https://myapp.com,https://api.myapp.com

Docker Deployment

Generate a Dockerfile

omniserve generate-dockerfile --file my_agent.py

Build and run

docker build -t omniserver .
docker run -p 8000:8000 -e LLM_API_KEY=$LLM_API_KEY omniserver
Smart Configuration — The generator inspects your agent and configures storage automatically:
Your Agent Uses   Dockerfile Sets
No memory tools   AGENT_PATH, OMNICOREAGENT_ARTIFACTS_DIR
Local memory      + OMNICOREAGENT_MEMORY_DIR=/tmp/memories
S3/R2 memory      Pass credentials at runtime with -e

Cloud deployment examples

# Local memory (ephemeral)
docker run -p 8000:8000 -e LLM_API_KEY=$LLM_API_KEY omniserver

# AWS S3 memory (persistent)
docker run -p 8000:8000 \
  -e LLM_API_KEY=$LLM_API_KEY \
  -e AWS_S3_BUCKET=my-bucket \
  -e AWS_ACCESS_KEY_ID=... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=us-east-1 \
  omniserver

# Cloudflare R2 memory (persistent)
docker run -p 8000:8000 \
  -e LLM_API_KEY=$LLM_API_KEY \
  -e R2_BUCKET_NAME=my-bucket \
  -e R2_ACCOUNT_ID=... \
  -e R2_ACCESS_KEY_ID=... \
  -e R2_SECRET_ACCESS_KEY=... \
  omniserver

Python API (Programmatic Control)

For full programmatic control, use OmniServe directly in your Python script:
from omnicoreagent import OmniCoreAgent, OmniServe, OmniServeConfig, ToolRegistry

tools = ToolRegistry()

@tools.register_tool("get_time")
def get_time() -> dict:
    from datetime import datetime
    return {"time": datetime.now().isoformat()}

agent = OmniCoreAgent(
    name="MyAgent",
    system_instruction="You are a helpful assistant.",
    model_config={"provider": "gemini", "model": "gemini-2.0-flash"},
    local_tools=tools,
)

config = OmniServeConfig(
    host="0.0.0.0",
    port=8000,
    auth_enabled=True,
    auth_token="my-secret-token",
    rate_limit_enabled=True,
    rate_limit_requests=100,
    rate_limit_window=60,
    cors_origins=["*"],
    enable_docs=True,
)

if __name__ == "__main__":
    server = OmniServe(agent, config=config)
    server.start()
Run with Python directly:
echo "LLM_API_KEY=your_api_key" > .env
python server.py
CLI vs Python API:
  • omniserve run --agent my_agent.py — CLI loads your agent file and applies CLI flags
  • python server.py — You control everything programmatically via OmniServeConfig
Environment Variable Precedence: .env variables always override values set in OmniServeConfig.
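For example, if .env contains OMNISERVE_PORT=9000 while your code asks for 8000, the server binds to 9000:
# .env contains: OMNISERVE_PORT=9000
config = OmniServeConfig(port=8000)        # code requests 8000
OmniServe(agent, config=config).start()    # binds 9000: the .env value wins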

Advanced: Resilience Patterns

The retry and circuit-breaker helpers can also be imported for use in your own code:
from omnicoreagent import RetryConfig, CircuitBreaker, with_retry

# Retry up to 5 times with exponential backoff between attempts
@with_retry(RetryConfig(max_retries=5, strategy="exponential"))
async def call_external_api():
    ...

# Named breaker: trips after 3 consecutive failures, with a 60-second timeout
breaker = CircuitBreaker("api", failure_threshold=3, timeout=60)
async with breaker:
    result = await risky_call()
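
The two primitives compose; a minimal sketch, with risky_call standing in for your own flaky dependency (the stub and retry values are illustrative):
import asyncio
import random
from omnicoreagent import RetryConfig, CircuitBreaker, with_retry

async def risky_call():
    # Stand-in for an unreliable upstream service
    if random.random() < 0.5:
        raise ConnectionError("upstream unavailable")
    return "ok"

breaker = CircuitBreaker("payments", failure_threshold=3, timeout=60)

@with_retry(RetryConfig(max_retries=3, strategy="exponential"))
async def call_with_protection():
    # A tripped breaker fails fast, so retries stop hammering the dependency
    async with breaker:
        return await risky_call()

print(asyncio.run(call_with_protection()))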
OmniServe is perfect for deploying agents as microservices, webhooks, chatbots, or any HTTP-accessible AI capability.

Learn More: See the OmniServe Cookbook for more examples.