OmniServe Cookbook

Production-ready API server examples for OmniCoreAgent.

📦 Agent File Requirements

Your agent file must define one of the following:

# Option 1: `agent` variable
agent = OmniCoreAgent(...)

# Option 2: `create_agent()` function
def create_agent():
    return OmniCoreAgent(...)
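
The two conventions above can be resolved by a small loader. The following is a minimal sketch, not OmniServe's actual loading code: `load_agent_from_file` is a hypothetical helper, and the lookup order (module-level `agent` first, then `create_agent()`) is an assumption made for illustration.

```python
import importlib.util
from pathlib import Path


def load_agent_from_file(path):
    """Hypothetical helper: return the agent object defined by an agent file.

    Assumption: a module-level `agent` is preferred over `create_agent()`.
    """
    file_path = Path(path)
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load agent file: {file_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the agent file's top-level code

    if hasattr(module, "agent"):
        return module.agent
    factory = getattr(module, "create_agent", None)
    if callable(factory):
        return factory()
    raise AttributeError(f"{file_path} must define `agent` or `create_agent()`")
```

A file that defines neither name fails with a clear error instead of starting a server with no agent.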

Examples

Example	Description	How to Run
cli_agent.py	Agent file for CLI deployment	`omniserve run --agent cookbook/omniserve/cli_agent.py`
python_api.py	Full Python API with all config options	`python cookbook/omniserve/python_api.py`
real_application_agent.py	Serve the support operations real application harness	`uv run omniserve run --agent cookbook/omniserve/real_application_agent.py`

Quick Start

Option 1: CLI (Zero-code deployment)

# Quickstart with defaults (no agent file needed)
omniserve quickstart --provider openai --model gpt-4o-mini

# Run with your agent file
omniserve run --agent cookbook/omniserve/cli_agent.py --port 8000

# Run a real application harness
uv run omniserve run --agent cookbook/omniserve/real_application_agent.py --port 8000

# With authentication and rate limiting
omniserve run --agent cookbook/omniserve/cli_agent.py \
  --port 8000 \
  --auth-token secret \
  --rate-limit 100

Use the provider that matches your LLM_API_KEY. For an OpenAI key, run omniserve quickstart --provider openai --model gpt-4o-mini.

Option 2: Python API (Programmatic control)

# Run Python script directly
python cookbook/omniserve/python_api.py

[!WARNING] Environment Variable Precedence: Environment variables ALWAYS override values set in OmniServeConfig.
# In code: port=8000
# In environment: OMNICOREAGENT_SERVE_PORT=9000
# Result: Server runs on port 9000 (env wins!)
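
The precedence rule can be illustrated with a small resolver. This is a sketch of the behaviour described above, not the library's implementation: `ServeSettings` and `resolve_settings` are hypothetical names standing in for the real config object, and only two fields are modelled.

```python
import os
from dataclasses import dataclass


@dataclass
class ServeSettings:
    """Hypothetical stand-in for the real config object (two fields only)."""
    host: str = "0.0.0.0"
    port: int = 8000


def resolve_settings(code_settings, environ=None):
    """Return the effective settings: environment values win over code values."""
    environ = os.environ if environ is None else environ
    prefix = "OMNICOREAGENT_SERVE_"
    host = environ.get(prefix + "HOST", code_settings.host)
    port = int(environ.get(prefix + "PORT", code_settings.port))
    return ServeSettings(host=host, port=port)


# Code asks for 8000, environment asks for 9000: the environment wins.
effective = resolve_settings(
    ServeSettings(port=8000),
    environ={"OMNICOREAGENT_SERVE_PORT": "9000"},
)
print(effective.port)
```

In practice this means a stray exported variable in a shell or container can silently change the port, so check the environment first when a server does not bind where the code says it should.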

Environment Variables

Server settings use the OMNICOREAGENT_SERVE_* prefix. Background task settings use the OMNICOREAGENT_BACKGROUND_* prefix. You can start without setting any of these variables: the defaults use port 8000, in-memory background task state, and no auth or rate limiting until you opt in.

Variable	Default	Description
`OMNICOREAGENT_SERVE_HOST`	`0.0.0.0`	Server host. Must not be empty
`OMNICOREAGENT_SERVE_PORT`	`8000`	Server port. Must be `1`-`65535`
`OMNICOREAGENT_SERVE_WORKERS`	`1`	Direct OmniServe worker count. Must be `1`; scale by running multiple processes
`OMNICOREAGENT_SERVE_API_PREFIX`	`""`	API path prefix. Normalized to a leading slash with no trailing slash; whitespace is invalid
`OMNICOREAGENT_SERVE_ENABLE_DOCS`	`true`	Swagger UI at `/docs`
`OMNICOREAGENT_SERVE_ENABLE_REDOC`	`true`	ReDoc at `/redoc`
`OMNICOREAGENT_SERVE_CORS_ENABLED`	`true`	Enable CORS
`OMNICOREAGENT_SERVE_CORS_ORIGINS`	`*`	Allowed origins
`OMNICOREAGENT_SERVE_CORS_METHODS`	`*`	Allowed methods
`OMNICOREAGENT_SERVE_CORS_HEADERS`	`*`	Allowed headers
`OMNICOREAGENT_SERVE_CORS_CREDENTIALS`	`true`	Allow credentials
`OMNICOREAGENT_SERVE_AUTH_ENABLED`	`false`	Enable Bearer auth. Requires a non-empty auth token
`OMNICOREAGENT_SERVE_AUTH_TOKEN`	—	Bearer token value used when auth is enabled
`OMNICOREAGENT_SERVE_RATE_LIMIT_ENABLED`	`false`	Rate limiting
`OMNICOREAGENT_SERVE_RATE_LIMIT_REQUESTS`	`100`	Requests/window. Must be at least `1` when enabled
`OMNICOREAGENT_SERVE_RATE_LIMIT_WINDOW`	`60`	Window seconds. Must be at least `1` when enabled
`OMNICOREAGENT_SERVE_REQUEST_LOGGING`	`true`	Log requests
`OMNICOREAGENT_SERVE_LOG_LEVEL`	`INFO`	Log level: `CRITICAL`, `ERROR`, `WARNING`, `INFO`, `DEBUG`, or `TRACE`
`OMNICOREAGENT_SERVE_REQUEST_TIMEOUT`	`300`	Timeout seconds
`OMNICOREAGENT_BACKGROUND_ENABLED`	`true`	Expose background task endpoints
`OMNICOREAGENT_BACKGROUND_AGENT_ID`	`default`	Agent id for the served agent in background tasks
`OMNICOREAGENT_BACKGROUND_TASK_STORE`	`in_memory`	Background task store backend: `in_memory`, `sql`, `redis`, or `mongodb`
`OMNICOREAGENT_BACKGROUND_TASK_STORE_URL`	—	SQL or Redis task store URL. Use `OMNICOREAGENT_BACKGROUND_TASK_STORE=redis` for Redis URLs
`OMNICOREAGENT_BACKGROUND_TASK_STORE_URI`	—	MongoDB task store URI
`OMNICOREAGENT_BACKGROUND_TASK_STORE_DATABASE`	`omnicoreagent`	MongoDB database name
`OMNICOREAGENT_BACKGROUND_TASK_STORE_PREFIX`	—	Redis key prefix
`OMNICOREAGENT_BACKGROUND_TASK_STORE_COLLECTION_PREFIX`	—	MongoDB collection prefix
`OMNICOREAGENT_BACKGROUND_TASK_STORE_CONNECT_TIMEOUT`	—	Backend connect timeout in seconds
`OMNICOREAGENT_BACKGROUND_START_WORKER`	`true`	Start scheduler and worker loop
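
Two of the validation rules in the table, the port range and the API prefix normalization, can be sketched as plain functions. These are illustrative only: `validate_port` and `normalize_api_prefix` are hypothetical names, and treating any whitespace character anywhere in the prefix as invalid is an assumption about what "whitespace is invalid" means.

```python
def validate_port(value):
    """Return the port as an int, or raise if it is outside 1-65535."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    return port


def normalize_api_prefix(raw):
    """Normalize a prefix to a leading slash and no trailing slash.

    Assumption: any whitespace character in the prefix is rejected.
    An empty prefix stays empty (the documented default).
    """
    if any(ch.isspace() for ch in raw):
        raise ValueError("API prefix must not contain whitespace")
    trimmed = raw.strip("/")
    return f"/{trimmed}" if trimmed else ""


print(normalize_api_prefix("api/v1/"))  # /api/v1
print(validate_port("8000"))            # 8000
```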

API Endpoints

Core Endpoints

Method	Endpoint	Auth	Description
POST	`/run`	Yes*	SSE streaming response
POST	`/run/sync`	Yes*	JSON response
GET	`/health`	No	Health check
GET	`/ready`	No	Readiness check
GET	`/prometheus`	No	Prometheus metrics
GET	`/tools`	Yes*	List available tools
GET	`/metrics`	Yes*	Agent usage metrics
GET	`/events/{session_id}`	Yes*	Replay and follow telemetry events over SSE
GET	`/events/{session_id}/list`	Yes*	Return stored telemetry events as JSON
GET	`/events/{session_id}/trace`	Yes*	Return the latest telemetry trace summary
GET	`/telemetry/events`	Yes*	Return stored telemetry events by trace, run, session, task, or event type; default `limit=200`
GET	`/telemetry/events/stream`	Yes*	Replay and follow telemetry over SSE for a session
GET	`/telemetry/traces`	Yes*	List traces by trace, run, session, task, agent, workflow, model, or status; default `limit=100`
GET	`/telemetry/traces/{trace_id}`	Yes*	Return one exact trace
GET	`/telemetry/runs/{run_id}/trace`	Yes*	Return the latest trace for one run
GET	`/telemetry/sessions/{session_id}/trace`	Yes*	Return the latest trace for one session
GET	`/sessions/{session_id}/history`	Yes*	Return conversation history
GET	`/docs`	No	Swagger UI
GET	`/redoc`	No	ReDoc UI
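
Several endpoints in the table above respond with Server-Sent Events (SSE). A client needs to split the stream into events; the sketch below parses the standard SSE wire format (`event:` and `data:` fields, blank line as event terminator). It makes no assumption about OmniServe's event names or payload shape, and `parse_sse` is a hypothetical helper name.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format allows one optional leading space
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

The function accepts any iterable of text lines, so it can be fed from a decoded HTTP response body or from a list in a test.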

/ready becomes true after server startup finishes, the agent is initialized, and configured MCP servers are connected. Local-only agents with no MCP servers do not need an MCP client for readiness.
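
A deployment script can wait on readiness before sending traffic. The sketch below is one way to do that with the standard library. It assumes that `/ready` answers with HTTP 200 once the server is ready and with an error status or no connection before that; the response body format is not specified here, so it is not inspected. `is_ready` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url, timeout=2.0):
    """Return True if GET /ready answers with HTTP 200 (assumed ready signal)."""
    try:
        with urllib.request.urlopen(f"{base_url}/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call `probe` until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Usage against a local server would look like `wait_until_ready(lambda: is_ready("http://localhost:8000"))`. Passing the probe as a callable keeps the retry loop testable without a running server.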

Background Task Endpoints

These routes are mounted when background execution is enabled. Background execution is enabled by default and can be turned off with OMNICOREAGENT_BACKGROUND_ENABLED=false or OmniServeConfig(background_enabled=False).

Method	Endpoint	Auth	Description
POST	`/background/agents`	Yes*	Register a background agent
GET	`/background/agents`	Yes*	List background agents
GET	`/background/agents/{agent_id}`	Yes*	Inspect a background agent
DELETE	`/background/agents/{agent_id}`	Yes*	Delete a background agent
POST	`/background/tasks`	Yes*	Create a background task
GET	`/background/tasks`	Yes*	List background tasks
GET	`/background/tasks/{task_id}`	Yes*	Inspect a task
PATCH	`/background/tasks/{task_id}`	Yes*	Patch a task
POST	`/background/tasks/{task_id}/run`	Yes*	Queue or synchronously execute a manual run
POST	`/background/tasks/{task_id}/pause`	Yes*	Pause scheduled dispatch
POST	`/background/tasks/{task_id}/resume`	Yes*	Resume scheduled dispatch
DELETE	`/background/tasks/{task_id}`	Yes*	Delete a task
POST	`/background/runs/{run_id}/cancel`	Yes*	Cancel a queued or running run
GET	`/background/runs`	Yes*	List runs
GET	`/background/runs/{run_id}`	Yes*	Inspect run state
GET	`/background/runs/{run_id}/attempts`	Yes*	List run attempts
GET	`/background/runs/{run_id}/events`	Yes*	Replay lifecycle events
GET	`/background/runs/{run_id}/workspace`	Yes*	Inspect run workspace files

*Auth required only if OMNICOREAGENT_SERVE_AUTH_ENABLED=true or --auth-token is set.

POST /background/tasks/{task_id}/run accepts {"wait": true} when the client needs the terminal run state in the response. With the worker enabled, OmniServe waits on the durable run record. With the worker disabled, OmniServe executes the run inline through the background manager execution path.

If the run does not finish within the background wait budget, the response is 504. The wait budget is derived from the configured request timeout and leaves a small margin so that OmniServe can return the structured response before the outer HTTP timeout. The detail payload includes the run_id, latest status, wait_timeout_seconds, and request_timeout_seconds, so the run can still be inspected through /background/runs/{run_id}.
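
On the client side, a waited manual run comes down to building the POST request and, on a 504, recovering the run_id for later inspection. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the 504 body is assumed to be JSON with the fields either under a top-level "detail" key or at the top level.

```python
import json
import urllib.request


def build_manual_run_request(base_url, task_id, token=None, wait=True):
    """Build a POST /background/tasks/{task_id}/run request."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when Bearer auth is enabled on the server.
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"wait": wait}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/background/tasks/{task_id}/run"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_lookup_path(timeout_body):
    """Return the inspection path for the run named in a 504 response body."""
    payload = json.loads(timeout_body)
    # Assumption: fields sit under "detail"; fall back to the top level.
    detail = payload.get("detail", payload)
    return f"/background/runs/{detail['run_id']}"
```

The request object can be passed to `urllib.request.urlopen`. A 504 surfaces there as `urllib.error.HTTPError`, whose body can be read and handed to `run_lookup_path` to keep polling the run.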

Docker Deployment

# Generate Dockerfile
omniserve generate-dockerfile --file cookbook/omniserve/cli_agent.py

# Build and run
docker build -t omnicoreagent-serve .
docker run -p 8000:8000 -e LLM_API_KEY=$LLM_API_KEY omnicoreagent-serve

Cloud Deployment (Cloud Run, AWS Fargate, Railway)

The generated Dockerfile is deterministic. It does not import or execute the agent file. It sets:

AGENT_PATH to the in-container agent path
OMNICOREAGENT_WORKSPACE_BACKEND=local
OMNICOREAGENT_WORKSPACE_DIR=/tmp/workspace

The agent file must be inside the current Docker build context. For S3/R2 workspace persistence, pass the backend and credentials at runtime:

docker run -p 8000:8000 \
  -e LLM_API_KEY=$LLM_API_KEY \
  -e OMNICOREAGENT_WORKSPACE_BACKEND=s3 \
  -e AWS_S3_BUCKET=my-bucket \
  -e AWS_ACCESS_KEY_ID=... \
  -e AWS_SECRET_ACCESS_KEY=... \
  omnicoreagent-serve
