Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs-omnicoreagent.omnirexfloralabs.com/llms.txt

Use this file to discover all available pages before exploring further.

Getting Started with OmniCoreAgent

Welcome to the OmniCoreAgent learning path. This guide takes you from writing your first line of code to building production-ready, autonomous agents with persistent memory, context management, and guardrails. Follow the examples in order — each one builds on the concepts from the previous.

📚 The Learning Path

#FileKey Concepts
1first_agent.pyThe Basics: Initialize OmniCoreAgent and run a simple query
2agent_with_models.pyModels: Switch providers (OpenAI, Anthropic, Gemini, Groq, Ollama)
3agent_with_local_tools.pyLocal Tools: Register Python functions as agent tools
4agent_with_mcp_tools.pyMCP Integration: Connect to external MCP servers
5agent_with_all_tools.pyHybrid Architecture: Combine local + MCP tools
6agent_with_memory.pyPersistence: Store conversations in Redis, Postgres, MongoDB
7agent_with_memory_switching.pyRuntime Switching: Change memory backends on the fly
8agent_with_events.pyTelemetry: Replay run events, stream progress, and retrieve traces
9agent_with_context_management.py🆕 Context Management: Keep long conversations within a configured context budget
10agent_with_guardrails.py🆕 Guardrails: Protect against prompt injection
11agent_with_metrics.py🆕 Metrics: Track tokens, requests, and latency
12agent_with_sub_agents.py🆕 Sub-Agents: Build multi-agent systems
13agent_configuration.pyAdvanced Config: All settings in one place

🎯 “I just want to…”

GoalExample
Build my first agentfirst_agent.py
Use a different LLM (Claude, Gemini, etc.)agent_with_models.py
Give my agent toolsagent_with_local_tools.py
Connect to MCP serversagent_with_mcp_tools.py
Save conversation historyagent_with_memory.py
Handle long conversationsagent_with_context_management.py
Protect against attacksagent_with_guardrails.py
Track usage for cost estimationagent_with_metrics.py
Build multi-agent systemsagent_with_sub_agents.py

🛠️ Prerequisites

pip install omnicoreagent

# Most hosted model providers need only this key. The cookbook loader reads .env.
echo "LLM_API_KEY=your_key_here" > .env
The examples start with in-memory defaults. Add REDIS_URL, DATABASE_URL, or MONGODB_URI only when you intentionally run the persistence examples.

📖 Key Concepts

Memory with Summarization

"memory_config": {
    "mode": "sliding_window",
    "value": 50,
    "summary": {
        "enabled": True,
        "retention_policy": "keep"
    }
}
Old messages are summarized, not lost.

Context Management

"context_management": {
    "enabled": True,
    "mode": "token_budget",  # or "sliding_window"
    "value": 100000,
    "threshold_percent": 75,
    "strategy": "summarize_and_truncate",
    "preserve_recent": 6
}
Long conversations stay within the context budget you configure.

Choosing the Right Mode

ModeTriggers WhenBest For
sliding_windowMessage count exceeds valueConversational agents with short messages
token_budgetToken count exceeds value × threshold%Tool-heavy agents with large responses
Trade-offs:
sliding_windowtoken_budget
Token efficiency✅ Better (smaller contexts)⚠️ Larger contexts per call
Predictability✅ Consistent behaviorDepends on message size
Large messages⚠️ Can exceed limits✅ Handles safely
Cost✅ Lower cumulativeHigher cumulative
Recommendations:
  • Chatbots / Q&A agents: Use sliding_window with value: 10-20
  • Tool-heavy agents (APIs, web scraping): Use token_budget with value: 8000-16000
  • Mixed workloads: Use token_budget with lower threshold (50-60%)

Tool Response Offloading

"tool_offload": {
    "enabled": True,
    "threshold_tokens": 500,  # Offload if response > 500 tokens
    "max_preview_tokens": 150  # Show first 150 tokens in context
}
Large tool responses are automatically saved into the active workspace artifacts/ area, with only a preview in context. How it works:
  1. Tool returns large response (e.g., web search with 50 results)
  2. Response saved to workspace/artifacts/
  3. Agent sees preview + file reference in context
  4. Agent uses read_artifact() tool to get full content when needed
Token savings example:
Tool ResponseWithout OffloadingWith Offloading
Web search (50 results)~10,000 tokens~200 tokens
Large API response~5,000 tokens~150 tokens
File read (1000 lines)~8,000 tokens~200 tokens
Tool offloading adds 4 artifact tools:
  • read_artifact(artifact_id) - Read full content
  • tail_artifact(artifact_id, lines) - Read last N lines
  • search_artifact(artifact_id, query) - Search within artifact
  • list_artifacts() - List all offloaded artifacts
Workspace files are separate and enabled by default with workspace-scoped command tools such as ls, read_file, write_file, glob, and grep.
💡 Inspired by Cursor’s “dynamic context discovery” and Anthropic’s context engineering patterns

Guardrails

"guardrail_config": {
    "strict_mode": True
}
Built-in protection against prompt injection attacks.

Metrics

metrics = await agent.get_metrics()
# Returns: total_requests, total_tokens, total_request_tokens, total_response_tokens
Track usage for cost control and monitoring.

🚀 Next Steps