Documentation Index
Fetch the complete documentation index at: https://docs-omnicoreagent.omnirexfloralabs.com/llms.txt
Use this file to discover all available pages before exploring further.
Getting Started with OmniCoreAgent
Welcome to the OmniCoreAgent learning path. This guide takes you from writing your first line of code to building production-ready, autonomous agents with persistent memory, context management, and guardrails.
Follow the examples in order — each one builds on the concepts from the previous.
📚 The Learning Path
| # | File | Key Concepts |
|---|
| 1 | first_agent.py | The Basics: Initialize OmniCoreAgent and run a simple query |
| 2 | agent_with_models.py | Models: Switch providers (OpenAI, Anthropic, Gemini, Groq, Ollama) |
| 3 | agent_with_local_tools.py | Local Tools: Register Python functions as agent tools |
| 4 | agent_with_mcp_tools.py | MCP Integration: Connect to external MCP servers |
| 5 | agent_with_all_tools.py | Hybrid Architecture: Combine local + MCP tools |
| 6 | agent_with_memory.py | Persistence: Store conversations in Redis, Postgres, MongoDB |
| 7 | agent_with_memory_switching.py | Runtime Switching: Change memory backends on the fly |
| 8 | agent_with_events.py | Telemetry: Replay run events, stream progress, and retrieve traces |
| 9 | agent_with_context_management.py | 🆕 Context Management: Keep long conversations within a configured context budget |
| 10 | agent_with_guardrails.py | 🆕 Guardrails: Protect against prompt injection |
| 11 | agent_with_metrics.py | 🆕 Metrics: Track tokens, requests, and latency |
| 12 | agent_with_sub_agents.py | 🆕 Sub-Agents: Build multi-agent systems |
| 13 | agent_configuration.py | Advanced Config: All settings in one place |
🎯 “I just want to…”
| Goal | Example |
|---|
| Build my first agent | first_agent.py |
| Use a different LLM (Claude, Gemini, etc.) | agent_with_models.py |
| Give my agent tools | agent_with_local_tools.py |
| Connect to MCP servers | agent_with_mcp_tools.py |
| Save conversation history | agent_with_memory.py |
| Handle long conversations | agent_with_context_management.py |
| Protect against attacks | agent_with_guardrails.py |
| Track usage for cost estimation | agent_with_metrics.py |
| Build multi-agent systems | agent_with_sub_agents.py |
🛠️ Prerequisites
pip install omnicoreagent
# Most hosted model providers need only this key. The cookbook loader reads .env.
echo "LLM_API_KEY=your_key_here" > .env
The examples start with in-memory defaults. Add REDIS_URL, DATABASE_URL, or
MONGODB_URI only when you intentionally run the persistence examples.
📖 Key Concepts
Memory with Summarization
"memory_config": {
"mode": "sliding_window",
"value": 50,
"summary": {
"enabled": True,
"retention_policy": "keep"
}
}
Old messages are summarized, not lost.
Context Management
"context_management": {
"enabled": True,
"mode": "token_budget", # or "sliding_window"
"value": 100000,
"threshold_percent": 75,
"strategy": "summarize_and_truncate",
"preserve_recent": 6
}
Long conversations stay within the context budget you configure.
Choosing the Right Mode
| Mode | Triggers When | Best For |
|---|
sliding_window | Message count exceeds value | Conversational agents with short messages |
token_budget | Token count exceeds value × threshold% | Tool-heavy agents with large responses |
Trade-offs:
| sliding_window | token_budget |
|---|
| Token efficiency | ✅ Better (smaller contexts) | ⚠️ Larger contexts per call |
| Predictability | ✅ Consistent behavior | Depends on message size |
| Large messages | ⚠️ Can exceed limits | ✅ Handles safely |
| Cost | ✅ Lower cumulative | Higher cumulative |
Recommendations:
- Chatbots / Q&A agents: Use
sliding_window with value: 10-20
- Tool-heavy agents (APIs, web scraping): Use
token_budget with value: 8000-16000
- Mixed workloads: Use
token_budget with lower threshold (50-60%)
"tool_offload": {
"enabled": True,
"threshold_tokens": 500, # Offload if response > 500 tokens
"max_preview_tokens": 150 # Show first 150 tokens in context
}
Large tool responses are automatically saved into the active workspace artifacts/ area, with only a preview in context.
How it works:
- Tool returns large response (e.g., web search with 50 results)
- Response saved to
workspace/artifacts/
- Agent sees preview + file reference in context
- Agent uses
read_artifact() tool to get full content when needed
Token savings example:
| Tool Response | Without Offloading | With Offloading |
|---|
| Web search (50 results) | ~10,000 tokens | ~200 tokens |
| Large API response | ~5,000 tokens | ~150 tokens |
| File read (1000 lines) | ~8,000 tokens | ~200 tokens |
Tool offloading adds 4 artifact tools:
read_artifact(artifact_id) - Read full content
tail_artifact(artifact_id, lines) - Read last N lines
search_artifact(artifact_id, query) - Search within artifact
list_artifacts() - List all offloaded artifacts
Workspace files are separate and enabled by default with workspace-scoped command
tools such as ls, read_file, write_file, glob, and grep.
💡 Inspired by Cursor’s “dynamic context discovery” and Anthropic’s context engineering patterns
Guardrails
"guardrail_config": {
"strict_mode": True
}
Built-in protection against prompt injection attacks.
Metrics
metrics = await agent.get_metrics()
# Returns: total_requests, total_tokens, total_request_tokens, total_response_tokens
Track usage for cost control and monitoring.
🚀 Next Steps