Getting Started with OmniCoreAgent

Welcome to the OmniCoreAgent learning path. This guide takes you from writing your first line of code to building production-ready, autonomous agents with persistent memory, context management, and guardrails. Follow the examples in order — each one builds on the concepts from the previous.

📚 The Learning Path

#	File	Key Concepts
1	first_agent.py	The Basics: Initialize `OmniCoreAgent` and run a simple query
2	agent_with_models.py	Models: Switch providers (OpenAI, Anthropic, Gemini, Groq, Ollama)
3	agent_with_local_tools.py	Local Tools: Register Python functions as agent tools
4	agent_with_mcp_tools.py	MCP Integration: Connect to external MCP servers
5	agent_with_all_tools.py	Hybrid Architecture: Combine local + MCP tools
6	agent_with_memory.py	Persistence: Store conversations in Redis, Postgres, MongoDB
7	agent_with_memory_switching.py	Runtime Switching: Change memory backends on the fly
8	agent_with_events.py	Telemetry: Replay run events, stream progress, and retrieve traces
9	agent_with_context_management.py	🆕 Context Management: Keep long conversations within a configured context budget
10	agent_with_guardrails.py	🆕 Guardrails: Protect against prompt injection
11	agent_with_metrics.py	🆕 Metrics: Track tokens, requests, and latency
12	agent_with_sub_agents.py	🆕 Sub-Agents: Build multi-agent systems
13	agent_configuration.py	Advanced Config: All settings in one place

🎯 “I just want to…”

Goal	Example
Build my first agent	first_agent.py
Use a different LLM (Claude, Gemini, etc.)	agent_with_models.py
Give my agent tools	agent_with_local_tools.py
Connect to MCP servers	agent_with_mcp_tools.py
Save conversation history	agent_with_memory.py
Handle long conversations	agent_with_context_management.py
Protect against attacks	agent_with_guardrails.py
Track usage for cost estimation	agent_with_metrics.py
Build multi-agent systems	agent_with_sub_agents.py

🛠️ Prerequisites

pip install omnicoreagent

# Most hosted model providers need only this key. The cookbook loader reads .env.
echo "LLM_API_KEY=your_key_here" > .env

The examples start with in-memory defaults. Add REDIS_URL, DATABASE_URL, or MONGODB_URI only when you intentionally run the persistence examples.

📖 Key Concepts

Memory with Summarization

"memory_config": {
    "mode": "sliding_window",
    "value": 50,
    "summary": {
        "enabled": True,
        "retention_policy": "keep"
    }
}

Old messages are summarized, not lost.

Context Management

"context_management": {
    "enabled": True,
    "mode": "token_budget",  # or "sliding_window"
    "value": 100000,
    "threshold_percent": 75,
    "strategy": "summarize_and_truncate",
    "preserve_recent": 6
}

Long conversations stay within the context budget you configure.

Choosing the Right Mode

Mode	Triggers When	Best For
`sliding_window`	Message count exceeds `value`	Conversational agents with short messages
`token_budget`	Token count exceeds `value × threshold%`	Tool-heavy agents with large responses

Trade-offs:

	`sliding_window`	`token_budget`
Token efficiency	✅ Better (smaller contexts)	⚠️ Larger contexts per call
Predictability	✅ Consistent behavior	Depends on message size
Large messages	⚠️ Can exceed limits	✅ Handles safely
Cost	✅ Lower cumulative	Higher cumulative

Recommendations:

Chatbots / Q&A agents: Use sliding_window with value: 10-20
Tool-heavy agents (APIs, web scraping): Use token_budget with value: 8000-16000
Mixed workloads: Use token_budget with lower threshold (50-60%)

Tool Response Offloading

"tool_offload": {
    "enabled": True,
    "threshold_tokens": 500,  # Offload if response > 500 tokens
    "max_preview_tokens": 150  # Show first 150 tokens in context
}

Large tool responses are automatically saved into the active workspace artifacts/ area, with only a preview in context. How it works:

Tool returns large response (e.g., web search with 50 results)
Response saved to workspace/artifacts/
Agent sees preview + file reference in context
Agent uses read_artifact() tool to get full content when needed

Token savings example:

Tool Response	Without Offloading	With Offloading
Web search (50 results)	~10,000 tokens	~200 tokens
Large API response	~5,000 tokens	~150 tokens
File read (1000 lines)	~8,000 tokens	~200 tokens

Tool offloading adds 4 artifact tools:

read_artifact(artifact_id) - Read full content
tail_artifact(artifact_id, lines) - Read last N lines
search_artifact(artifact_id, query) - Search within artifact
list_artifacts() - List all offloaded artifacts

Workspace files are separate and enabled by default with workspace-scoped command tools such as ls, read_file, write_file, glob, and grep.

💡 Inspired by Cursor’s “dynamic context discovery” and Anthropic’s context engineering patterns

Guardrails

"guardrail_config": {
    "strict_mode": True
}

Built-in protection against prompt injection attacks.

Metrics

metrics = await agent.get_metrics()
# Returns: total_requests, total_tokens, total_request_tokens, total_response_tokens

Track usage for cost control and monitoring.

🚀 Next Steps

Workflows: Chain agents together (Sequential, Parallel, Router)
Background Agents: Scheduled autonomous tasks
Production: Metrics and guardrails

​Getting Started with OmniCoreAgent

​📚 The Learning Path

​🎯 “I just want to…”

​🛠️ Prerequisites

​📖 Key Concepts

​Memory with Summarization

​Context Management

​Choosing the Right Mode

​Tool Response Offloading

​Guardrails

​Metrics

​🚀 Next Steps