# Data Model and API

## Overview

nanobot does not use a traditional database. All persistent data is stored as **files** on the local filesystem, primarily under `~/.nanobot/`. This document describes the data models, storage formats, configuration schema, and the internal/external APIs.

## Data Storage

### Storage Locations

| Data | Path | Format |
|------|------|--------|
| Configuration | `~/.nanobot/config.json` | JSON |
| Sessions | `~/.nanobot/workspace/sessions/{key}.jsonl` | JSONL (one JSON object per line) |
| Long-term memory | `~/.nanobot/workspace/memory/MEMORY.md` | Markdown |
| History log | `~/.nanobot/workspace/memory/HISTORY.md` | Markdown (append-only) |
| Heartbeat tasks | `~/.nanobot/workspace/HEARTBEAT.md` | Markdown (checkbox list) |
| Cron jobs | In-memory (managed by `CronService`) | Runtime only |
| Skills (workspace) | `~/.nanobot/workspace/skills/{name}/SKILL.md` | Markdown |
| Skills (built-in) | `nanobot/skills/{name}/SKILL.md` | Markdown |
| Bootstrap files | `~/.nanobot/workspace/AGENTS.md`, `SOUL.md`, `USER.md`, `TOOLS.md`, `IDENTITY.md` | Markdown |

### Session Storage (JSONL)

Each session is stored as a `.jsonl` file. The first line is always a metadata record; subsequent lines are messages.

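
The layout just described (one metadata record first, then one message per line) can be read back with a few lines of Python. This is a minimal sketch, not nanobot's actual loader: the function name `read_session_file` is hypothetical, and it assumes the metadata record is identified by `"_type": "metadata"`, as in the example record in this section.

```python
import json
from pathlib import Path
from typing import Any


def read_session_file(path: Path) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Split a session .jsonl file into (metadata, messages).

    Hypothetical helper for illustration; not part of nanobot.
    """
    metadata: dict[str, Any] = {}
    messages: list[dict[str, Any]] = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are tolerated and skipped
            record = json.loads(line)
            if record.get("_type") == "metadata":
                metadata = record  # the metadata record (expected on line 1)
            else:
                messages.append(record)  # every other line is a message
    return metadata, messages
```

Because each record sits on its own line, the file can be processed one line at a time without loading the whole session into a JSON parser at once.
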

**Metadata record** (line 1):

```json
{
  "_type": "metadata",
  "key": "telegram:12345",
  "created_at": "2026-03-01T10:00:00",
  "updated_at": "2026-03-15T14:30:00",
  "metadata": {},
  "last_consolidated": 42
}
```

**Message records** (lines 2+):

```json
{
  "role": "user",
  "content": "What is the weather today?",
  "timestamp": "2026-03-15T14:30:00.123456"
}
```

```json
{
  "role": "assistant",
  "content": null,
  "tool_calls": [{"id": "call_abc", "function": {"name": "web_search", "arguments": "{\"query\": \"weather today\"}"}}],
  "timestamp": "2026-03-15T14:30:01.234567"
}
```

```json
{
  "role": "tool",
  "content": "Sunny, 22°C",
  "tool_call_id": "call_abc",
  "name": "web_search",
  "timestamp": "2026-03-15T14:30:02.345678"
}
```

**Key design decisions**:

- **Append-only**: Messages are never modified or deleted from the list; only `last_consolidated` advances
- **JSONL format**: Each message is a single line for easy streaming and grep-ability
- **Session key**: `{channel}:{chat_id}` (e.g., `telegram:12345`, `cli:default`)

### Memory Storage

Two-layer memory system in `~/.nanobot/workspace/memory/`:

| File | Purpose | Update Pattern |
|------|---------|---------------|
| `MEMORY.md` | Long-term facts about the user | Overwritten during consolidation (LLM rewrites entire content) |
| `HISTORY.md` | Chronological event log | Append-only (new entries added at end) |

**Consolidation process**: When the session exceeds a threshold (`memory_window`), older messages are summarized by the LLM via the `save_memory` tool call, producing:

- A `history_entry`: timestamped paragraph appended to `HISTORY.md`
- A `memory_update`: full updated `MEMORY.md` content

## Core Data Models

### Event Types (`nanobot/bus/events.py`)

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class InboundMessage:
    channel: str                      # "telegram", "discord", "slack", ...
    sender_id: str                    # User identifier
    chat_id: str                      # Chat/channel identifier
    content: str                      # Message text
    timestamp: datetime               # When received
    media: list[str]                  # Media URLs
    metadata: dict[str, Any]          # Channel-specific data
    session_key_override: str | None  # Thread-scoped session key


@dataclass
class OutboundMessage:
    channel: str              # Target channel
    chat_id: str              # Target chat
    content: str              # Response text
    reply_to: str | None      # Reply-to message ID
    media: list[str]          # Media attachments
    metadata: dict[str, Any]  # Channel-specific data
```

### Provider Types (`nanobot/providers/base.py`)

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCallRequest:
    id: str                    # Unique call ID
    name: str                  # Tool name
    arguments: dict[str, Any]  # Tool parameters


@dataclass
class LLMResponse:
    content: str | None                 # Text response
    tool_calls: list[ToolCallRequest]   # Tool call requests
    finish_reason: str                  # "stop", "tool_calls", etc.
    usage: dict[str, int]               # Token usage stats
    reasoning_content: str | None       # DeepSeek-R1, Kimi reasoning
    thinking_blocks: list[dict] | None  # Anthropic extended thinking
```

### Session Model (`nanobot/session/manager.py`)

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class Session:
    key: str                        # "channel:chat_id"
    messages: list[dict[str, Any]]  # Append-only message list
    created_at: datetime
    updated_at: datetime
    metadata: dict[str, Any]
    last_consolidated: int          # Index of last consolidated message
```

### Cron Types (`nanobot/cron/types.py`)

```python
from dataclasses import dataclass


@dataclass
class CronSchedule:
    kind: str             # "at" | "every" | "cron"
    at_ms: int | None     # Absolute timestamp (for "at")
    every_ms: int | None  # Interval (for "every")
    expr: str | None      # Cron expression (for "cron")
    tz: str | None        # Timezone (for "cron")


# CronJobState and CronPayload are not shown in this document.
@dataclass
class CronJob:
    id: str
    description: str
    schedule: CronSchedule
    state: CronJobState
    payload: CronPayload
```

## Configuration Schema

The root `Config` object (`nanobot/config/schema.py`) is a Pydantic `BaseSettings` model:

```
Config
├── agents: AgentsConfig
│   └── defaults: AgentDefaults
│       ├── workspace: str = "~/.nanobot/workspace"
│       ├── model: str = "anthropic/claude-opus-4-5"
│       ├── provider: str = "auto"
│       ├── max_tokens: int = 8192
│       ├── temperature: float = 0.1
│       ├── max_tool_iterations: int = 40
│       ├── memory_window: int = 100
│       └── reasoning_effort: str | None
├── channels: ChannelsConfig
│   ├── send_progress: bool = True
│   ├── send_tool_hints: bool = False
│   ├── telegram: TelegramConfig
│   ├── discord: DiscordConfig
│   ├── whatsapp: WhatsAppConfig
│   ├── slack: SlackConfig
│   ├── feishu: FeishuConfig
│   ├── dingtalk: DingTalkConfig
│   ├── qq: QQConfig
│   ├── email: EmailConfig
│   ├── matrix: MatrixConfig
│   └── mochat: MochatConfig
├── providers: ProvidersConfig
│   ├── custom: ProviderConfig
│   ├── anthropic: ProviderConfig
│   ├── openai: ProviderConfig
│   ├── openrouter: ProviderConfig
│   ├── deepseek: ProviderConfig
│   ├── groq: ProviderConfig
│   ├── gemini: ProviderConfig
│   ├── moonshot: ProviderConfig
│   ├── zhipu: ProviderConfig
│   ├── dashscope: ProviderConfig
│   ├── minimax: ProviderConfig
│   ├── aihubmix: ProviderConfig
│   ├── siliconflow: ProviderConfig
│   ├── volcengine: ProviderConfig
│   ├── vllm: ProviderConfig
│   ├── openai_codex: ProviderConfig
│   └── github_copilot: ProviderConfig
├── gateway: GatewayConfig
│   ├── host: str = "0.0.0.0"
│   ├── port: int = 18790
│   └── heartbeat: HeartbeatConfig
│       ├── enabled: bool = True
│       └── interval_s: int = 1800
└── tools: ToolsConfig
    ├── restrict_to_workspace: bool = False
    ├── web: WebToolsConfig
    │   ├── proxy: str | None
    │   └── search: WebSearchConfig
    │       ├── api_key: str
    │       └── max_results: int = 5
    ├── exec: ExecToolConfig
    │   ├── timeout: int = 60
    │   └── path_append: str
    └── mcp_servers: dict[str, MCPServerConfig]
```

**Config loading**: `~/.nanobot/config.json` → `json.load()` → `_migrate_config()` → `Config.model_validate()`.

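
The loading chain above can be sketched with the standard library alone. This is an illustration under stated assumptions, not nanobot's code: `migrate_sketch` stands in for `_migrate_config()` (its real rules are not described here, so the legacy top-level `port` key it rewrites is invented), and a plain dataclass `GatewaySketch` stands in for Pydantic's `Config.model_validate()`.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class GatewaySketch:
    """Stand-in for the validated gateway section; defaults copied from the schema tree."""
    host: str = "0.0.0.0"
    port: int = 18790


def migrate_sketch(raw: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical migration: move an invented legacy top-level "port" under "gateway"."""
    data = dict(raw)  # shallow copy so the caller's dict is not modified
    if "port" in data:
        gateway = dict(data.get("gateway", {}))
        gateway["port"] = data.pop("port")
        data["gateway"] = gateway
    return data


def validate_sketch(data: dict[str, Any]) -> GatewaySketch:
    """Stand-in for Config.model_validate(): build the object and check field types.

    Unknown keys under "gateway" raise TypeError in this sketch.
    """
    cfg = GatewaySketch(**data.get("gateway", {}))
    if not isinstance(cfg.host, str) or not isinstance(cfg.port, int):
        raise TypeError("gateway.host must be str and gateway.port must be int")
    return cfg


def load_config_sketch(path: Path) -> GatewaySketch:
    """File -> json.load() -> migration -> validation, mirroring the chain above."""
    with path.open(encoding="utf-8") as f:
        raw = json.load(f)
    return validate_sketch(migrate_sketch(raw))
```

Running the migration before validation means the schema only ever has to describe the current layout; older files are rewritten into that layout first.
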

**Key feature**: The `alias_generator=to_camel` on the `Base` model means both `"apiKey"` (camelCase) and `"api_key"` (snake_case) are accepted in JSON config, enabling compatibility with Claude Desktop / Cursor MCP configs.

## Tool API (OpenAI Function Calling Format)

All tools are exposed to the LLM as OpenAI-format function schemas. Example:

```json
{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Read the contents of a file.",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {
          "type": "string",
          "description": "The file path to read."
        }
      },
      "required": ["path"]
    }
  }
}
```

### Tool Parameters Summary

| Tool | Required Parameters | Optional Parameters |
|------|-------------------|-------------------|
| `read_file` | `path: string` | None |
| `write_file` | `path: string`, `content: string` | None |
| `edit_file` | `path: string`, `old_text: string`, `new_text: string` | None |
| `list_dir` | `path: string` | None |
| `exec` | `command: string` | None |
| `web_search` | `query: string` | None |
| `web_fetch` | `url: string` | None |
| `message_user` | `content: string` | None |
| `spawn` | `task: string` | None |
| `cron` | varies by action | `action`, `id`, `description`, `schedule` |

## External API Integrations

nanobot does not expose its own HTTP API. It connects to external services:

| Service | Protocol | Purpose |
|---------|----------|---------|
| LLM providers | HTTPS (OpenAI-compatible) | Chat completions with tool calling |
| Brave Search | HTTPS | Web search results |
| MCP servers | stdio / HTTP (SSE) | External tool execution |
| Chat platforms | WebSocket / HTTP long-poll / IMAP | Message exchange |

## Related Documentation

- [Architecture](02-architecture.md): Component design
- [Workflows](03-workflows.md): Data flow through the system
- [Conventions](05-conventions.md): Code conventions

---

**Last Updated**: 2026-03-15

**Version**: 1.0