# Testing Strategy

## Overview
nanobot uses pytest with pytest-asyncio for testing. The test suite focuses on unit testing of core components with lightweight test doubles (no heavy mocking frameworks). Tests are designed to run fast without external services.
## Test Infrastructure

### Framework
| Tool | Version | Purpose |
|---|---|---|
| pytest | ≥ 9.0 | Test runner and assertions |
| pytest-asyncio | ≥ 1.3 | Async test support |
| ruff | ≥ 0.1 | Linting (not testing, but part of CI quality) |
### Configuration (pyproject.toml)

```toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
```

- `asyncio_mode = "auto"`: all async test functions are automatically treated as async tests, so `@pytest.mark.asyncio` is not required on each test (though the codebase still applies it explicitly for clarity).
- `testpaths = ["tests"]`: test discovery starts from the `tests/` directory.
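With `asyncio_mode = "auto"`, a bare `async def` test function is collected and awaited by pytest without any decorator. A minimal illustration (the test body is hypothetical, not taken from the nanobot suite):

```python
import asyncio

# Under asyncio_mode = "auto", pytest awaits this coroutine as a test
# even though it carries no @pytest.mark.asyncio decorator.
async def test_gather_preserves_order() -> None:
    async def double(x: int) -> int:
        return x * 2

    results = await asyncio.gather(double(1), double(2))
    assert results == [2, 4]
```

Outside pytest, the same coroutine can still be driven manually with `asyncio.run(test_gather_preserves_order())`.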
### Running Tests

```bash
# Run all tests
pytest -s tests/

# Run a specific test file
pytest -s tests/test_tool_validation.py

# Run tests matching a pattern
pytest -k "test_heartbeat" tests/

# Run with verbose output
pytest -v tests/

# Run with coverage (if coverage is installed)
pytest --cov=nanobot tests/
```
## Test Organization

### Test Files
| File | Tests | Component |
|---|---|---|
|  | Tool parameter validation, JSON Schema checks |  |
|  | Heartbeat idempotency, LLM decision handling |  |
|  | Cron job scheduling, firing, cancellation |  |
|  | Cron command parsing and dispatch |  |
|  | CLI command parsing |  |
|  | CLI input handling |  |
|  | Prompt cache optimization |  |
|  | Agent loop turn saving |  |
|  | Message tool behavior |  |
|  | Message suppression logic |  |
|  | Email channel parsing |  |
|  | Matrix channel integration |  |
|  | Feishu message formatting |  |
|  | Memory consolidation edge cases |  |
|  | Consolidation offset tracking |  |
|  | Task cancellation logic |  |
## Testing Patterns

### Test Doubles
The codebase uses simple, custom test doubles rather than heavy mocking:
**`DummyProvider`** — a minimal LLM provider that returns pre-configured responses:

```python
class DummyProvider:
    def __init__(self, responses: list[LLMResponse]):
        self._responses = list(responses)

    async def chat(self, *args, **kwargs) -> LLMResponse:
        if self._responses:
            return self._responses.pop(0)
        return LLMResponse(content="", tool_calls=[])
```
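Because responses are popped in order, a single `DummyProvider` can script a multi-turn exchange. A self-contained demonstration (`LLMResponse` here is a stand-in dataclass with hypothetical fields; the real type comes from nanobot):

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class LLMResponse:  # stand-in for nanobot's response type (hypothetical fields)
    content: str
    tool_calls: list = field(default_factory=list)

class DummyProvider:
    def __init__(self, responses: list[LLMResponse]):
        self._responses = list(responses)

    async def chat(self, *args, **kwargs) -> LLMResponse:
        # Scripted responses are returned first-in, first-out;
        # once exhausted, an empty reply is returned.
        if self._responses:
            return self._responses.pop(0)
        return LLMResponse(content="", tool_calls=[])

async def demo() -> list[str]:
    provider = DummyProvider([LLMResponse("first"), LLMResponse("second")])
    return [(await provider.chat()).content for _ in range(3)]

print(asyncio.run(demo()))  # ['first', 'second', '']
```

The empty-reply fallback keeps tests from hanging when the code under test makes more LLM calls than the test scripted.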
**`SampleTool`** — a minimal tool for testing parameter validation:

```python
class SampleTool(Tool):
    @property
    def name(self) -> str:
        return "sample"

    @property
    def parameters(self) -> dict[str, Any]:
        return {
            "type": "object",
            "properties": {
                "query": {"type": "string", "minLength": 2},
                "count": {"type": "integer", "minimum": 1, "maximum": 10},
            },
            "required": ["query", "count"],
        }

    async def execute(self, **kwargs: Any) -> str:
        return "ok"
```
### Async Testing

All async tests use the `@pytest.mark.asyncio` decorator and can take `tmp_path` for an isolated filesystem:

```python
@pytest.mark.asyncio
async def test_heartbeat_decide_skip(tmp_path) -> None:
    provider = DummyProvider([LLMResponse(content="no tool call", tool_calls=[])])
    service = HeartbeatService(workspace=tmp_path, provider=provider, model="test")
    action, tasks = await service._decide("heartbeat content")
    assert action == "skip"
```
### Filesystem Isolation

Tests that touch the filesystem use pytest's `tmp_path` fixture:

```python
@pytest.mark.asyncio
async def test_session_persistence(tmp_path) -> None:
    manager = SessionManager(tmp_path)
    session = manager.get_or_create("test:123")
    session.add_message("user", "Hello")
    manager.save(session)

    # Reload from disk
    manager2 = SessionManager(tmp_path)
    session2 = manager2.get_or_create("test:123")
    assert len(session2.messages) == 1
```
### Validation Testing

Tool parameter validation is tested thoroughly with edge cases:

```python
def test_validate_params_missing_required() -> None:
    tool = SampleTool()
    errors = tool.validate_params({"query": "hi"})
    assert "missing required count" in "; ".join(errors)


def test_validate_params_nested_object_and_array() -> None:
    tool = SampleTool()
    errors = tool.validate_params({
        "query": "hi", "count": 2,
        "meta": {"flags": [1, "ok"]},
    })
    assert any("missing required meta.tag" in e for e in errors)
    assert any("meta.flags[0] should be string" in e for e in errors)
```

(The nested-object test exercises a schema that also declares a `meta` object with a required `tag` field and a `flags` array of strings; the abbreviated `SampleTool` snippet above shows only `query` and `count`.)
## What Is Tested

### Core Agent

- **Tool parameter validation**: JSON Schema validation (types, ranges, enums, nested objects, arrays, required fields)
- **Tool registry**: tool lookup, error messages for missing tools, validation error propagation
- **Agent loop**: turn saving, session persistence
- **Context building**: prompt cache optimization
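The registry behaviors above can be sketched against a stand-in registry (the `ToolRegistry` shape here is hypothetical; nanobot's real class may differ):

```python
class ToolRegistry:  # hypothetical stand-in, for illustration only
    def __init__(self) -> None:
        self._tools: dict[str, object] = {}

    def register(self, name: str, tool: object) -> None:
        self._tools[name] = tool

    def get(self, name: str) -> object:
        if name not in self._tools:
            # The error message names the missing tool so tests can assert on it.
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name]

def test_registry_reports_missing_tool_by_name() -> None:
    registry = ToolRegistry()
    registry.register("sample", object())
    try:
        registry.get("nope")
    except KeyError as exc:
        assert "nope" in str(exc)
    else:
        raise AssertionError("expected KeyError for unknown tool")

test_registry_reports_missing_tool_by_name()
```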
### Memory & Sessions

- **Memory consolidation**: LLM-driven consolidation, offset tracking, edge cases with different argument types
- **Session management**: message append, history retrieval, orphaned tool result cleanup

### Channels

- **Email**: IMAP message parsing, SMTP formatting
- **Matrix**: room event handling
- **Feishu**: post content formatting (rich text)

### Services

- **Heartbeat**: idempotent start, skip/run decisions, LLM tool call handling
- **Cron**: job scheduling (at/every/cron), firing, cancellation, store persistence
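Schedule math of this kind can be unit-tested without a real clock. A self-contained sketch with a hypothetical `next_run_every` helper (nanobot's actual cron internals may differ):

```python
from datetime import datetime, timedelta

def next_run_every(last_run: datetime, interval_s: int) -> datetime:
    # Stand-in for an "every N seconds" schedule: the next firing is
    # simply the previous one advanced by the interval.
    return last_run + timedelta(seconds=interval_s)

def test_every_schedule_advances_by_interval() -> None:
    t0 = datetime(2024, 1, 1, 12, 0, 0)
    assert next_run_every(t0, 60) == datetime(2024, 1, 1, 12, 1, 0)
    assert next_run_every(t0, 3600) == datetime(2024, 1, 1, 13, 0, 0)

test_every_schedule_advances_by_interval()
```

Keeping time computation in a pure function like this is what lets the cron tests avoid sleeping.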
### CLI

- **Command parsing**: Typer command registration
- **Input handling**: terminal input, exit commands
## What Is NOT Tested (Known Gaps)

- **End-to-end flows**: no integration tests that run the full gateway with real LLM providers
- **Channel connectivity**: no tests for actual WebSocket/API connections (would require external services)
- **LLM response quality**: no tests for actual LLM output (would require API keys)
- **Concurrent access**: no stress tests for simultaneous messages from multiple channels
- **Provider-specific behavior**: no tests for LiteLLM prefixing, env var injection, or gateway detection
## Adding New Tests

### Naming Convention

```
tests/test_{module_or_feature}.py
```
### Template

```python
"""Tests for {feature}."""
import pytest

from nanobot.module import SomeClass


def test_descriptive_behavior_name() -> None:
    """Test that SomeClass does X when Y."""
    obj = SomeClass()
    result = obj.method()
    assert result == expected


@pytest.mark.asyncio
async def test_async_behavior(tmp_path) -> None:
    """Test async operation with filesystem isolation."""
    obj = SomeClass(workspace=tmp_path)
    result = await obj.async_method()
    assert result is not None
```
### Best Practices

- **Keep tests fast**: no network calls, no sleeping, no heavy I/O
- **Use `tmp_path`**: never write to real filesystem paths
- **Minimal test doubles**: simple classes that return pre-configured data
- **One assertion focus**: each test should verify one specific behavior
- **Descriptive names**: `test_{what_is_tested}_{condition}_{expected_result}`