Architecture Overview

The Defining Principle

Harness-First Engineering — the harness isn't an afterthought; it's the foundation.

In the rapidly evolving landscape of AI-assisted software engineering, the distinction between what the model knows and what the system guarantees has become the defining architectural question. This project embraces a fundamental principle: the model provides intelligence, the harness provides reliability.

This is not merely a division of labor — it is an architectural constraint that shapes every design decision, from the permission system that guards dangerous operations to the context management that maximizes the utility of limited context windows.

Five Architectural Advantages

a. Harness-First Engineering

When building an AI agent system that operates with the same privileges as its user, reliability cannot be layered on top — it must be embedded from the first line of code.

Permission control, timeout protection, output truncation, and session persistence aren't features — they're the safety net that makes AI agents production-ready. The model can hallucinate, misinterpret instructions, or choose the wrong tool. But the harness must never permit a destructive operation without authorization, must never hang indefinitely, and must never lose the user's work.

Property	Guarantee
Permission Control	No destructive operation executes without user consent or policy approval
Timeout Protection	Misbehaving tools (runaway grep, infinite loop) cannot hang the session
Output Truncation	Single tool response cannot consume the entire context window
Session Persistence	Agent resumes after interruption without losing conversation history

b. Context Window as Scarce Resource

Modern LLMs offer generous context windows, but "generous" is a relative term when representing an entire codebase, multiple files, and ongoing conversation history. This architecture treats context as a scarce resource to be managed, not a luxury to be spent freely.

Lazy Tool Loading: Tools are registered in a central registry; only tool definitions (name, description, input schema) are sent to the model — not their implementations
250-Character Skill Descriptions: Force discipline in how capabilities are communicated to the model
Three-Tier Compaction: Messages retained in full until threshold, then compressed into summaries that preserve key information while reducing token count

c. Layered Permission Defense

Trust is not binary, and neither is authorization. This system implements a three-tier permission model that reflects real-world security thinking:

Level	Capability	Use Case
ReadOnly	File reading, globbing, grep searching	Safe exploration of codebases
WorkspaceWrite	File creation, modification, editing	Controlled development work
DangerFullAccess	Shell execution, deletions, network operations	Full automation with explicit approval

The permission system doesn't merely check a level — it evaluates each operation against glob rule matching to determine whether specific paths or operations are permitted. A policy might allow editing files in /src/ while blocking modifications to /config/ or /tests/.

Session memory means the system remembers approval decisions within a session, preventing repetitive prompts for the same operation. A human-in-the-loop approval mode exists for operations requiring explicit confirmation before execution — critical for destructive operations or shell commands that could compromise the system.

d. Local-First Execution

This system runs as a pure Go binary with zero runtime dependencies. There is no Python environment to configure, no Node.js runtime to maintain, no virtual machine to manage. The agent executes in the full environment of the host machine, with access to all system resources, environment variables, and installed tooling.

This local-first approach eliminates virtualization overhead and provides maximum performance. The agent can invoke the same build tools, linters, and test runners that developers use daily. There is no abstraction layer between the model's intent and the system's actual capabilities.

e. MCP as Universal Bridge

The Model Context Protocol (MCP) is not a closed ecosystem — it is an interoperability hub. This system treats MCP as a bridge to external tool ecosystems, enabling the agent to discover and utilize capabilities beyond its built-in set.

MCP servers can be configured to provide specialized tooling: database access, API integrations, cloud service management, or domain-specific analyzers. The agent treats MCP tools the same as built-in tools — with full permission enforcement and the same execution guarantees.

Agent Loop: State Machine

┌──────────┐  tool_use  ┌──────────────┐
│ REQUEST  │───────────▶│ TOOL_EXEC    │
│          │◀───────────│              │
└────┬─────┘  result    └──────┬───────┘
     │                         │
end_turn│                 error│
     ▼                         ▼
┌──────────┐           ┌──────────────┐
│ TERMINATE│           │ CONTINUE     │
└──────────┘           └──────────────┘

State	Transition	Condition
REQUEST	→ TOOL_EXEC	Model invokes tool (tool_use)
REQUEST	→ TERMINATE	Task complete (end_turn)
TOOL_EXEC	→ CONTINUE	Tool succeeded, continue loop
TOOL_EXEC	→ CONTINUE	Tool failed with recoverable error
TOOL_EXEC	→ TERMINATE	Non-recoverable failure

Design Decisions

Decision	Rationale	Trade-off
Go Implementation	Static linking, single-binary deployment, predictable memory usage	Less flexible than interpreted languages
Three-Tier Permission	Binary allow/deny insufficient for varied development workflows	More complex configuration, enables practical use
Glob Rule Matching	Fine-grained control over which files can be modified	Requires users to understand glob patterns
250-Character Tool Descriptions	Forces discipline in capability communication; prevents context bloat	May omit useful tool details
MAX_TURNS = 50	Prevents resource exhaustion from infinite loops	Complex tasks may need more iterations
Three-Tier Message Compaction	Preserves semantic content while managing token limits	Compression is lossy; some context may be lost
MCP as External Adapter	Protocol-agnostic tool discovery; ecosystem integration	MCP servers add external dependencies

Target Audience

This documentation is written for:

AI Agent Architecture Researchers — examining how harness design influences agent capabilities
Senior Engineers evaluating Harness-First engineering patterns for production systems
Tech Leads seeking reference implementations for production-grade AI tooling

The emphasis throughout is on why rather than what — the code reveals implementation details, but these pages explain the reasoning behind them.

Design Philosophy — Deep dive into the intelligence/reliability separation
Agent Loop — State machine mechanics and history management
Tool System — Built-in tools and extension mechanisms
Configuration Guide — Permission policies and model settings

Architecture Overview ​

The Defining Principle ​

Five Architectural Advantages ​

a. Harness-First Engineering ​

b. Context Window as Scarce Resource ​

c. Layered Permission Defense ​

d. Local-First Execution ​

e. MCP as Universal Bridge ​

Agent Loop: State Machine ​

Design Decisions ​

Target Audience ​

Related Documentation ​