Agent Harness

A generic, configurable harness for long-running autonomous coding agents. Built on the Claude Agent SDK, it implements Anthropic's guide for effective agent harnesses, featuring phase-driven execution, configurable MCP tools, and SDK-native sandbox isolation.

Features

Agent Harness provides:

Phase-based workflows — declarative phase definitions with conditions and run-once semantics
TOML-based configuration — no code changes needed to customize behavior
SDK-native security — OS sandbox with network isolation, declarative permission rules (allow/deny), secure defaults
Progress tracking — JSON checklist, notes file, or none with automatic completion detection
Error recovery — exponential backoff and circuit breaker to prevent runaway costs
MCP server support — browser automation, databases, etc.
Session persistence — auto-continue across sessions with state tracking
Setup verification — check auth, tools, config before running

Prerequisites

Python 3.10+
uv package manager

Getting Started

→ For a detailed step-by-step guide, see docs/setup-guide.md

1. Install

git clone <repo-url>
cd claude-agent-harness
uv tool install .

2. Set up authentication

Export one of these environment variables:

ANTHROPIC_API_KEY — get one from console.anthropic.com
CLAUDE_CODE_OAUTH_TOKEN — via claude setup-token

See .env.example for all options.

Using 1Password CLI

If you manage secrets with 1Password CLI, create a .env file with an op:// reference:

ANTHROPIC_API_KEY="op://Vault/Item/api_key"

Then wrap any command with op run:

op run --env-file "./.env" -- agent-harness run --project-dir ./my-project

3. Run

# Scaffold a new project configuration
agent-harness init --project-dir ./my-project

# Edit the configuration
#    -> ./my-project/.agent-harness/config.toml

# Verify setup
agent-harness verify --project-dir ./my-project

# Run the agent
agent-harness run --project-dir ./my-project

CLI Reference

agent-harness init --project-dir <path>     # Scaffold new config
agent-harness verify --project-dir <path>   # Check setup
agent-harness run --project-dir <path>      # Run the agent

# Info commands (for skill/automation integration)
agent-harness info template [--name NAME] [--list] [--all] [--json]
agent-harness info schema [--json]
agent-harness info preset [--name NAME] [--list] [--json]
agent-harness info guide [--json]

# Options
--project-dir PATH      # Agent's working directory (default: .)
--max-iterations N      # Override max iterations (run only)
--model MODEL           # Override model (run only)
--version               # Show version

Info Commands

The info subcommand provides programmatic access to templates, configuration schema, presets, and documentation. These commands are designed for integration with Claude Code skills and other automation tools:

info template — Get template file contents (config.toml, spec.md, init.md, build.md)
- Use --all to fetch all templates with content in one command
info schema — Get JSON schema describing all configuration options
info preset — Get preset configurations for common use cases (python, web-nodejs, read-only)
info guide — Get the complete setup guide as structured JSON or markdown

All commands support --json flag for machine-readable output.
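
For automation that consumes this output, a thin wrapper can be sketched as follows. It uses only the subcommands and flags listed above; the helper names `info_argv` and `fetch_info` are hypothetical, and the sketch assumes the JSON document is written to standard output, which is not confirmed here.

```python
import json
import subprocess


def info_argv(kind, name=None):
    """Build the argument list for `agent-harness info <kind> ... --json`."""
    argv = ["agent-harness", "info", kind]
    if name is not None:
        # --name is documented for `template` and `preset` only.
        argv += ["--name", name]
    return argv + ["--json"]


def fetch_info(kind, name=None):
    """Run the info command and parse its output (assumed: JSON on stdout)."""
    result = subprocess.run(
        info_argv(kind, name), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

For example, `fetch_info("preset", name="python")` would run `agent-harness info preset --name python --json` and return the parsed result; the shape of that result is whatever the CLI emits.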

Claude Code Skill

A Claude Code skill is available via the plugin system:

# Step 1: Add the marketplace
/plugin marketplace add <github-url-or-local-clone>

# Step 2: Install the plugin
/plugin install agent-harness@agent-harness-plugin

Once installed, invoke the skill in Claude Code:

/agent-harness                                  # Interactive mode selection
/agent-harness help me setup using ./spec.md    # Custom instruction

The skill provides:

Interactive interviews for project setup
Preset recommendations based on tech stack
Security-first configuration with clear trade-offs
Automated fixes for common configuration issues

How It Works

The harness executes agents in configurable phases with conditions and run-once semantics. Each phase gets a fresh Claude SDK session (no context carryover) with a configured prompt.

Phase execution:

Phases run sequentially based on conditions (exists:, not_exists: path checks)
run_once: true phases skip after first successful completion
State persists in .agent-harness/session.json
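
The selection rules above can be modeled roughly as below. This is a sketch under assumptions, not the harness's actual code: the `Phase` fields, the `select_phase` helper, the choice to resolve condition paths against the project directory, and the "first eligible phase wins" policy are all hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Phase:
    name: str
    run_once: bool = False
    # Each entry looks like "exists:<path>" or "not_exists:<path>".
    conditions: list[str] = field(default_factory=list)


def condition_met(condition: str, project_dir: Path) -> bool:
    """Evaluate a single path check relative to project_dir (assumed base)."""
    kind, _, rel_path = condition.partition(":")
    present = (project_dir / rel_path).exists()
    if kind == "exists":
        return present
    if kind == "not_exists":
        return not present
    raise ValueError(f"unknown condition: {condition!r}")


def select_phase(phases, completed, project_dir):
    """Return the first phase that is eligible to run, or None."""
    for phase in phases:
        if phase.run_once and phase.name in completed:
            continue  # run-once phase already finished successfully
        if all(condition_met(c, project_dir) for c in phase.conditions):
            return phase
    return None
```

Here `completed` stands in for the set of finished phase names that the harness keeps in `.agent-harness/session.json`.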

Session management:

Fresh context per session prevents context pollution
Progress preserved via tracking file (e.g., feature_list.json), session state, and git commits
Auto-continue after configured delay (default 3s)
Completion detection: Harness stops when tracker.is_complete() returns true (only json_checklist supports this; notes_file and none require manual stop via Ctrl+C)
Press Ctrl+C to pause; run same command to resume
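
To make the completion check concrete, here is a minimal sketch of what a JSON-checklist `is_complete` could look like. It assumes the tracking file is a JSON array of objects that each carry a boolean `passes` field; the real file schema and the real method's behavior on missing or empty files may differ.

```python
import json
from pathlib import Path


def is_complete(checklist_path: Path) -> bool:
    """True only when the checklist is non-empty and every item passes."""
    try:
        items = json.loads(checklist_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return False  # missing or unreadable checklist: treat as not done
    if not isinstance(items, list) or not items:
        return False  # assumed: an empty checklist is not "complete"
    return all(isinstance(item, dict) and item.get("passes") is True for item in items)
```

Treating an empty or missing file as incomplete is a deliberate choice in this sketch: it keeps the loop running until the first session has actually produced a checklist.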

Error recovery:

Prevents runaway API costs when sessions fail repeatedly:

Tracks consecutive errors across sessions
Exponential backoff: 5s → 10s → 20s → 40s → 80s (each delay capped at 120s); with the default limit of 5, the circuit breaker trips at the fifth consecutive error
Circuit breaker: Trips after 5 consecutive errors (configurable)
Successful session resets error counter
Error context forwarded to next session to help recovery

[error_recovery]
max_consecutive_errors = 5
initial_backoff_seconds = 5.0
max_backoff_seconds = 120.0
backoff_multiplier = 2.0
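
The arithmetic behind these settings can be sketched as follows. The function names are hypothetical and the harness's own implementation may differ in detail; the default values mirror the `[error_recovery]` block above.

```python
def backoff_delay(consecutive_errors, initial=5.0, multiplier=2.0, cap=120.0):
    """Seconds to wait after the Nth consecutive error (N starts at 1)."""
    if consecutive_errors < 1:
        return 0.0  # no errors yet, no delay
    return min(initial * multiplier ** (consecutive_errors - 1), cap)


def should_trip(consecutive_errors, max_consecutive_errors=5):
    """True once the error streak reaches the configured limit."""
    return consecutive_errors >= max_consecutive_errors


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, backoff_delay(n), should_trip(n))
```

With the defaults, the cap only matters if `max_consecutive_errors` is raised: a sixth error would otherwise compute 160s, which the cap reduces to 120s.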

Security Model

This harness follows Anthropic's secure deployment recommendations by relying on the SDK's built-in sandbox and permission system as the primary defense, rather than custom application-layer validation.

SDK-Native Sandbox

The Claude SDK provides process-level isolation with:

Process isolation — Bash commands run in a sandboxed subprocess
Network restrictions — Configurable domain allowlist and Unix socket access
Filesystem boundaries — Commands are restricted to the project directory

[security.sandbox]
enabled = true
auto_allow_bash_if_sandboxed = true
allow_unsandboxed_commands = false  # secure default

[security.sandbox.network]
# Example allowed domains (configure for your project needs)
allowed_domains = ["registry.npmjs.org", "github.com"]
allow_local_binding = false
allow_unix_sockets = []

Declarative Permission Rules

Security is enforced through SDK permission rules, not runtime command parsing:

[security.permissions]
allow = [
    "Bash(npm *)", "Bash(node *)", "Bash(git *)",
    "Bash(ls *)", "Bash(cat *)", "Bash(grep *)",
    "Read(./**)", "Write(./**)", "Edit(./**)",
]
deny = [
    "Bash(curl *)", "Bash(wget *)",
    "Read(./.env)", "Read(./.env.*)",
]

Permission rules are evaluated by the SDK before tool execution. The agent cannot bypass these rules through prompt injection or indirect command execution.
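
As a mental model only, rules of the form `Tool(pattern)` can be thought of as glob matches against the tool's argument. The sketch below is a toy model and not the SDK's matcher: it uses Python's `fnmatch`, assumes that a matching deny rule overrides any allow rule, and returns a made-up `"unmatched"` label when no rule applies.

```python
import re
from fnmatch import fnmatchcase

# "Bash(npm *)" -> tool "Bash", pattern "npm *"
RULE = re.compile(r"^(?P<tool>\w+)\((?P<pattern>.*)\)$")


def matches(rule: str, tool: str, argument: str) -> bool:
    """True if the rule names this tool and its glob matches the argument."""
    m = RULE.match(rule)
    return bool(m) and m["tool"] == tool and fnmatchcase(argument, m["pattern"])


def decide(tool, argument, allow, deny):
    """Toy decision: deny first (assumed precedence), then allow."""
    if any(matches(rule, tool, argument) for rule in deny):
        return "deny"
    if any(matches(rule, tool, argument) for rule in allow):
        return "allow"
    return "unmatched"
```

Under this model, `Read(./.env)` in `deny` blocks the file even though `Read(./**)` in `allow` also matches it, which is the effect the example configuration is aiming for.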

Secure Defaults

allow_unsandboxed_commands defaults to false
When sandbox is enabled, auto_allow_bash_if_sandboxed=true auto-allows Bash commands
When sandbox is disabled, explicit permissions.allow rules are required
Network access is denied by default

Git Protection Recommendations

For production deployments, protect critical branches using server-side git hooks or branch protection rules on your git hosting platform (GitHub, GitLab, Bitbucket), not client-side validation. This prevents destructive operations like git push --force at the source.

Configuration

Configuration lives in .agent-harness/config.toml.

Directory Layout

project_dir/
├── .agent-harness/
│   ├── logs/                  # Session logs (auto-created, gitignored)
│   ├── config.toml            # Main configuration (required)
│   ├── spec.md                # Project specification
│   ├── session.json           # Session number, completed phases (auto-created)
│   └── prompts/               # Prompt files (referenced by config)
│       ├── init.md
│       └── build.md
└── (generated code lives here)

Configuration Reference

For a complete, annotated configuration reference with detailed comments on all available options, see:

agent_harness/templates/config.toml - Template with full documentation
examples/claude-ai-clone/.agent-harness/config.toml - Real-world example

The init command creates a new config using the template:

agent-harness init --project-dir ./my-project

Config Loading Precedence

CLI flags > config.toml values > defaults
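
One way to picture this ordering is a layered lookup, sketched here with `collections.ChainMap`. The `resolve_settings` helper and the setting names are illustrative assumptions; only the precedence order itself comes from this README.

```python
from collections import ChainMap


def resolve_settings(cli_flags: dict, file_values: dict, defaults: dict) -> dict:
    """Merge settings so that CLI flags beat config.toml, which beats defaults."""
    # Flags the user did not pass arrive as None and must not mask lower layers.
    cli_set = {key: value for key, value in cli_flags.items() if value is not None}
    return dict(ChainMap(cli_set, file_values, defaults))
```

For instance, passing `--max-iterations 3` on the command line would override a `max_iterations` value from `config.toml`, while an omitted `--model` leaves the configured model in place.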

Project Structure

claude-agent-harness/
├── agent_harness/          # Python package
│   ├── __init__.py
│   ├── __main__.py         # Entry point
│   ├── cli.py              # Argument parsing, subcommands
│   ├── client_factory.py   # Builds ClaudeSDKClient from config
│   ├── config.py           # Config loading, validation, HarnessConfig
│   ├── runner.py           # Generic agent loop
│   ├── tracking.py         # Progress tracking implementations
│   └── verify.py           # Setup verification checks
├── examples/
│   ├── claude-ai-clone/
│   │   ├── .agent-harness/
│   │   │   ├── config.toml
│   │   │   ├── spec.md
│   │   │   └── prompts/
│   │   │       ├── init.md
│   │   │       └── build.md
│   │   └── README.md
│   └── simple-calculator/
│       ├── .agent-harness/
│       │   ├── config.toml
│       │   ├── spec.md
│       │   └── prompts/
│       │       ├── init.md
│       │       └── build.md
│       └── README.md
├── tests/
│   ├── test_cli.py
│   ├── test_client_factory.py
│   ├── test_config.py
│   ├── test_prompts.py
│   ├── test_runner.py
│   ├── test_tracking.py
│   └── test_verify.py
├── .env.example
└── pyproject.toml

Examples

Claude.ai Clone (Next.js)

See examples/claude-ai-clone/ for a complete example that:

Uses Next.js/React stack (npm, node commands)
Integrates Puppeteer MCP server for browser testing
Generates a production-quality chat interface
Tracks progress via feature_list.json

# Run the Claude.ai clone example
mkdir -p ./my-clone-output
cp -r examples/claude-ai-clone/.agent-harness ./my-clone-output/
agent-harness run --project-dir ./my-clone-output

Simple Calculator (Python)

See examples/simple-calculator/ for a minimal example that:

Uses Python stdlib only (no external dependencies)
Completes in ~5 minutes (good for demos)
Shows basic two-phase workflow (init + build)
Tracks progress via feature_list.json

Troubleshooting

"Configuration file not found"

The harness expects a .agent-harness/config.toml file in your project directory. If you see this error:

Check that you're running from the correct directory
Use agent-harness init --project-dir ./my-project to scaffold a new configuration

"Prompt file not found"

Check that all file: references in your config.toml point to files relative to the .agent-harness/ directory:

[[phases]]
prompt = "file:prompts/coding_prompt.md"  # Must exist at .agent-harness/prompts/coding_prompt.md
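
The lookup described above can be sketched like this. The `load_prompt` helper is hypothetical, and treating a value without the `file:` prefix as inline prompt text is an assumption rather than documented behavior.

```python
from pathlib import Path

FILE_PREFIX = "file:"


def load_prompt(value: str, project_dir: Path) -> str:
    """Resolve a prompt setting, reading file: references from .agent-harness/."""
    if not value.startswith(FILE_PREFIX):
        return value  # assumed: anything else is inline prompt text
    path = project_dir / ".agent-harness" / value[len(FILE_PREFIX):]
    if not path.is_file():
        raise FileNotFoundError(f"Prompt file not found: {path}")
    return path.read_text(encoding="utf-8")
```

Running this against your own project directory is a quick way to confirm that each `file:` reference resolves to the path you expect.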

"Neither ANTHROPIC_API_KEY nor CLAUDE_CODE_OAUTH_TOKEN is set"

You need authentication credentials to use the Claude API:

API Key: Get one from console.anthropic.com and set export ANTHROPIC_API_KEY="your-key"
OAuth Token: Run claude setup-token, then export the resulting token as CLAUDE_CODE_OAUTH_TOKEN so the harness can pick it up
See .env.example for setting these via environment file

Agent is hanging on the first session

The first session can take 10-20+ minutes for complex projects as it reads the spec, plans features, creates project structure, and sets up git. This is expected behavior. Subsequent sessions are typically faster.

If a session truly hangs:

Check .agent-harness/session.json for error messages
Look for permission prompts or security blocks in the output
Verify your progress file format matches the configuration (e.g., feature_list.json with "passes": false fields)

Development

# Install dependencies
uv sync

# Run all tests
uv run python -m unittest discover tests -v

# Run specific test modules
uv run python -m unittest tests.test_config -v         # Configuration loading
uv run python -m unittest tests.test_tracking -v       # Progress tracking
uv run python -m unittest tests.test_runner -v         # Session loop logic
uv run python -m unittest tests.test_client_factory -v # Client creation

Test coverage includes:

Security configuration: Sandbox settings, permission rules, network isolation
Configuration loading: TOML parsing, defaults, validation, error cases
Progress tracking: Completion detection, JSON parsing, print formatting
Prompt loading: File reading, file: resolution, error handling

License

MIT License. See LICENSE.
