Metacognitive monitoring for AI self-evolution. Answers the question: is the agent actually growing, or just going through the motions?
The echo ecosystem gives an AI agent structured self-evolution — memory (recall-echo), a document pipeline (praxis-echo), daily reflection sessions. But none of these tools evaluate whether what the agent produces is genuine.
A reflection can use the right vocabulary, reach a conclusion, and update the right documents while being entirely mechanical — restating existing beliefs in new words without genuine engagement. praxis-echo ensures the pipeline flows. vigil-echo ensures what flows through is real.
The design constraint comes from metacognition research: even in humans, self-reported metacognitive monitoring is unreliable (Dunlosky & Metcalfe, 2009; Mazancieux et al., 2025). If humans can't reliably assess their own thinking quality, an AI agent shouldn't depend on self-assessment either. vigil-echo measures external behavioral signals from actual output — not self-reported quality.
vigil-echo extracts a signal vector from the agent's identity documents after each session, detects trends over a rolling window, and injects a cognitive health assessment at the start of the next session.
┌──────────────────────────┐
│ Agent's Session Output │
│ │
│ REFLECTIONS.md updated │
│ THOUGHTS.md evolved │
│ CURIOSITY.md new Qs │
└────────────┬─────────────┘
│
▼
┌──────────────────────────┐
│ vigil-echo collect │ ◄── SessionEnd hook
│ │
│ Extract 4 signals: │
│ vocabulary diversity │
│ question generation │
│ thought lifecycle │
│ evidence grounding │
│ │
│ Store → signals.json │
│ Analyze → analysis.json │
└──────────────────────────┘
─── next session starts ───
┌──────────────────────────┐
│ vigil-echo pulse │ ◄── PreToolUse hook
│ │
│ Read analysis.json │
│ Compute alert level │
│ Inject assessment: │
│ │
│ [VIGIL — Cognitive Health]│
│ Overall: HEALTHY │
│ 2 improving, 2 stable │
│ [END VIGIL] │
└──────────────────────────┘
Four signals computable with hand-rolled string processing — no NLP, no embeddings, no external APIs.
| Signal | What It Measures | How |
|---|---|---|
| vocabulary_diversity | Lexical variety in reflections | Type-token ratio across REFLECTIONS.md content |
| question_generation | Active curiosity | Count of open questions in CURIOSITY.md |
| thought_lifecycle | Thought turnover health | (graduated + dissolved) / total ratio in THOUGHTS.md |
| evidence_grounding | Concrete reference density | Entries with dates, attributions, or sources / total entries |
All signals are normalized to 0.0–1.0 (except question_generation, which is a count). Higher values generally indicate healthier thinking.
For each signal with 3+ data points:
recent = mean(last 3 vectors)
baseline = mean(previous 7 vectors)
delta = recent - baseline
DECLINING if delta < threshold
IMPROVING if delta > threshold
STABLE otherwise
| Level | Condition | Action |
|---|---|---|
| HEALTHY | All signals stable or improving | Brief status line |
| WATCH | 1-2 signals declining | Specific nudge per signal |
| CONCERN | 3+ signals declining | Detailed assessment |
| ALERT | Sustained decline across 7+ sessions | Flag for human review |
cargo install vigil-echogit clone https://github.com/dnacenta/vigil-echo.git
cd vigil-echo
cargo install --path .Initialize the metacognitive monitoring system. Creates the state directory, writes protocol rules, deploys default configuration, and registers hooks. Idempotent — safe to run multiple times.
vigil-echo init
│
├──▶ Create ~/.claude/vigil/
├──▶ Write config.json (default thresholds)
├──▶ Write signals.json (empty history)
├──▶ Deploy ~/.claude/rules/vigil-echo.md
└──▶ Register hooks in settings.json
PreToolUse → vigil-echo pulse
SessionEnd → vigil-echo collect
Extract a signal vector from the current state of identity documents. Appends to signal history, runs trend analysis, and updates analysis.json.
vigil-echo collect # default: manual trigger
vigil-echo collect --trigger session-end # called by SessionEnd hookvigil-echo collect
│
├──▶ Read REFLECTIONS.md, THOUGHTS.md, CURIOSITY.md
├──▶ Compute 4 signals
├──▶ Append vector to signals.json
├──▶ Run trend analysis → analysis.json
└──▶ Print signal summary
✓ Collected signal vector (session-end)
vocabulary_diversity: 0.44
question_generation: 6.00
thought_lifecycle: 0.00
evidence_grounding: 0.73
Inject a cognitive health assessment at session start. Reads the latest analysis and formats it for context injection. Includes a 60-second cooldown to avoid re-running within the same session.
[VIGIL — Cognitive Health]
Overall: HEALTHY | 2 improving, 2 stable, 0 declining
Highlight: evidence grounding trending up (+0.12)
Watch: thought_lifecycle at 0.00 — thoughts accumulating without resolution
[END VIGIL]
Manually run trend analysis on signal history. Normally called automatically by collect.
vigil-echo analyze # default: 10-session window
vigil-echo analyze --window 20 # custom window sizeCognitive health dashboard. Shows signal history, latest values, trend directions, alert level, and configuration.
vigil-echo — cognitive health dashboard
Signal History
12 data points (max 50)
First: 2026-02-27T23:13:33
Latest: 2026-02-28T22:00:05
Latest Signals
vocabulary_diversity: 0.44
question_generation: 6.00
thought_lifecycle: 0.33
evidence_grounding: 0.73
Analysis
Status: HEALTHY
1 improving, 3 stable, 0 declining
Trends
↑ evidence grounding 0.73 (+0.12)
→ vocabulary diversity 0.44 (+0.01)
→ question generation 6.00 (+0.00)
→ thought lifecycle 0.33 (+0.03)
Config
Window size: 10
Max history: 50
Cooldown: 60s
~/.claude/
│
├── rules/
│ └── vigil-echo.md ············ Protocol rules (auto-loaded into every session)
│
├── vigil/
│ ├── signals.json ·············· Signal vector history (last 50 collections)
│ ├── analysis.json ············· Latest trend analysis + alert level
│ ├── config.json ··············· Thresholds and calibration settings
│ └── pulse-state.json ·········· Cooldown tracking
│
└── settings.json ················· Hooks: PreToolUse + SessionEnd
Default thresholds are in ~/.claude/vigil/config.json:
{
"thresholds": {
"vocabulary_diversity": { "decline": -0.05, "improve": 0.05 },
"evidence_grounding": { "decline": -0.10, "improve": 0.10 },
"question_generation": { "decline": -1.0, "improve": 1.0 },
"thought_lifecycle": { "decline": -0.10, "improve": 0.10 }
},
"window_size": 10,
"max_history": 50,
"alert_after_sessions": 7,
"cooldown_seconds": 60
}These are educated guesses. They need tuning after 2-3 weeks of real data. Adjust them based on your agent's actual signal ranges.
vigil-echo is part of the echo ecosystem — a set of composable tools for AI self-evolution:
| Module | Purpose | Repo |
|---|---|---|
| recall-echo | Persistent three-layer memory | |
| praxis-echo | Document pipeline enforcement | source |
| vigil-echo | Metacognitive monitoring | you are here |
| voice-echo | Voice interface (phone calls) | source |
| bridge-echo | HTTP bridge for Claude CLI | |
| discord-voice-echo | Discord voice channel sidecar | source |
All three monitoring tools coexist in the same settings.json hooks:
PreToolUse → recall-echo consume → praxis-echo pulse → vigil-echo pulse
PreCompact → recall-echo checkpoint → praxis-echo checkpoint
SessionEnd → recall-echo promote → praxis-echo review → vigil-echo collect
- 4 additional signals: conclusion_novelty, comfort_index, cross_pollination, position_delta
- N-gram comparison for novelty detection
- Threshold calibration from Phase 1 data
- Weekly
reportsubcommand
- Feedback loop into self-evolution workflow prompts
- Historical correlation: which inputs produce the most genuine thinking
- Discord alerts for CONCERN and ALERT levels
- Call-human trigger for sustained cognitive decline