OpenHands Sentinel — Real-Time Code Quality Guardian for AI Agents

Quick Start

Two integration paths. Zero configuration.

① As an OpenHands Hook

openhands_config.py

from openhands_sentinel import Sentinel
from agent_vcr import VCRRecorder

recorder = VCRRecorder()
sentinel = Sentinel(recorder=recorder)

# Attach to any OpenHands runtime
# (`runtime` is the runtime instance your setup already creates)
sentinel.attach(runtime.event_stream)
# Done. Every FileWrite and FileEdit is now guarded.

② Standalone CLI

terminal

# Install
$ pip install ai-agent-vcr

# Scan any directory
$ sentinel scan ./my-project

# With strict thresholds
$ sentinel scan ./src --max-complexity 6

# Output: per-file report + audit .vcr

Architecture

Four components. One feedback loop. Zero external calls.

agent_vcr/integrations/

🔌 VCRRuntime

Hooks into the OpenHands EventStream. Converts FileWrite + FileEdit events into analysis targets. Includes oversized-frame guardrails (issue #7402).

→

openhands_sentinel/

🛡️ Sentinel

The orchestrator. Receives file events, dispatches to CodeAnalyzer, decides severity, generates agent warnings, and records VCR frames.

→

openhands_sentinel/

🔬 CodeAnalyzer

Python AST-based analysis engine. Trajectory-aware: tracks function definitions across all files in the session to detect cross-file duplicates.

→

agent_vcr/

📼 VCRRecorder

Records every analysis result as a VCR frame. Creates a time-travel-debuggable audit trail of the agent's quality trajectory.

↻

The loop: Event → Analyze → Warn → Agent Self-Corrects → Event → Analyze → Clean ✓
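The loop can be pictured with a toy model. This is a minimal sketch, not Sentinel's implementation: `FakeAgent`, `check`, and `run_loop` are hypothetical names, and the only rule applied is a file-length check standing in for the real analysis.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAgent:
    """Hypothetical stand-in for an agent that rewrites a file when warned."""
    drafts: list                      # successive versions of one file
    warnings: list = field(default_factory=list)


def check(source: str, max_lines: int = 40) -> list:
    """Toy analysis step: flag the draft if it is longer than max_lines."""
    n = len(source.splitlines())
    return [f"file is {n} lines (max: {max_lines})"] if n > max_lines else []


def run_loop(agent: FakeAgent) -> int:
    """Event -> Analyze -> Warn -> next draft, until a draft comes back clean.

    Returns the 1-based step at which the file was clean, or -1 if never.
    """
    for step, draft in enumerate(agent.drafts, start=1):
        violations = check(draft)
        if not violations:
            return step
        agent.warnings.extend(violations)   # the "Warn" part of the loop
    return -1


agent = FakeAgent(drafts=["x = 1\n" * 100, "x = 1\n" * 10])
clean_at = run_loop(agent)
```

Here the first draft triggers one warning and the second draft is clean, which mirrors the Event → Analyze → Warn → Self-Correct → Clean sequence.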

How It Works

End-to-end flow of a single agent session with Sentinel.

🤖 Agent Action (FileWrite event) → 🔌 VCRRuntime (intercepts event) → 🛡️ Sentinel (dispatches analysis) → 🔬 CodeAnalyzer (AST inspection) → 📼 VCR Frame (recorded to .vcr/)

sentinel_session_output.log

━━━━━━━━━ SENTINEL SESSION ━━━━━━━━━

▶ STEP 1  Agent writes auth/utils.py
  🛡️ Analyzing auth/utils.py...
  Functions found: hash_password, verify_token, generate_salt
  ✓ CLEAN — No violations detected
  📼 Recorded as frame #1

▶ STEP 2  Agent writes api/handlers.py
  🛡️ Analyzing api/handlers.py...
  Functions found: handle_auth_request, hash_password

  ╔══ VIOLATIONS DETECTED ══════════════════════════╗
  ║ CRITICAL  hash_password() duplicates auth/utils.py:8
  ║ CRITICAL  handle_auth_request() is 109 lines (max: 40)
  ║ CRITICAL  Cyclomatic complexity: 32 (max: 8)
  ║ WARNING   Parameter count: 9 (max: 5)
  ╚═════════════════════════════════════════════════╝

  → Injecting MessageAction into EventStream...
  → Agent acknowledges warning
  📼 Recorded as frame #2 (with violations)

▶ STEP 3  Agent rewrites api/handlers.py
  🛡️ Re-analyzing api/handlers.py...
  Functions found: handle_auth_request  (imports hash_password)

  ✓ CLEAN — All 4 violations resolved!
  📼 Recorded as frame #3

━━━━━━━━━ SESSION COMPLETE ━━━━━━━━━
📼 Full audit trail: .vcr/sentinel-session.vcr
📊 Quality: 4 violations caught → 4 self-corrected → 0 remaining

What Sentinel Catches

Five classes of code quality violations, detected via Python's built-in AST module, plus a frame-size guardrail in VCRRuntime.

CRITICAL

Duplicate Functions

Cross-file detection. When an agent defines hash_password() in a new file but it already exists in auth/utils.py, Sentinel catches it instantly. Trajectory-aware — tracks all definitions across the session.
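A session-wide registry is one way to get this behavior. The sketch below is an assumption about the approach, not Sentinel's code: `FunctionRegistry` is a hypothetical name, and it matches functions by name only (a real check might also compare bodies or signatures).

```python
import ast


class FunctionRegistry:
    """Hypothetical session-wide map of function name -> first definition."""

    def __init__(self):
        self._seen = {}  # name -> (filepath, lineno)

    def check_and_register(self, filepath, source):
        """Return (name, original_file, original_line) for each duplicate."""
        duplicates = []
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                first = self._seen.get(node.name)
                if first is not None and first[0] != filepath:
                    duplicates.append((node.name, first[0], first[1]))
                else:
                    # New name, or the same file being rewritten.
                    self._seen[node.name] = (filepath, node.lineno)
        return duplicates


registry = FunctionRegistry()
first = registry.check_and_register(
    "auth/utils.py", "def hash_password(p):\n    return p\n"
)
second = registry.check_and_register(
    "api/handlers.py",
    "def handle():\n    pass\n\ndef hash_password(p):\n    return p\n",
)
```

Note that `ast.walk` also visits methods and nested functions, so this sketch would treat two classes with a same-named method as duplicates. A stricter version would need scoping rules.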

CRITICAL

Complexity Spikes

Cyclomatic complexity analysis via AST branch counting. Default threshold: 8. A function whose if/else/for/while/try branches add up to a complexity of 32 gets flagged before the agent compounds the problem.
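Branch counting over the AST can be sketched in a few lines. The set of node types below is an assumption; Sentinel's exact counting rules are not documented here, so scores from this sketch may differ from its reports.

```python
import ast

# Node types treated as one extra decision point each (an assumed set).
BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def cyclomatic_complexity(func):
    """Start at 1 and add one per branch point found inside the function."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit paths.
            score += len(node.values) - 1
    return score


source = """
def route(x, y):
    if x and y:
        return 1
    for i in range(x):
        while y:
            y -= 1
    return 0
"""
func = ast.parse(source).body[0]
score = cyclomatic_complexity(func)
```

In this example the base path, the `if`, the `and`, the `for`, and the `while` each contribute one, giving a score of 5.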

CRITICAL

Monolithic Functions

Detects functions that exceed the length threshold (default: 40 lines per function). AI agents routinely generate 100+ line handlers. Sentinel flags them so the agent decomposes the function.
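Function length falls out of the line numbers the AST already carries. A minimal sketch, assuming length is counted from the `def` line to the last line of the body (`long_functions` is a hypothetical helper, and `end_lineno` requires Python 3.8+):

```python
import ast


def long_functions(source, max_lines=40):
    """Return (name, length) for every function longer than max_lines."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Decorators are not counted: lineno points at the `def` line.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                found.append((node.name, length))
    return found


source = "def big():\n" + "    x = 1\n" * 45 + "\ndef small():\n    return 1\n"
flagged = long_functions(source)
```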

WARNING

Parameter Overload

Functions with more than 5 parameters (configurable). A symptom of god-functions that try to do everything. The warning prompts the agent to use config objects or decompose.
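Counting parameters from the AST might look like the sketch below. Whether Sentinel counts `self`, `*args`, or `**kwargs` is not stated, so this version simply counts every declared parameter; `param_count` is a hypothetical helper.

```python
import ast


def param_count(func):
    """Count every declared parameter, including *args and **kwargs."""
    a = func.args
    count = len(a.posonlyargs) + len(a.args) + len(a.kwonlyargs)
    count += int(a.vararg is not None) + int(a.kwarg is not None)
    return count


func = ast.parse("def f(a, b, /, c, *args, d, **kw):\n    pass\n").body[0]
n_params = param_count(func)
```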

WARNING

File Growth

Detects when a file grows by more than 50% in a single write. Prevents the "append-only" pattern where agents keep adding to the same file instead of creating new modules.
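The growth check reduces to a percentage comparison between the previous and the new content. This sketch assumes growth is measured in lines (it could equally be bytes) and that a first write has nothing to compare against; `growth_pct` is a hypothetical helper.

```python
def growth_pct(old: str, new: str) -> float:
    """Percentage growth in line count from old to new content."""
    old_lines = len(old.splitlines())
    new_lines = len(new.splitlines())
    if old_lines == 0:
        return 0.0  # first write of the file: no baseline to compare
    return (new_lines - old_lines) * 100.0 / old_lines


pct = growth_pct("a\n" * 10, "a\n" * 16)
exceeds_default = pct > 50  # 50 is the documented default threshold
```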

INFO

Frame Size Guardrails

VCRRuntime detects oversized recording frames (the OpenHands issue #7402 pattern) and warns before they pollute the audit trail. Keeps recordings fast and lean.
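A size guardrail of this kind can be approximated by measuring the serialized frame. Everything here is illustrative: the 1,000,000-byte limit, the JSON serialization, and the `frame_size_warning` name are assumptions, not values taken from VCRRuntime.

```python
import json

MAX_FRAME_BYTES = 1_000_000  # hypothetical limit, chosen only for this sketch


def frame_size_warning(frame: dict, limit: int = MAX_FRAME_BYTES):
    """Return a warning string if the serialized frame exceeds limit."""
    size = len(json.dumps(frame).encode("utf-8"))
    if size > limit:
        return f"frame is {size} bytes (limit: {limit})"
    return None


small = frame_size_warning({"file": "auth/utils.py", "violations": []})
big = frame_size_warning({"content": "x" * 2_000_000})
```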

API Reference

Three classes. Minimal surface area. Maximum impact.

CodeAnalyzer(config=None)

The AST-based analysis engine. Stateful — maintains a cross-file function registry for duplicate detection.

config.max_function_length — Max lines per function (default: 40)
config.max_complexity — Max cyclomatic complexity (default: 8)
config.max_params — Max parameters per function (default: 5)
config.max_growth_pct — Max file growth % (default: 50)
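The four thresholds map naturally onto a small immutable config object. The dataclass below is a stand-in written for illustration (`ThresholdConfig` is a hypothetical name, not the package's `AnalysisConfig`); only the field names and defaults come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdConfig:
    """Illustrative mirror of the documented thresholds and defaults."""
    max_function_length: int = 40
    max_complexity: int = 8
    max_params: int = 5
    max_growth_pct: float = 50.0


defaults = ThresholdConfig()
# Stricter settings, matching the CLI example's custom thresholds.
strict = ThresholdConfig(max_complexity=6, max_function_length=30, max_params=4)
```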

analyzer.analyze_code(filepath, content)

Parses Python source, runs all checks, returns AnalysisResult with violations list and file metrics.

analyzer.register_file(filepath, content)

Registers function definitions from a file into the cross-file registry. Called automatically on first analysis.

Sentinel(recorder=None, config=None)

The orchestrator. Connects CodeAnalyzer to VCRRecorder and the OpenHands EventStream.

recorder — VCRRecorder instance for audit trail
config — AnalysisConfig for thresholds

sentinel.attach(event_stream)

Subscribes to an OpenHands EventStream. Automatically intercepts FileWrite and FileEdit actions.

sentinel.analyze_file(filepath, content)

Manually trigger analysis on a specific file. Returns AnalysisResult.

sentinel.generate_warning(result)

Generates a formatted warning string suitable for injecting as a MessageAction into the agent's stream.
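To make the idea concrete, here is one way a warning string could be assembled from a list of violations. The `Violation` dataclass, the `format_warning` function, and the output layout are all hypothetical; the real `generate_warning` output may look different.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    """Hypothetical violation record: a severity label plus a message."""
    severity: str
    message: str


def format_warning(filepath, violations):
    """Build a plain-text warning, most severe violations first."""
    if not violations:
        return ""
    order = {"CRITICAL": 0, "WARNING": 1, "INFO": 2}
    lines = [f"Sentinel found {len(violations)} violation(s) in {filepath}:"]
    for v in sorted(violations, key=lambda v: order.get(v.severity, 99)):
        lines.append(f"  {v.severity:<9} {v.message}")
    return "\n".join(lines)


text = format_warning(
    "api/handlers.py",
    [
        Violation("WARNING", "Parameter count: 9 (max: 5)"),
        Violation("CRITICAL", "Cyclomatic complexity: 32 (max: 8)"),
    ],
)
```

Sorting by severity puts the items the agent should fix first at the top of the message.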

CLI Reference

Run Sentinel standalone from any terminal.

terminal

# Basic scan
$ sentinel scan ./src
🛡️ Scanning ./src ...
  auth/utils.py         CLEAN ✓
  api/handlers.py       3 violations
  models/user.py        CLEAN ✓
  services/payment.py   1 warning

Summary: 4 files scanned | 3 critical | 1 warning
📼 Audit trail: .vcr/sentinel-scan.vcr

# Custom thresholds
$ sentinel scan ./src \
    --max-complexity 6 \
    --max-length 30 \
    --max-params 4

# JSON output for CI pipelines
$ sentinel scan ./src --format json | jq '.violations'
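In a CI job, the JSON report could be turned into a pass/fail exit code with a few lines of Python. The only schema detail assumed is the top-level `violations` key implied by the jq example above; the shape of each entry is unknown, so this sketch only checks whether the list is empty. `gate` is a hypothetical helper.

```python
import json


def gate(report_json: str) -> int:
    """Return a process exit code: 1 if any violations are reported, else 0."""
    report = json.loads(report_json)
    violations = report.get("violations", [])  # assumed top-level key
    return 1 if violations else 0


clean_code = gate('{"violations": []}')
failing_code = gate('{"violations": [{"severity": "CRITICAL"}]}')
```

A wrapper script would read the report from standard input and pass the result of `gate` to `sys.exit`.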

Why Not Just Use a Critic LLM?

The OpenHands team is building a verification stack with critic models. Here's why Sentinel complements — not replaces — that approach.

Metric	Critic LLM	Sentinel
Latency per check	2-8 seconds	<2ms
Cost per check	$0.003 - $0.02	$0.00
Requires internet	Yes (API calls)	No
Deterministic	No (LLM variance)	Yes (AST-based)
Catches duplicates	Sometimes (limited context)	Always (trajectory-aware)
Catches logical bugs	Yes (LLM reasoning)	No (structural only)
Catches security issues	Yes (if prompted)	No (structural only)
Dependencies	API key + network	Python stdlib only
Enterprise-ready	Data leaves your infra	100% air-gapped

💡 The sweet spot

Run Sentinel as the first pass — instant, free, deterministic structural checks on every write. Use critic LLMs as a second pass for semantic/logical review on high-risk changes. Sentinel catches the low-hanging fruit at zero cost, so the critic model only runs on code that's already structurally sound.
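The two-pass arrangement amounts to a short-circuit: the expensive reviewer runs only when the cheap one finds nothing. A minimal sketch with stand-in callables (`review`, `too_long`, and `fake_critic` are hypothetical; no real critic API is called):

```python
from typing import Callable, List, Tuple


def review(
    source: str,
    structural_check: Callable[[str], List[str]],
    critic: Callable[[str], List[str]],
) -> Tuple[str, List[str]]:
    """Run the structural pass first; call the critic only on clean code."""
    findings = structural_check(source)
    if findings:
        return ("structural", findings)  # critic is skipped entirely
    return ("critic", critic(source))


critic_calls = []


def fake_critic(source: str) -> List[str]:
    critic_calls.append(source)  # record that the costly pass ran
    return []


def too_long(source: str) -> List[str]:
    return ["too long"] if len(source.splitlines()) > 40 else []


stage_bad, _ = review("x = 1\n" * 50, too_long, fake_critic)
stage_ok, _ = review("x = 1\n", too_long, fake_critic)
```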

File Map

Where everything lives in the codebase.

project structure

agent-vcr/
├── src/
│   ├── openhands_sentinel/           # NEW — The Sentinel package
│   │   ├── __init__.py               # Exports: Sentinel, CodeAnalyzer
│   │   ├── analyzer.py               # AST engine (314 lines)
│   │   ├── sentinel.py               # Orchestrator (325 lines)
│   │   ├── cli.py                    # CLI entry point
│   │   └── py.typed                  # PEP 561 marker
│   │
│   └── agent_vcr/
│       └── integrations/
│           ├── openhands.py          # Existing ACIDWorkspace
│           └── openhands_hook.py     # NEW — VCRRuntime hook
│
├── examples/
│   └── sentinel_demo.py              # Self-contained demo
│
├── docs/
│   ├── index.html                    # Main docs (updated)
│   └── sentinel/
│       └── index.html                # ← You are here
│
└── pyproject.toml                    # v0.6.0, sentinel CLI registered

Your agent's code quality guardian.
