v0.3.1 ยท Open Source ยท MIT License

Time-travel debugging
for AI agents.

Your agent failed on step 8 of 10. LangSmith shows you what happened.
Agent VCR lets you rewind, fix it, and resume from step 8.

<5ms overhead per step
JSONL git-friendly format
0 vendor lock-in
3 lines to integrate
The Problem

Debugging agents is a nightmare.

When your LangGraph or CrewAI agent fails on step 8 out of 10, existing tools only tell you what went wrong. To fix it, you re-run all 10 steps from scratch.

Every logic error costs minutes of wall time and dollars in wasted LLM tokens. At scale, this kills iteration speed entirely.

  • Re-run entire chain for every small fix
  • Can't inspect intermediate state without breakpoints
  • LangSmith / LangFuse are read-only observers
  • No way to test prompt changes mid-chain
The Solution

Rewind. Fix. Resume.

Agent VCR records your agent's complete state at every step into a local JSONL file. When something breaks, you jump straight to the failing frame.

Edit the state โ€” fix a bad prompt, inject corrected context, patch a tool output โ€” then resume execution forward from that exact point.

  • Jump to any frame instantly
  • Full state snapshot at every step
  • Edit state and resume mid-chain
  • Fork runs to compare variants

Quick Start

From zero to time-travel in under a minute.

01

Install

terminal
pip install ai-agent-vcr
02

Record your agent

agent.py
from agent_vcr import VCRRecorder

recorder = VCRRecorder()
recorder.start_session("debug_run")

# Your existing agent code โ€” unchanged
result = my_agent.run(query)

recorder.save()  # โ†’ .vcr/debug_run.vcr
03

Time-travel & fix

debug.py
from agent_vcr import VCRPlayer

player = VCRPlayer.load(".vcr/debug_run.vcr")

# Jump to the failing step
state = player.goto_frame(7)
print(state)          # Inspect full state

# Fix the bad state
state["prompt"] = "Corrected prompt"

# Resume from step 7 forward
player.resume(agent_callable, from_frame=7)

Everything you need to debug agents.

Built for the reality of multi-step agentic systems.

โฎ

Time Travel

Jump to any frame in a session instantly. Full input/output state snapshot at every node.

โœ

State Injection

Mutate the state at any frame โ€” fix a prompt, patch tool output, inject context โ€” then resume.

๐ŸŒฟ

Session Forking

Fork from any frame to create parallel runs. Compare how different fixes change downstream behavior.

๐Ÿ“ก

Live WebSocket Feed

Stream agent execution in real-time via the built-in FastAPI server. Watch every step as it happens.

๐Ÿ—‚

JSONL Storage

Sessions stored as plain JSONL. Human-readable, git-diffable, append-only, parseable line-by-line.

โšก

<5ms Overhead

P99 recording latency under 5ms. Benchmarked continuously in CI. Safe for production use.

๐Ÿ”„

Async Native

Full AsyncVCRRecorder and AsyncVCRPlayer. Zero blocking I/O, built for modern asyncio agents.

๐Ÿ–ฅ

Terminal TUI

Ship with a Textual TUI debugger. Run vcr in your terminal to browse sessions interactively.

๐Ÿ”Œ

Framework Agnostic

Native integrations for LangGraph and CrewAI. Decorator API for raw Python โ€” no framework required.

Integrations

Drop into any framework in one line.

langgraph_agent.py
from langgraph.graph import StateGraph
from agent_vcr import VCRRecorder
from agent_vcr.integrations.langgraph import VCRLangGraph

graph = StateGraph()
graph.add_node("planner", planner_node)
graph.add_node("coder", coder_node)

# Add VCR in one line
recorder = VCRRecorder()
graph = VCRLangGraph(recorder).wrap_graph(graph)

result = graph.invoke({"query": "Build a todo app"})
recorder.save()
crew_agent.py
from crewai import Crew
from agent_vcr import VCRRecorder
from agent_vcr.integrations.crewai import VCRCrewAI

recorder = VCRRecorder()
recorder.start_session("crew_run")

crew = Crew(agents=[researcher, writer], tasks=[...])

# Wrap and run โ€” recording is automatic
vcr_crew = VCRCrewAI(recorder)
result = vcr_crew.kickoff(crew)

recorder.save()
agent.py
from agent_vcr import VCRRecorder
from agent_vcr.integrations.langgraph import vcr_record

recorder = VCRRecorder()

# Decorate any function
@vcr_record(recorder, node_name="my_step")
def my_step(data: dict) -> dict:
    return process(data)

# Each call is automatically recorded
result = my_step({"key": "value"})
async_agent.py
from agent_vcr import AsyncVCRRecorder, AsyncVCRPlayer

recorder = AsyncVCRRecorder()
await recorder.start_session("async_run")

# Fully non-blocking recording
await recorder.record_step(
    node_name="fetch_context",
    input_state=query_state,
    output_state=result_state,
)

path = await recorder.save()

# Async time-travel
player = await AsyncVCRPlayer.load(path)
state = await player.goto_frame(3)

How does it compare?

Agent VCR is the only tool that lets you change what happened.

Feature Agent VCR LangSmith LangFuse Arize Phoenix
Record execution traces โœ“ โœ“ โœ“ โœ“
Time-travel to any step โœ“ โœ— โœ— โœ—
Edit state & resume โœ“ โœ— โœ— โœ—
Fork from any frame โœ“ โœ— โœ— โœ—
Self-hosted / local-first โœ“ Cloud only โœ“ โœ“
Git-friendly format โœ“ JSONL โœ— โœ— โœ—
Setup lines of code 3 ~15 ~10 ~10

API Reference

Minimal, predictable interfaces.

VCRRecorder Core
# Start a recording session
recorder.start_session(
    session_id: str = None,
    metadata: dict = None,
    tags: list[str] = None,
) -> Session

# Record one agent step
recorder.record_step(
    node_name: str,
    input_state: dict,
    output_state: dict,
    metadata: FrameMetadata = None,
) -> Frame

# Convenience recorders
recorder.record_llm_call(...)
recorder.record_tool_call(...)
recorder.record_error(...)

# Save & fork
recorder.save() -> Path
recorder.fork(from_frame: int) -> VCRRecorder
VCRPlayer Core
# Load a saved session
player = VCRPlayer.load(filepath: str)

# Time-travel
player.goto_frame(index: int) -> dict
player.get_frame(index: int) -> Frame

# Inspect
player.list_nodes() -> list[str]
player.get_errors() -> list[Frame]
player.compare_frames(a: int, b: int) -> dict

# Resume execution
player.resume(
    agent_callable: Callable,
    config: ResumeConfig,
) -> str

# Export
player.export_state(frame_index: int) -> dict
ResumeConfig Config
# Configure how replay works
ResumeConfig(
    from_frame: int,

    # Optional: override state before resume
    state_overrides: dict = {},

    # FORK: new session from this point
    # REPLAY: re-run same inputs
    # MOCK: use injected mock values
    mode: ResumeMode = FORK,

    # Skip specific nodes
    skip_nodes: list[str] = [],

    # Inject mocks for tool calls
    inject_mocks: dict = {},
)

Stop re-running your whole chain.

Install Agent VCR and start debugging from the exact frame that broke.

pip install ai-agent-vcr

MIT License ยท No signup required ยท Works offline