agent

Updated 10/9/2025

Custom VAD for Agent Barging

Enable a human-like conversational flow by letting the user interrupt the AI agent at any time.

Minimal, primarily hosting costs

Sub-50ms interrupt time

The Problem

The system must be sensitive enough to detect the moment a human customer begins speaking ('barging in') while the AI agent is talking, so that the agent can be interrupted immediately.

Expected Outcome

A human-like conversational flow in which the user can interrupt the AI agent at any time.

Tool Chain

Implementation Steps

  1. Audio Stream Monitoring

    Monitor the customer's audio stream independently of the agent's output.

    Tool: LiveKit
  2. Voice Activity Detection

    A VAD service processes the stream to detect the exact start and end points of human speech.

  3. Agent Interruption Signal

    A signal is sent to Retell (or custom agent logic) to immediately stop TTS playback and switch to ASR.

    Tool: Retell AI
  4. Agent Response

    The agent processes the customer's interruption and generates a relevant response.
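
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the LiveKit or Retell AI API: it assumes the customer's audio arrives as 10 ms frames of 16-bit mono PCM, substitutes a simple energy-threshold VAD for whatever VAD service is actually used, and treats `stop_tts` as a hypothetical callback standing in for the interruption signal sent to the agent. The names `EnergyVAD`, `BargeInController`, `tone_frame`, and every threshold value are assumptions chosen for the example.

```python
import math
import struct
from dataclasses import dataclass, field
from typing import Callable, List, Optional

FRAME_MS = 10  # assumed frame size; shorter frames allow faster detection


def frame_rms(pcm: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit little-endian mono PCM frame."""
    n = len(pcm) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def tone_frame(amplitude: int, sample_rate: int = 16000) -> bytes:
    """Build one synthetic frame (a square wave) for trying out the sketch."""
    n = sample_rate * FRAME_MS // 1000
    return struct.pack(f"<{n}h", *([amplitude, -amplitude] * (n // 2)))


@dataclass
class EnergyVAD:
    """Toy energy-threshold VAD that reports speech start and end points."""
    threshold: float = 500.0   # assumed RMS threshold; tune per deployment
    onset_frames: int = 2      # consecutive loud frames that mark speech start
    hangover_frames: int = 30  # consecutive quiet frames that mark speech end
    speaking: bool = False
    _loud: int = 0
    _quiet: int = 0

    def process(self, pcm: bytes) -> Optional[str]:
        """Feed one frame; return "start", "end", or None."""
        if frame_rms(pcm) >= self.threshold:
            self._loud += 1
            self._quiet = 0
        else:
            self._quiet += 1
            self._loud = 0
        if not self.speaking and self._loud >= self.onset_frames:
            self.speaking = True
            return "start"
        if self.speaking and self._quiet >= self.hangover_frames:
            self.speaking = False
            return "end"
        return None


@dataclass
class BargeInController:
    """Watches the customer stream independently of the agent's output."""
    vad: EnergyVAD
    stop_tts: Callable[[], None]  # hypothetical hook: halt playback, resume ASR
    agent_speaking: bool = False
    events: List[str] = field(default_factory=list)

    def on_customer_frame(self, pcm: bytes) -> None:
        event = self.vad.process(pcm)
        if event is None:
            return
        self.events.append(event)
        # Barge-in: the customer started talking while the agent was talking.
        if event == "start" and self.agent_speaking:
            self.stop_tts()
            self.agent_speaking = False
```

With these assumed settings, speech onset is declared after two consecutive 10 ms frames, so detection itself adds roughly 20 ms before `stop_tts` fires; network transport and the time the player takes to halt are not modelled here. A production setup would likely replace the energy threshold with a trained VAD model, since raw energy also triggers on background noise.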

Alternatives

Step 3

Has robust built-in barging/interruption handling.

Cost Impact: N/A
