Candidate Flow
Live Interview
Real-time AI interview via WebSocket audio streaming — architecture and candidate experience.
The live interview is a real-time audio conversation between the candidate and an AI interviewer powered by the Agno Interview Agent. Audio streams bidirectionally through the Vert.x WebSocket edge — candidate voice is transcribed, processed, and responded to in near real-time.
Page: /interview/:sessionId
The interview page connects to the Vert.x WebSocket edge on mount, streams the candidate's microphone audio, and plays back TTS audio responses from the AI agent.
| UI Element | Description |
|---|---|
| AI avatar | Animated agent indicator (speaking / listening state) |
| Candidate video | Local camera preview (candidate sees themselves) |
| Transcript panel | Real-time conversation transcript (optional, recruiter-configurable) |
| Audio level meter | Visual indicator of microphone input level |
| Mute button | Toggle microphone (does not end session) |
| End interview | Terminate session and publish interview.completed event |
WebSocket Connection
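// Connect to Vert.x edge after start call
const wsUrl = sessionStorage.getItem('interview_ws_url') ?? 'ws://localhost:8080';
const sessionId = params.sessionId;
const tenantId = sessionStorage.getItem('interview_tenant_id') ?? '';
// Declared with `let` because the reconnection logic reassigns the socket
let ws = new WebSocket(
  // Encode the IDs so reserved characters cannot break the query string
  `${wsUrl}?sessionId=${encodeURIComponent(sessionId)}&tenantId=${encodeURIComponent(tenantId)}`
);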
ws.onopen = () => {
  // Start streaming microphone audio
  startAudioCapture(ws);
};
ws.onmessage = (event) => {
  // Receive TTS audio blob from AI agent
  playAudioBlob(event.data);
};
Audio Format
| Stream | Format | Details |
|---|---|---|
| Candidate → Agent | 16 kHz PCM | Raw PCM audio chunks, ~100 ms each |
| Agent → Candidate | MP3 / OGG | TTS audio blob, played via HTMLAudioElement |
| Video (optional) | WebM chunks | Kafka: video.candidate.stream (30 s segments) |
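The candidate-to-agent format can be illustrated with a small conversion sketch. It is not the project's capture code: the function names are hypothetical, the 16-bit sample depth is an assumption (the table only states 16 kHz PCM), and the downsampler is a simple averaging filter rather than a production-grade resampler.

```typescript
// Target format from the table: 16 kHz PCM in chunks of roughly 100 ms.
const TARGET_SAMPLE_RATE = 16000;
const CHUNK_MS = 100;
const SAMPLES_PER_CHUNK = (TARGET_SAMPLE_RATE * CHUNK_MS) / 1000; // 1600 samples

// Downsample by averaging each group of input samples that maps to one output sample.
// Assumes inputRate >= TARGET_SAMPLE_RATE (browsers commonly capture at 44.1 or 48 kHz).
function downsample(input: Float32Array, inputRate: number): Float32Array {
  if (inputRate === TARGET_SAMPLE_RATE) return input;
  const ratio = inputRate / TARGET_SAMPLE_RATE;
  const outLength = Math.floor(input.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), input.length);
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    out[i] = end > start ? sum / (end - start) : 0;
  }
  return out;
}

// Convert floats in [-1, 1] to signed 16-bit integers (bit depth is an assumption).
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range input
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}
```

With a 48 kHz microphone, 4800 captured samples reduce to one 1600-sample chunk, and `floatTo16BitPcm(chunk).buffer` could then be passed to `ws.send`.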
Audio Pipeline
Candidate Browser
│
├── MediaStream (getUserMedia)
│ └── AudioContext → ScriptProcessorNode → 16kHz PCM
│
├── WebSocket → Vert.x Edge (ws://localhost:8080)
│ │
│ └── Buffers chunks, forwards to Agno Interview Agent (HTTP)
│ │
│ ├── Whisper transcription → candidate text
│ ├── LLM (Claude/GPT-4) → agent response text
│ └── TTS (ElevenLabs/OpenAI TTS) → audio blob
│
└── WebSocket ← Vert.x Edge ← Agent audio blob
│
└── HTMLAudioElement.play() → candidate hears AI response
Interview Completion
When the AI agent determines the interview is complete (all plan sections covered), or the candidate clicks End Interview, the Vert.x edge publishes the interview.completed Kafka event, which triggers the Assessor Agent.
| Trigger | Initiated by | State after |
|---|---|---|
| All plan sections covered | Agno Interview Agent | COMPLETED |
| Candidate ends session | Vert.x (via WebSocket close) | COMPLETED |
| Admin cancels | API Gateway | CANCELLED |
| Session timeout (2h) | Vert.x (timer) | COMPLETED |
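The table can be read as a lookup from trigger to final session state. The sketch below encodes it that way; the trigger identifiers, the `Outcome` shape, and the function names are hypothetical and chosen only for illustration, while the states and the 2-hour timeout come from the table.

```typescript
// Hypothetical trigger identifiers; the states mirror the table.
type CompletionTrigger =
  | 'PLAN_SECTIONS_COVERED'
  | 'CANDIDATE_ENDED'
  | 'ADMIN_CANCELLED'
  | 'SESSION_TIMEOUT';

type FinalState = 'COMPLETED' | 'CANCELLED';

interface Outcome {
  initiator: string;
  state: FinalState;
}

// Session timeout from the table: 2 hours.
const SESSION_TIMEOUT_MS = 2 * 60 * 60 * 1000;

const OUTCOMES: Record<CompletionTrigger, Outcome> = {
  PLAN_SECTIONS_COVERED: { initiator: 'Agno Interview Agent', state: 'COMPLETED' },
  CANDIDATE_ENDED: { initiator: 'Vert.x (via WebSocket close)', state: 'COMPLETED' },
  ADMIN_CANCELLED: { initiator: 'API Gateway', state: 'CANCELLED' },
  SESSION_TIMEOUT: { initiator: 'Vert.x (timer)', state: 'COMPLETED' },
};

// Look up which component ends the session and the state it ends in.
function resolveOutcome(trigger: CompletionTrigger): Outcome {
  return OUTCOMES[trigger];
}

// True once the session has been running for at least the timeout.
function hasTimedOut(startedAtMs: number, nowMs: number): boolean {
  return nowMs - startedAtMs >= SESSION_TIMEOUT_MS;
}
```

Keeping the mapping in a single `Record` means the compiler flags any trigger that is added to the union without a matching outcome.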
Reconnection
If the WebSocket disconnects (for example, during a network interruption), the frontend reconnects with exponential backoff: the delay starts at 1 second and doubles on each attempt, capped at 30 seconds. The server keeps the session active for 10 minutes after a disconnect.
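function reconnect(attempt: number) {
  // Exponential backoff: 1s, 2s, 4s, ... capped at 30s
  const delay = Math.min(1000 * Math.pow(2, attempt), 30000);
  setTimeout(() => {
    // Reconnect with the same session parameters as the initial connection
    ws = new WebSocket(
      `${wsUrl}?sessionId=${encodeURIComponent(sessionId)}&tenantId=${encodeURIComponent(tenantId ?? '')}`
    );
    ws.onopen = () => { /* resume session */ };
    // Reattach the playback handler; handlers do not carry over to a new socket
    ws.onmessage = (event) => playAudioBlob(event.data);
    ws.onerror = () => reconnect(attempt + 1);
  }, delay);
}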
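Because the server keeps the session for only 10 minutes, retrying beyond that window cannot resume it. The sketch below separates the delay calculation from the decision to stop. Giving up once the window has passed is an assumption about desirable client behaviour, not something the current frontend is documented to do, and the function and constant names are hypothetical.

```typescript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
// Server-side retention after a disconnect, as stated above: 10 minutes.
const SESSION_GRACE_MS = 10 * 60 * 1000;

// Same formula as reconnect(): the delay doubles per attempt and is capped at 30 s.
function backoffDelay(attempt: number): number {
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
}

// Returns the delay before the next attempt, or null when the attempt
// would start after the session grace window has expired.
function nextReconnectDelay(
  attempt: number,
  disconnectedAtMs: number,
  nowMs: number
): number | null {
  const delay = backoffDelay(attempt);
  const elapsedAfterWait = nowMs - disconnectedAtMs + delay;
  return elapsedAfterWait <= SESSION_GRACE_MS ? delay : null;
}
```

The delays for attempts 0 through 5 are 1 s, 2 s, 4 s, 8 s, 16 s, and 30 s. A caller could treat a `null` result as the signal to show a "session expired" message instead of scheduling another attempt.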