Building a Multi-Platform Voice Tool Gateway: Normalizing ElevenLabs, Vapi, and Retell

Workforce Wave

March 24, 2026 · 9 min read
#architecture #elevenlabs #retell #vapi #voice-ai

When you integrate with one voice AI platform, you build for its format. When you integrate with three, you have a choice: handle each format in every tool endpoint, or build one normalization layer and handle all formats once.

We chose the normalization layer. It took longer to build initially, but when we added Retell as a supported platform, exactly zero lines of tool endpoint code changed. That's the payoff.

This post is about the adapter pattern we use to normalize tool calls from ElevenLabs, Vapi, and Retell into a single internal ToolCallEvent type.

The Problem: Three Formats, One Codebase

Every platform has its own wire format for tool calls. Here's the same logical call — "book an appointment for customer 12345 on Tuesday at 2pm" — in three different shapes:

ElevenLabs:

{
  "tool_name": "book_appointment",
  "parameters": {
    "customer_id": "12345",
    "date": "2026-03-24",
    "time": "14:00"
  },
  "call_id": "call_el_abc123",
  "tool_call_id": "tc_xyz789"
}

Vapi:

{
  "message": {
    "type": "tool-calls",
    "toolCallList": [{
      "id": "tc_xyz789",
      "type": "function",
      "function": {
        "name": "book_appointment",
        "arguments": "{\"customer_id\":\"12345\",\"date\":\"2026-03-24\",\"time\":\"14:00\"}"
      }
    }],
    "call": {
      "id": "call_vapi_abc123",
      "assistantId": "asst_abc"
    }
  }
}

Retell:

{
  "event": "tool_call",
  "name": "book_appointment",
  "args": {
    "customer_id": "12345",
    "date": "2026-03-24",
    "time": "14:00"
  },
  "call_id": "call_ret_abc123",
  "tool_call_id": "tc_xyz789"
}

Note that Vapi JSON-encodes the arguments as a string (following the OpenAI function calling spec). ElevenLabs and Retell pass the parameters as an already-parsed object. If you handle these in your tool handlers directly, you're writing parsing and normalization logic in every endpoint.
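
One way to contain that difference is a single helper that accepts either shape. The sketch below is illustrative only: `normalizeArguments` is a hypothetical name and is not part of the gateway code shown in this post.

```typescript
/**
 * Hypothetical helper: accept tool arguments either as a JSON-encoded
 * string (the Vapi shape) or as an already-parsed object (the ElevenLabs
 * and Retell shapes), and always return a plain object.
 */
function normalizeArguments(args: unknown): Record<string, unknown> {
  let value: unknown = args;
  if (typeof value === "string") {
    try {
      value = JSON.parse(value);
    } catch {
      return {}; // malformed JSON: fall back to empty parameters
    }
  }
  // Only plain objects are valid parameters; reject null, arrays, primitives.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return {};
  }
  return value as Record<string, unknown>;
}

const fromString = normalizeArguments('{"customer_id":"12345","time":"14:00"}');
const fromObject = normalizeArguments({ customer_id: "12345", time: "14:00" });
console.log(fromString.customer_id === fromObject.customer_id); // true
```

Both inputs end up as the same object, which is the property the rest of this post relies on.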

The Normalized Event Type

We define one internal type that every tool handler works with:

// lib/voice/tool-gateway/types.ts

/** The platform that originated this tool call */
export type VoicePlatform = "elevenlabs" | "vapi" | "retell" | "synthflow";

/**
 * Normalized tool call event — the single internal representation
 * regardless of which voice platform sent it.
 */
export interface ToolCallEvent {
  /** Unique ID for this tool call instance (for idempotency + logging) */
  toolCallId: string;
  /** The call session this tool call belongs to */
  callId: string;
  /** Name of the tool being invoked (e.g. "book_appointment") */
  toolName: string;
  /** Parsed, typed parameters — never a JSON string */
  parameters: Record<string, unknown>;
  /** Which platform sent this */
  platform: VoicePlatform;
  /** Raw original payload, preserved for debugging */
  rawPayload: unknown;
  /** Unix ms when this was received */
  receivedAt: number;
}

/**
 * The normalized response that tool handlers return.
 * Adapters translate this back into platform-specific format.
 */
export interface ToolCallResult {
  toolCallId: string;
  /** The data to return to the voice agent */
  result: unknown;
  /** If set, the agent will read this aloud or use it as context */
  message?: string;
  /** If true, the agent should end the call after this tool call */
  shouldHangUp?: boolean;
}

Every tool handler receives a ToolCallEvent and returns a ToolCallResult. Neither type has anything platform-specific in it.
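
To make the contract concrete, here is a sketch of what a handler written against these types might look like. The `bookAppointment` function, its validation, and the in-memory `bookings` array are illustrative assumptions, not our production handler. The types are copied from above so the snippet stands alone.

```typescript
// Copies of the types above so this snippet stands alone.
type VoicePlatform = "elevenlabs" | "vapi" | "retell" | "synthflow";

interface ToolCallEvent {
  toolCallId: string;
  callId: string;
  toolName: string;
  parameters: Record<string, unknown>;
  platform: VoicePlatform;
  rawPayload: unknown;
  receivedAt: number;
}

interface ToolCallResult {
  toolCallId: string;
  result: unknown;
  message?: string;
  shouldHangUp?: boolean;
}

// Hypothetical in-memory store standing in for a real booking system.
const bookings: Array<{ customerId: string; date: string; time: string }> = [];

/** Hypothetical handler: reads only normalized fields, never rawPayload. */
async function bookAppointment(call: ToolCallEvent): Promise<ToolCallResult> {
  const { customer_id, date, time } = call.parameters;
  // parameters is Record<string, unknown>, so narrow each value before use.
  if (typeof customer_id !== "string" || typeof date !== "string" || typeof time !== "string") {
    return {
      toolCallId: call.toolCallId,
      result: null,
      message: "I need a customer ID, a date, and a time to book that.",
    };
  }
  bookings.push({ customerId: customer_id, date, time });
  return {
    toolCallId: call.toolCallId,
    result: { booked: true, date, time },
    message: `You're booked for ${date} at ${time}.`,
  };
}
```

The handler never branches on `call.platform`, which is what lets a new platform be added without touching it.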

The Platform Adapters

Each platform gets an inbound adapter (raw payload → ToolCallEvent) and an outbound adapter (ToolCallResult → platform response format).

// lib/voice/tool-gateway/adapters/elevenlabs.ts

import type { ToolCallEvent, ToolCallResult } from "../types";

// Exported so the gateway can reference this type when casting raw payloads.
export interface ElevenLabsToolCallPayload {
  tool_name: string;
  parameters: Record<string, unknown>;
  call_id: string;
  tool_call_id: string;
}

/**
 * Normalize an ElevenLabs tool call webhook into our internal event type.
 * ElevenLabs passes parameters as an already-parsed object, so no
 * JSON.parse needed — just restructure.
 */
export function fromElevenLabs(payload: ElevenLabsToolCallPayload): ToolCallEvent {
  return {
    toolCallId: payload.tool_call_id,
    callId: payload.call_id,
    toolName: payload.tool_name,
    parameters: payload.parameters,
    platform: "elevenlabs",
    rawPayload: payload,
    receivedAt: Date.now(),
  };
}

/**
 * Transform our internal result back into the format ElevenLabs expects.
 * ElevenLabs expects: { tool_call_id, output }
 */
export function toElevenLabs(result: ToolCallResult): Record<string, unknown> {
  return {
    tool_call_id: result.toolCallId,
    output: result.message ?? JSON.stringify(result.result),
  };
}

// lib/voice/tool-gateway/adapters/vapi.ts

import type { ToolCallEvent, ToolCallResult } from "../types";

// Exported so the gateway can reference this type when casting raw payloads.
export interface VapiToolCallPayload {
  message: {
    type: string;
    toolCallList: Array<{
      id: string;
      type: "function";
      function: {
        name: string;
        arguments: string; // Note: JSON-encoded string, not parsed object
      };
    }>;
    call: {
      id: string;
      assistantId: string;
    };
  };
}

/**
 * Normalize a Vapi tool call webhook.
 * Key difference from ElevenLabs: Vapi follows the OpenAI function calling
 * spec, where `arguments` is a JSON-encoded string, not a parsed object.
 * We parse it here so downstream handlers never see the string form.
 */
export function fromVapi(payload: VapiToolCallPayload): ToolCallEvent {
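  // Vapi can send multiple tool calls in one webhook. This function normalizes
  // only the first entry; a multi-call batch needs one event per entry.
  const toolCall = payload.message.toolCallList[0];
  if (!toolCall) {
    // Fail with a clear error instead of a TypeError on the property reads below.
    throw new Error("Vapi: webhook contained an empty toolCallList");
  }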

  // Parse the JSON-string arguments — this is the key difference from ElevenLabs
  let parameters: Record<string, unknown>;
  try {
    parameters = JSON.parse(toolCall.function.arguments);
  } catch {
    // Malformed arguments string — log and pass empty params
    console.error("Vapi: failed to parse tool call arguments", toolCall.function.arguments);
    parameters = {};
  }

  return {
    toolCallId: toolCall.id,
    callId: payload.message.call.id,
    toolName: toolCall.function.name,
    parameters,
    platform: "vapi",
    rawPayload: payload,
    receivedAt: Date.now(),
  };
}

/**
 * Transform our internal result back into the Vapi response format.
 * Vapi expects a `results` array matching the toolCallList order.
 */
export function toVapi(result: ToolCallResult): Record<string, unknown> {
  return {
    results: [{
      toolCallId: result.toolCallId,
      result: result.message ?? JSON.stringify(result.result),
    }],
  };
}
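
For webhooks that carry more than one entry in `toolCallList`, one possible approach is to map every entry to its own event and to return the results in the same order. The sketch below assumes hypothetical `fromVapiBatch` and `toVapiBatch` helpers and uses trimmed copies of the types so it stands alone. It is not the code we run in production.

```typescript
// Trimmed copies of the types used in this post.
interface ToolCallEvent {
  toolCallId: string;
  callId: string;
  toolName: string;
  parameters: Record<string, unknown>;
  platform: "elevenlabs" | "vapi" | "retell" | "synthflow";
  rawPayload: unknown;
  receivedAt: number;
}
interface ToolCallResult {
  toolCallId: string;
  result: unknown;
  message?: string;
}
interface VapiToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}
interface VapiToolCallPayload {
  message: {
    type: string;
    toolCallList: VapiToolCall[];
    call: { id: string; assistantId: string };
  };
}

/** Hypothetical batch variant: one normalized event per toolCallList entry. */
function fromVapiBatch(payload: VapiToolCallPayload): ToolCallEvent[] {
  const receivedAt = Date.now();
  return payload.message.toolCallList.map((toolCall): ToolCallEvent => {
    let parameters: Record<string, unknown> = {};
    try {
      const parsed: unknown = JSON.parse(toolCall.function.arguments);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        parameters = parsed as Record<string, unknown>;
      }
    } catch {
      // Malformed arguments for this entry: keep the empty parameters.
    }
    return {
      toolCallId: toolCall.id,
      callId: payload.message.call.id,
      toolName: toolCall.function.name,
      parameters,
      platform: "vapi",
      rawPayload: payload,
      receivedAt,
    };
  });
}

/** Hypothetical batch response: results in the same order as the events. */
function toVapiBatch(results: ToolCallResult[]): Record<string, unknown> {
  return {
    results: results.map((r) => ({
      toolCallId: r.toolCallId,
      result: r.message ?? JSON.stringify(r.result),
    })),
  };
}
```

A malformed `arguments` string affects only its own entry, so one bad tool call does not discard the rest of the batch.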

The Gateway Dispatcher

The adapters get called from the gateway, which sits between the webhook route and the tool handlers:

// lib/voice/tool-gateway/gateway.ts

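import { fromElevenLabs, toElevenLabs } from "./adapters/elevenlabs";
import { fromVapi, toVapi } from "./adapters/vapi";
import { fromRetell, toRetell } from "./adapters/retell";
// The payload types must be exported from their adapter modules for these imports to resolve.
import type { ElevenLabsToolCallPayload } from "./adapters/elevenlabs";
import type { VapiToolCallPayload } from "./adapters/vapi";
import type { RetellToolCallPayload } from "./adapters/retell";
import { enforceComplianceRules } from "./compliance";
import { dispatchToolCall } from "./dispatch";
// ActorContext is not shown in this post; this import path is an assumption.
import type { ActorContext } from "./actor-context";
import type { VoicePlatform, ToolCallEvent, ToolCallResult } from "./types";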

/**
 * Main entry point for the tool gateway.
 * Normalizes the platform payload, enforces compliance rules,
 * dispatches to the appropriate tool handler, and serializes the response.
 */
export async function handleToolCall(
  platform: VoicePlatform,
  rawPayload: unknown,
  actorContext: ActorContext
): Promise<{ status: number; body: unknown }> {

  // Step 1: Normalize the platform payload into our internal event type
  let event: ToolCallEvent;
  switch (platform) {
    case "elevenlabs":
      event = fromElevenLabs(rawPayload as ElevenLabsToolCallPayload);
      break;
    case "vapi":
      event = fromVapi(rawPayload as VapiToolCallPayload);
      break;
    case "retell":
      event = fromRetell(rawPayload as RetellToolCallPayload);
      break;
    default:
      return { status: 400, body: { error: `Unknown platform: ${platform}` } };
  }

  // Step 2: Enforce compliance rules before dispatch.
  // This runs regardless of which platform the call came from —
  // compliance is evaluated against the normalized event, not the raw payload.
  const complianceResult = await enforceComplianceRules(event, actorContext);
  if (!complianceResult.allowed) {
    // Return a platform-appropriate blocked response
    const blockedResult = {
      toolCallId: event.toolCallId,
      result: null,
      message: complianceResult.blockedMessage,
    };
    return { status: 200, body: serializeResult(platform, blockedResult) };
  }

  // Step 3: Dispatch to the tool handler
  const result = await dispatchToolCall(event, actorContext);

  // Step 4: Serialize back to the platform's expected response format
  return {
    status: 200,
    body: serializeResult(platform, result),
  };
}

/** Serialize a ToolCallResult back into the platform-specific format */
function serializeResult(platform: VoicePlatform, result: ToolCallResult): unknown {
  switch (platform) {
    case "elevenlabs": return toElevenLabs(result);
    case "vapi":       return toVapi(result);
    case "retell":     return toRetell(result);
    default:           return result;
  }
}
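
The `dispatchToolCall` function imported above is not shown in this post. One plausible shape is a registry keyed by tool name. The sketch below is an assumption about how such a dispatcher could look: `registerTool`, the `handlers` map, and the one-field `ActorContext` are hypothetical, and the event type is reduced to the fields the dispatcher reads.

```typescript
// Reduced copies of the gateway types: only the fields this sketch reads.
interface ToolCallEvent {
  toolCallId: string;
  toolName: string;
  parameters: Record<string, unknown>;
}
interface ToolCallResult {
  toolCallId: string;
  result: unknown;
  message?: string;
}
/** Hypothetical placeholder; the real ActorContext is not shown in this post. */
interface ActorContext {
  tenantId: string;
}

type ToolHandler = (call: ToolCallEvent, actor: ActorContext) => Promise<ToolCallResult>;

// Registry of handlers keyed by tool name, e.g. "book_appointment".
const handlers = new Map<string, ToolHandler>();

function registerTool(toolName: string, handler: ToolHandler): void {
  handlers.set(toolName, handler);
}

async function dispatchToolCall(call: ToolCallEvent, actor: ActorContext): Promise<ToolCallResult> {
  const handler = handlers.get(call.toolName);
  if (!handler) {
    // Unknown tools become a normal result so the agent can recover in-call.
    return { toolCallId: call.toolCallId, result: null, message: `Unknown tool: ${call.toolName}` };
  }
  try {
    return await handler(call, actor);
  } catch (err) {
    // A failing handler should not surface as a broken webhook response.
    console.error(`Tool ${call.toolName} failed`, err);
    return {
      toolCallId: call.toolCallId,
      result: null,
      message: "Something went wrong while running that tool.",
    };
  }
}
```

Returning a `ToolCallResult` for the failure cases, rather than throwing, keeps the gateway's serialization step the single place where platform formats are produced.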

ComplianceRules in the Gateway

One of the most important benefits of centralizing the gateway: compliance rule enforcement is platform-agnostic. We check rules like "don't book appointments outside business hours" or "don't process payment data from unverified callers" once, in the gateway, against the normalized event.

Before the gateway, compliance checks were scattered across individual tool handlers. Some handlers had them, some didn't. The gateway made the compliance layer mandatory — if you add a new tool, it automatically inherits compliance enforcement because it flows through the gateway.
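
As an illustration, a rule layer of this kind can be a list of predicates evaluated in order against the normalized event. Nearly everything in the sketch below is an assumption: the `ComplianceRule` shape, the `callerVerified` field on `ActorContext`, the `process_payment` tool name, and the 9-to-17 business hours are hypothetical stand-ins. Only the `allowed` and `blockedMessage` result fields come from the gateway code above.

```typescript
// Reduced event type: only the fields the rules read.
interface ToolCallEvent {
  toolCallId: string;
  toolName: string;
  parameters: Record<string, unknown>;
}
/** Hypothetical placeholder; the real ActorContext is not shown in this post. */
interface ActorContext {
  callerVerified: boolean;
}
interface ComplianceResult {
  allowed: boolean;
  blockedMessage?: string;
}

/** A rule returns a message when it blocks the call, or null when it passes. */
type ComplianceRule = (call: ToolCallEvent, actor: ActorContext) => string | null;

const noPaymentsFromUnverifiedCallers: ComplianceRule = (call, actor) =>
  call.toolName === "process_payment" && !actor.callerVerified
    ? "I need to verify your identity before taking payment details."
    : null;

const businessHoursOnly: ComplianceRule = (call) => {
  if (call.toolName !== "book_appointment") return null;
  const time = call.parameters.time;
  // A missing or non-string time is left for the handler to validate.
  if (typeof time !== "string") return null;
  const hour = Number(time.split(":")[0]);
  // An unparseable hour is NaN, fails both comparisons, and is blocked.
  return hour >= 9 && hour < 17 ? null : "We can only book appointments between 9am and 5pm.";
};

const rules: ComplianceRule[] = [noPaymentsFromUnverifiedCallers, businessHoursOnly];

/** Run every rule in order; the first one that blocks decides the outcome. */
async function enforceComplianceRules(
  call: ToolCallEvent,
  actor: ActorContext
): Promise<ComplianceResult> {
  for (const rule of rules) {
    const blockedMessage = rule(call, actor);
    if (blockedMessage !== null) {
      return { allowed: false, blockedMessage };
    }
  }
  return { allowed: true };
}
```

Because the rules read only `toolName`, `parameters`, and the actor context, the same list applies to a call from any platform.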

Adding a New Platform

When we added Retell, the changeset was:

  • lib/voice/tool-gateway/adapters/retell.ts — new file, ~50 lines
  • lib/voice/tool-gateway/gateway.ts — add retell to the switch statements
  • app/api/v2/webhooks/retell/route.ts — new route handler that calls handleToolCall("retell", ...)

The book_appointment, check_availability, and update_customer_record handlers, along with every other tool handler, were unchanged. They receive ToolCallEvent objects and don't know which platform sent them.
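
The Retell adapter itself is not reproduced in this post. The inbound half in the sketch below follows directly from the Retell payload shown earlier. The outbound half is a guess: the response format Retell expects is not covered here, so the `tool_call_id` and `output` field names in `toRetell` simply mirror the ElevenLabs adapter as placeholders and should be checked against Retell's documentation.

```typescript
// Trimmed copies of the types used in this post.
interface ToolCallEvent {
  toolCallId: string;
  callId: string;
  toolName: string;
  parameters: Record<string, unknown>;
  platform: "elevenlabs" | "vapi" | "retell" | "synthflow";
  rawPayload: unknown;
  receivedAt: number;
}
interface ToolCallResult {
  toolCallId: string;
  result: unknown;
  message?: string;
}

/** Mirrors the Retell payload shown earlier in this post. */
interface RetellToolCallPayload {
  event: string;
  name: string;
  args: Record<string, unknown>;
  call_id: string;
  tool_call_id: string;
}

/** Retell passes args as an already-parsed object, so this only renames fields. */
function fromRetell(payload: RetellToolCallPayload): ToolCallEvent {
  return {
    toolCallId: payload.tool_call_id,
    callId: payload.call_id,
    toolName: payload.name,
    parameters: payload.args,
    platform: "retell",
    rawPayload: payload,
    receivedAt: Date.now(),
  };
}

/**
 * ASSUMPTION: placeholder response shape copied from the ElevenLabs adapter.
 * The field names Retell actually expects are not confirmed by this post.
 */
function toRetell(result: ToolCallResult): Record<string, unknown> {
  return {
    tool_call_id: result.toolCallId,
    output: result.message ?? JSON.stringify(result.result),
  };
}
```

Apart from the field names, the inbound function has the same structure as `fromElevenLabs`, which is why the new file stayed small.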

The Tradeoff We Accepted

The adapter layer adds indirection and a small amount of code that maps between formats. If you're only integrating with one platform, that indirection has no payoff; it's just more code.

We accepted this complexity because tool endpoints change far more often than platform integrations. We ship new tools or modify existing tools weekly. We change platform integrations maybe twice a year. Pushing the complexity to the stable layer (adapters) and keeping the volatile layer (tool handlers) clean is the right tradeoff.

The other thing we accepted: we own the VoicePlatform type. When a platform changes its wire format (Vapi has done this once), we update the adapter and the rest of the system is unaffected. That's the contract the gateway enforces.

If you're building a voice AI platform that integrates with multiple providers — or that you want to be able to swap — the adapter pattern is worth the upfront investment. The break-even point is roughly two platforms and three tool handlers. After that, the gateway pays for itself every time you change a tool.
