Structured Errors for AI Agents: The 6 Fields Every Error Response Needs

Standard HTTP error responses were designed for human developers reading documentation and writing integration code. An experienced developer sees a 422 with {"error": "invalid_input", "field": "voice_id"}, understands what failed, looks up the valid voice IDs, and fixes the code.

An AI agent sees the same response and has to infer: Is this something I can retry? Should I try different parameters? Should I call a different endpoint first? Should I escalate to a human? HTTP semantics give it the status code. The minimal error body gives it almost nothing else.

We went through two iterations of WFW's error format before landing on something that actually works for agent consumers. This post is about the six fields we ended up with, and specifically why each one exists.

The 6 Fields

// lib/errors/types.ts

/**
 * The WFW v2 API error envelope.
 * Every non-2xx response body has this shape.
 */
export interface WFWErrorEnvelope {
  error: {
    /**
     * Machine-readable error code — stable across API versions.
     * Agents should branch on this, not on the message string.
     * Examples: "voice_not_found", "rate_limit_exceeded", "insufficient_scopes"
     */
    code: string;

    /**
     * Human-readable description of what went wrong.
     * Suitable for logging and surfacing to developers.
     * NOT suitable for user-facing display without sanitization.
     */
    message: string;

    /**
     * Whether the agent should retry this request.
     * true  → transient failure, retry with backoff
     * false → persistent failure, retrying won't help without changes
     *
     * This is the most important field for agent behavior.
     * A 500 with retryable: true is a DB blip — retry.
     * A 500 with retryable: false is a misconfiguration — page someone.
     */
    retryable: boolean;

    /**
     * Seconds to wait before retrying. Present on 429 responses.
     * When present, the agent MUST wait at least this long.
     * Derived from the rate limit reset timestamp.
     */
    retry_after_seconds?: number;

    /**
     * Specific, actionable recovery steps.
     * These are instructions for the agent, not descriptions of the error.
     * "Add events:read scope" is an action. "Insufficient permissions" is not.
     */
    recovery_actions: string[];

    /**
     * URL to the relevant documentation section.
     * Included on complex errors where the recovery actions
     * aren't self-contained.
     */
    documentation_url?: string;
  };
}
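On the consuming side, an agent runtime may want to confirm that a non-2xx body really has this shape before trusting its fields. The following is a minimal sketch of such a check, not part of the WFW codebase: `WFWErrorBody`, `isRecord`, and `parseWFWError` are hypothetical names, and the local interface simply mirrors the envelope above.

```typescript
// Hypothetical client-side check; the interface mirrors the `error` object above.
interface WFWErrorBody {
  code: string;
  message: string;
  retryable: boolean;
  retry_after_seconds?: number;
  recovery_actions: string[];
  documentation_url?: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Returns the error body if `payload` matches the envelope shape, otherwise null. */
function parseWFWError(payload: unknown): WFWErrorBody | null {
  if (!isRecord(payload)) return null;
  const inner = payload["error"];
  if (!isRecord(inner)) return null;

  const {
    code,
    message,
    retryable,
    recovery_actions,
    retry_after_seconds,
    documentation_url,
  } = inner;

  // Required fields.
  if (typeof code !== "string" || typeof message !== "string") return null;
  if (typeof retryable !== "boolean") return null;
  if (!Array.isArray(recovery_actions)) return null;
  if (!recovery_actions.every((a) => typeof a === "string")) return null;

  // Optional fields: absent is fine, wrong type is not.
  if (retry_after_seconds !== undefined && typeof retry_after_seconds !== "number") return null;
  if (documentation_url !== undefined && typeof documentation_url !== "string") return null;

  return { code, message, retryable, recovery_actions, retry_after_seconds, documentation_url };
}
```

A body that fails this check (for example the older `{"error": "invalid_input"}` style) can then be handled as an unstructured error instead of being misread.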

Field by Field: Why Each One Exists

`code` — Branch on this, not `message`

Error codes are stable identifiers that agents can use in conditional logic. Error messages change — we improve wording, add context, localize. If an agent branches on the message string "Voice ID not found" and we change it to "The specified voice_id does not exist in this account", the branch breaks silently.

Agents should always branch on code. The message is for humans and logs.

// What an agent should do
if (error.code === "voice_not_found") {
  // Call GET /v2/voices to list available voices, retry with valid ID
} else if (error.code === "insufficient_scopes") {
  // Check the required scopes in error.recovery_actions, escalate to admin
}
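A slightly fuller sketch of the same idea: dispatch on known codes, and fall back to the `retryable` flag for codes the agent has never seen. The `NextStep` type and `decideNextStep` function are hypothetical names introduced for illustration, and the set of handled codes is only what appears in this post.

```typescript
// Hypothetical agent-side dispatcher; only the fields it needs are declared.
interface AgentError {
  code: string;
  retryable: boolean;
  recovery_actions: string[];
}

type NextStep =
  | { kind: "call_endpoint"; endpoint: string }
  | { kind: "retry_with_backoff" }
  | { kind: "escalate"; instructions: string[] };

function decideNextStep(error: AgentError): NextStep {
  switch (error.code) {
    case "voice_not_found":
      // List valid voices first, then retry with one of them.
      return { kind: "call_endpoint", endpoint: "GET /v2/voices" };
    case "insufficient_scopes":
      // The agent cannot fix scopes itself; hand the instructions to an admin.
      return { kind: "escalate", instructions: error.recovery_actions };
    default:
      // Unknown code: the message is never inspected, only the retryable flag.
      return error.retryable
        ? { kind: "retry_with_backoff" }
        : { kind: "escalate", instructions: error.recovery_actions };
  }
}
```

Because the default branch relies on `retryable` rather than on message text, new codes added to the API later still get a sensible response from older agents.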

`retryable` — The Single Most Important Field

Without retryable, an agent receiving a 500 has no way to distinguish:

A transient database connection timeout (retry in 10 seconds, will succeed)
A permanent misconfiguration where the agent's service account has no client association (retrying forever won't help)

These require completely different responses. The first: retry silently with exponential backoff. The second: stop, fire an alert, wait for a human to fix the configuration.
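The two responses can be sketched as a single delay calculation on the agent side. This is an illustration under assumed values: the base delay, cap, and attempt limit below are placeholders, not WFW recommendations, and `nextDelayMs` is a hypothetical helper.

```typescript
// Hypothetical retry policy; the three constants are assumed values.
interface RetryHint {
  retryable: boolean;
  retry_after_seconds?: number;
}

const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 60000;
const MAX_ATTEMPTS = 5;

/**
 * Returns how long to wait before the next attempt, or null when the agent
 * should stop retrying and alert a human instead.
 * `failedAttempts` counts the attempts made so far (1 after the first failure).
 */
function nextDelayMs(error: RetryHint, failedAttempts: number): number | null {
  if (!error.retryable) return null; // persistent failure: retrying won't help
  if (failedAttempts >= MAX_ATTEMPTS) return null; // give up eventually

  // Exponential backoff: 1s, 2s, 4s, ... capped at MAX_DELAY_MS.
  const backoff = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, failedAttempts - 1));

  // Never wait less than the server asked for.
  const serverMinimum = (error.retry_after_seconds ?? 0) * 1000;
  return Math.max(backoff, serverMinimum);
}
```

A production loop would usually add random jitter to the backoff so that many agents do not retry in lockstep; it is left out here to keep the arithmetic easy to follow.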

We set retryable based on the error category:

Status	Scenario	`retryable`
429	Rate limit	`true`
500	DB timeout, external service blip	`true`
500	Config error, missing env var	`false`
422	Validation error	`false`
404	Resource not found	`false`
401	Invalid token	`false`
403	Insufficient scopes	`false` (fix scopes first)

The asymmetry on 500s is important. Not all 500s are retryable. We know at throw time whether an error is transient or fundamental, so we set retryable at the source rather than having agents infer it from the status code.
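One way to set `retryable` at the source is to encode it in the error type that gets thrown. The sketch below is an assumption about how that could look, not WFW's actual class hierarchy: `WFWApiError`, `TransientError`, `ConfigurationError`, `loadRequiredEnv`, and the codes `db_timeout` and `missing_env_var` are all hypothetical.

```typescript
// Hypothetical error classes: the retry classification is fixed when the error is created.
class WFWApiError extends Error {
  readonly code: string;
  readonly retryable: boolean;
  readonly httpStatus: number;

  constructor(code: string, message: string, retryable: boolean, httpStatus: number) {
    super(message);
    this.code = code;
    this.retryable = retryable;
    this.httpStatus = httpStatus;
  }
}

/** A 500 that is expected to clear on its own (DB timeout, upstream blip). */
class TransientError extends WFWApiError {
  constructor(code: string, message: string) {
    super(code, message, true, 500);
  }
}

/** A 500 that needs a human (bad config, missing env var). */
class ConfigurationError extends WFWApiError {
  constructor(code: string, message: string) {
    super(code, message, false, 500);
  }
}

// The code that detects the problem knows which kind it is.
function loadRequiredEnv(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new ConfigurationError("missing_env_var", `Environment variable ${name} is not set.`);
  }
  return value;
}
```

Both classes map to status 500, yet a handler that catches them can copy `retryable` straight into the envelope without guessing.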

`recovery_actions` — Actions, Not Descriptions

This field exists because we discovered that AI agents reason much better from explicit instructions than from error descriptions. Compare:

Before:

{
  "error": {
    "code": "insufficient_scopes",
    "message": "The service account does not have the required permissions.",
    "retryable": false
  }
}

After:

{
  "error": {
    "code": "insufficient_scopes",
    "message": "The service account does not have the required permissions.",
    "retryable": false,
    "recovery_actions": [
      "Add the 'agents:write' scope to service account sa_abc123 in the WFW admin dashboard under Settings → Service Accounts",
      "Or create a new service account with agents:write scope and update the WORKFORCE_WAVE_API_KEY environment variable",
      "Confirm the service account holds every scope this endpoint requires: agents:write, voices:read"
    ]
  }
}

The "before" version tells the agent what failed in abstract terms. The "after" version tells it what to do next, in what order, and where. An agent handling the "after" response can either execute those steps (if it has admin access) or relay them precisely to a human operator.

Every recovery_actions entry should be a sentence that starts with an imperative verb: "Add...", "Update...", "Check...", "Retry...". Descriptions masquerading as actions ("The voice ID may not exist") are not recovery actions.
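This rule is easy to check mechanically, for example in a unit test over every error the API can emit. The sketch below assumes a small allowlist of verbs; the list and the `findNonImperativeActions` helper are hypothetical and would need to grow with the API's vocabulary.

```typescript
// Hypothetical allowlist; extend it as new recovery actions are written.
const IMPERATIVE_VERBS = new Set<string>([
  "add", "call", "check", "confirm", "contact", "create",
  "pass", "retry", "update", "wait",
]);

/** Lowercased first alphabetic word of a sentence, or "" if there is none. */
function firstWord(sentence: string): string {
  const match = sentence.trim().match(/^[A-Za-z]+/);
  return match ? match[0].toLowerCase() : "";
}

/** Returns the entries that do not start with an allowed imperative verb. */
function findNonImperativeActions(actions: string[]): string[] {
  return actions.filter((action) => !IMPERATIVE_VERBS.has(firstWord(action)));
}
```

An allowlist is crude (it cannot tell "Check the logs" from a noun phrase that happens to start with "Check"), but it catches the common failure of a description slipping into the list.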

`retry_after_seconds` — Rate Limits Done Right

On 429 responses, we include the retry_after_seconds field in the error body along with two headers: Retry-After (delay in seconds) and X-RateLimit-Reset (Unix timestamp). Agents get more than one way to know when to retry.

Why both body and headers? Retry-After is the standard HTTP mechanism, and many HTTP clients can honor it automatically. The error body field is for agents that reason over the parsed JSON response; they can act on it without inspecting headers. Either works; having both means the agent's HTTP library and the agent's reasoning layer each have the information.

// 429 response example
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit of 100 requests/minute exceeded for client client_abc123.",
    "retryable": true,
    "retry_after_seconds": 47,
    "recovery_actions": [
      "Wait 47 seconds before retrying this request",
      "Consider batching multiple agent creates using POST /v2/agents/batch to reduce request count",
      "Contact support to upgrade your rate limit tier (currently 100 req/min)"
    ]
  }
}
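A client that sees both the body field and the headers has to pick one number. A conservative sketch is to take the largest wait any source asks for. `HeaderReader` and `resolveWaitSeconds` are hypothetical names; the interface is just the subset of the standard `Headers` API this code needs, and only the delay-in-seconds form of Retry-After is handled.

```typescript
// Minimal subset of the Fetch `Headers` interface, so a real Headers object fits.
interface HeaderReader {
  get(name: string): string | null;
}

/**
 * Returns the number of seconds to wait, or null if no source gives a hint.
 * When several sources are present, the longest wait wins.
 */
function resolveWaitSeconds(
  bodySeconds: number | undefined,
  headers: HeaderReader,
  nowMs: number,
): number | null {
  const candidates: number[] = [];

  // 1. The retry_after_seconds field from the error body.
  if (bodySeconds !== undefined && bodySeconds >= 0) candidates.push(bodySeconds);

  // 2. Retry-After as delay-seconds (the HTTP-date form is ignored in this sketch).
  const retryAfter = headers.get("Retry-After");
  if (retryAfter !== null && /^\d+$/.test(retryAfter.trim())) {
    candidates.push(Number(retryAfter));
  }

  // 3. X-RateLimit-Reset as a Unix timestamp in seconds.
  const reset = headers.get("X-RateLimit-Reset");
  if (reset !== null && /^\d+$/.test(reset.trim())) {
    candidates.push(Math.max(0, Math.ceil(Number(reset) - nowMs / 1000)));
  }

  return candidates.length > 0 ? Math.max(...candidates) : null;
}
```

Passing `nowMs` in as a parameter, rather than calling `Date.now()` inside, keeps the function deterministic and easy to test.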

The retry_after_seconds is exact — we calculate it from the rate limit window reset time, not a static number. An agent can trust it.
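As an illustration of that calculation, here is a sketch that derives the value from a window reset time and keeps the number in the body and the number in the recovery text in sync. It assumes the limiter can report the reset time in milliseconds; `secondsUntilReset` and `rateLimitErrorFields` are hypothetical helpers, not the production code.

```typescript
/**
 * Whole seconds until the rate limit window resets.
 * Rounds up so an agent that waits exactly this long never retries early.
 */
function secondsUntilReset(windowResetAtMs: number, nowMs: number): number {
  return Math.max(0, Math.ceil((windowResetAtMs - nowMs) / 1000));
}

/** Builds the rate-limit fields of the envelope from a single computed value. */
function rateLimitErrorFields(windowResetAtMs: number, nowMs: number) {
  const seconds = secondsUntilReset(windowResetAtMs, nowMs);
  return {
    code: "rate_limit_exceeded",
    retryable: true,
    retry_after_seconds: seconds,
    recovery_actions: [`Wait ${seconds} seconds before retrying this request`],
  };
}
```

Deriving the recovery text from the same variable avoids the failure mode where the body says 47 and the instruction says something else.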

The `v2Error()` Helper

We don't construct error responses by hand in route handlers. Every non-2xx response goes through a v2Error() helper:

// lib/errors/v2-error.ts

import type { WFWErrorEnvelope } from "./types";

interface V2ErrorOptions {
  code: string;
  message: string;
  retryable: boolean;
  retryAfterSeconds?: number;
  recoveryActions?: string[];
  documentationUrl?: string;
  httpStatus?: number;
}

/**
 * Construct a WFW v2 API error response.
 * This is the only way route handlers should create error responses —
 * ensures every error has the required fields and consistent shape.
 */
export function v2Error(opts: V2ErrorOptions): Response {
  const body: WFWErrorEnvelope = {
    error: {
      code: opts.code,
      message: opts.message,
      retryable: opts.retryable,
      retry_after_seconds: opts.retryAfterSeconds,
      recovery_actions: opts.recoveryActions ?? [],
      documentation_url: opts.documentationUrl,
    },
  };

  const response = Response.json(body, {
    status: opts.httpStatus ?? 400,
  });

  // Always include these headers on errors
  response.headers.set("Content-Type", "application/json");

  // Mirror retryAfterSeconds into the rate limit headers (in practice, on 429s)
  if (opts.retryAfterSeconds !== undefined) {
    response.headers.set("Retry-After", String(opts.retryAfterSeconds));
    const resetAt = Math.floor(Date.now() / 1000) + opts.retryAfterSeconds;
    response.headers.set("X-RateLimit-Reset", String(resetAt));
  }

  return response;
}

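// Usage in a route handler (imports for NextRequest, db, voices, and eq omitted):
export async function POST(req: NextRequest) {
  // Parse the JSON request body before reading voice_id from it
  const body = await req.json();

  const voice = await db.query.voices.findFirst({
    where: eq(voices.id, body.voice_id),
  });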

  if (!voice) {
    return v2Error({
      code: "voice_not_found",
      message: `Voice ID '${body.voice_id}' does not exist in this account.`,
      retryable: false,
      httpStatus: 422,
      recoveryActions: [
        "Call GET /v2/voices to list available voice IDs for your account",
        "Pass a valid voice_id from that list and retry",
      ],
      documentationUrl: "https://docs.workforcewave.com/api/voices",
    });
  }
  // ...
}

Before and After: What Agents Actually Do

Here's a concrete illustration of how structured errors change agent behavior.

Before structured errors — agent receives a 422:

Agent reasoning: "The request failed with status 422. The error says 
'invalid_input'. I don't know which field is invalid or what valid 
values are. I'll try modifying the request and retrying."
→ Agent generates a plausible-but-wrong voice_id like "voice_default"
→ Another 422
→ Agent escalates after 3 attempts, losing context

After structured errors — agent receives a 422 with full envelope:

Agent reasoning: "The request failed with code 'voice_not_found', 
retryable: false. Recovery actions say to call GET /v2/voices 
to list available voice IDs. I'll do that first."
→ Agent calls GET /v2/voices
→ Gets back a list of valid voice IDs with descriptions
→ Selects the appropriate voice, retries createAgent
→ Success

The structured error doesn't just report a failure. It gives the agent a recovery plan. In the "before" case, the agent is guessing. In the "after" case, it's executing a known-good procedure.
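How the envelope reaches the agent's reasoning layer depends on the agent framework. As one possibility, a tool wrapper could render it into a short, ordered text block before handing it to the model. The sketch below assumes that setup; `formatErrorForAgent` and its output wording are hypothetical.

```typescript
// Only the fields the renderer uses are declared here.
interface RenderableError {
  code: string;
  retryable: boolean;
  retry_after_seconds?: number;
  recovery_actions: string[];
  documentation_url?: string;
}

/** Renders an error as plain text with the recovery actions as a numbered list. */
function formatErrorForAgent(error: RenderableError): string {
  const lines: string[] = [
    `Request failed with code "${error.code}" (retryable: ${error.retryable}).`,
  ];

  if (error.retry_after_seconds !== undefined) {
    lines.push(`Wait at least ${error.retry_after_seconds} seconds before retrying.`);
  }

  if (error.recovery_actions.length > 0) {
    lines.push("Recovery actions, in order:");
    error.recovery_actions.forEach((action, index) => {
      lines.push(`${index + 1}. ${action}`);
    });
  }

  if (error.documentation_url !== undefined) {
    lines.push(`Documentation: ${error.documentation_url}`);
  }

  return lines.join("\n");
}
```

The numbered list preserves the order of `recovery_actions`, which matters when the first step (list the voices) has to happen before the second (retry with a valid ID).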

This is what we mean when we say error responses for AI-native APIs need to be better than the error responses you'd write for human developers. Humans will read the docs and figure it out. Agents will execute exactly what you tell them in the error response. Make it actionable.
