AI Gateway

The AI Gateway is KnoxCall’s AI-native proxy layer. It sits between your applications and upstream AI providers (Anthropic and OpenAI are the configurable providers; Bedrock is supported only as a shape-translation target), providing:

Phantom tokens — short-lived DPoP-bound capability tokens; your application never holds a raw provider API key.
Streaming PII redaction — Aho-Corasick + Presidio detector stack with a configurable hold-back buffer.
Prompt injection firewall — built-in heuristic patterns + tenant-supplied regex/keyword rules; optional canary token injection detects system-prompt extraction.
Budget enforcement — per-agent daily/monthly USD caps with configurable overage actions.
Shape translation — automatic Anthropic ↔ OpenAI request/response translation when routes target a different provider format.
Compliance packs — HIPAA Safe Harbor, GDPR, PCI-DSS, and SOC 2 packs install recognizers and audit-log alert rules in one operation.

Authentication

All AI Gateway proxy endpoints authenticate via phantom tokens — capability tokens minted for a specific agent. Send the token via any of the three accepted header schemes:

Authorization: Bearer kc_live_a_...

x-api-key: kc_live_a_...

x-knox-ai-key: kc_live_a_...

For DPoP-bound tokens, you must additionally include a valid DPoP header on every request. Admin endpoints (gateway/agent CRUD, token minting) are part of the dashboard/control plane on the admin host. They require a session-authenticated request (session JWT) with the X-Tenant-ID header and an owner/admin role — not a tk_live_... API key.

Execute AI request

POST https://{slug}.knoxcall.com/v1/ai/{agent-slug}/{...path}

The execute (data-plane) endpoint is served on your tenant’s proxy subdomain — https://{slug}.knoxcall.com (sandbox: https://sandbox-{slug}.knoxcall.com), not api.knoxcall.com. Use the agent_url returned when the agent was created as the base URL. Forwards the request to the agent’s configured primary route. The request body must be a valid JSON body for the upstream provider (Anthropic Messages API, OpenAI Chat Completions API, etc.). If the agent has streaming_enabled and the request includes Accept: text/event-stream or "stream": true in the body, the response is streamed as SSE.

Request headers

Header	Description
`Authorization: Bearer kc_live_a_...`	Phantom token for this agent. May alternatively be sent via the `x-api-key` or `x-knox-ai-key` header.
`DPoP`	DPoP proof JWT (required when the token has `dpop_required: true`).
`X-KC-User`	Optional. SCIM user identifier for per-user cost attribution.
`X-KC-Conversation-Id`	Optional. Conversation identifier for PII token-map scoping across turns.

Response headers

Header	Description
`X-Request-Id`	UUID identifying this proxy call (appears in audit logs).
`X-Knox-AI-Budget-Pct`	Current budget utilization percentage (when budget is configured).
`X-Knox-AI-Budget-Warning`	Present when utilization exceeds the warn threshold.
`X-Knox-AI-Tools-Stripped`	Comma-separated names of tools removed by the agent’s tool allowlist.
`X-Knox-AI-Output-Warning`	Present when the response fails output schema validation but `output_validation_action` is `warn`.
`X-Knox-AI-Retry`	Set to `output_schema` when the response was automatically retried due to schema violation.

Firewall responses

When the prompt injection firewall blocks a request:

{
  "error": "firewall_block",
  "error_description": "Request blocked by firewall policy: ignore_previous_instructions, dan_jailbreak"
}

HTTP 400. The firewall always runs regardless of whether a policy is attached (built-in heuristics cannot be disabled).

Shape translation

When the caller’s provider format differs from the route’s target provider format, the AI Gateway translates automatically:

From	To	What changes
Anthropic	OpenAI	`system` field → system message; `input_schema` → `parameters`; tool choice shape
OpenAI	Anthropic	Leading system message → `system` field; `parameters` → `input_schema`
Any	Bedrock	Strips `model` from body; adds `anthropic_version: "bedrock-2023-05-31"`

Format is detected from the route’s target_base_url (e.g. api.anthropic.com → Anthropic format). Response shapes are translated back to match the caller’s expected format on the buffered path.

Canary token injection

When a firewall policy has canary_enabled: true, the gateway injects a kc_canary_<16hex> token into every system prompt before forwarding upstream. If the model echoes the token verbatim in its response (a sign of system-prompt extraction via prompt injection), the gateway:

Emits an ai_gateway.canary_leak audit log entry at critical severity.
Sets firewall_outcome = "warn" on the usage record.
Fires any compliance-pack alert rules listening on audit_action = 'ai_gateway.canary_leak'.

Compliance packs

Install compliance packs via the Packs API or from the Admin UI. Each pack installs:

PII recognizers — entity-type patterns for the detector stack.
Alert rules — ai_gateway_event alerts that fire when specific audit actions appear.

Currently available packs: hipaa-safe-harbor, gdpr, pci-dss, soc2.

OpenTelemetry export

Set OTEL_EXPORTER_OTLP_ENDPOINT to enable per-request span export with GenAI semantic conventions:

Attribute	Value
`gen_ai.system`	Provider (e.g. `anthropic`)
`gen_ai.request.model`	Requested model
`gen_ai.usage.input_tokens`	Input token count
`gen_ai.usage.output_tokens`	Output token count
`knoxcall.ai_gateway.firewall_outcome`	`pass` / `warn` / `block` / `tag`
`knoxcall.ai_gateway.pii_tokens_swapped`	Count of PII redactions
`knoxcall.ai_gateway.cost_usd`	Estimated cost in USD

Overview

Routes

Secrets

Clients

Environments

API Keys

Webhooks

Ephemeral Proxy

AI Gateway

Audit Logs

Account

Tenant KMS

Secret Store Migrations

Execute AI Request

AI Gateway

Authentication

Execute AI request

Request headers

Response headers

Firewall responses

Shape translation

Canary token injection

Compliance packs

OpenTelemetry export

​AI Gateway

​Authentication

​Execute AI request

​Request headers

​Response headers

​Firewall responses

​Shape translation

​Canary token injection

​Compliance packs

​OpenTelemetry export

AI Gateway

Authentication

Execute AI request

Request headers

Response headers

Firewall responses

Shape translation

Canary token injection

Compliance packs

OpenTelemetry export