Documentation Index
Fetch the complete documentation index at: https://docs.knoxcall.com/llms.txt
Use this file to discover all available pages before exploring further.
AI Gateway
The AI Gateway is KnoxCall’s AI-native proxy layer. It sits between your applications and upstream AI providers (Anthropic, OpenAI, Bedrock, Ollama), providing:
- Phantom tokens — short-lived DPoP-bound capability tokens; your application never holds a raw provider API key.
- Streaming PII redaction — Aho-Corasick + Presidio detector stack with a configurable hold-back buffer.
- Prompt injection firewall — built-in heuristic patterns + tenant-supplied regex/keyword rules; optional canary token injection detects system-prompt extraction.
- Budget enforcement — per-agent daily/monthly USD caps with configurable overage actions.
- Shape translation — automatic Anthropic ↔ OpenAI request/response translation when routes target a different provider format.
- Compliance packs — HIPAA Safe Harbor, GDPR, PCI-DSS, and SOC 2 packs install recognizers and audit-log alert rules in one operation.
Authentication
All AI Gateway proxy endpoints authenticate via phantom tokens — capability tokens minted for a specific agent. Include the token as a bearer token:
Authorization: Bearer kc_live_a_...
For DPoP-bound tokens, you must additionally include a valid DPoP header on every request.
Admin endpoints (gateway/agent CRUD, token minting) use your standard KnoxCall API key in the Authorization: Bearer tk_live_... header.
Execute AI request
POST /v1/ai/{agent-slug}/{...path}
Forwards the request to the agent’s configured primary route. The request body must be a valid JSON body for the upstream provider (Anthropic Messages API, OpenAI Chat Completions API, etc.).
If the agent has streaming_enabled and the request includes Accept: text/event-stream or "stream": true in the body, the response is streamed as SSE.
| Header | Description |
|---|
Authorization: Bearer kc_live_a_... | Phantom token for this agent. |
DPoP | DPoP proof JWT (required when the token has dpop_required: true). |
X-KC-User | Optional. SCIM user identifier for per-user cost attribution. |
X-KC-Conversation-Id | Optional. Conversation identifier for PII token-map scoping across turns. |
| Header | Description |
|---|
X-Request-Id | UUID identifying this proxy call (appears in audit logs). |
X-Knox-AI-Budget-Pct | Current budget utilization percentage (when budget is configured). |
X-Knox-AI-Budget-Warning | Present when utilization exceeds the warn threshold. |
X-Knox-AI-Tools-Stripped | Comma-separated names of tools removed by the agent’s tool allowlist. |
X-Knox-AI-Output-Warning | Present when the response fails output schema validation but output_validation_action is warn. |
X-Knox-AI-Retry | Set to output_schema when the response was automatically retried due to schema violation. |
Firewall responses
When the prompt injection firewall blocks a request:
{
"error": "firewall_block",
"error_description": "Request blocked by firewall policy: ignore_previous_instructions, dan_jailbreak"
}
HTTP 400. The firewall always runs regardless of whether a policy is attached (built-in heuristics cannot be disabled).
Shape translation
When the caller’s provider format differs from the route’s target provider format, the AI Gateway translates automatically:
| From | To | What changes |
|---|
| Anthropic | OpenAI | system field → system message; input_schema → parameters; tool choice shape |
| OpenAI | Anthropic | Leading system message → system field; parameters → input_schema |
| Any | Bedrock | Strips model from body; adds anthropic_version: "bedrock-2023-05-31" |
Format is detected from the route’s target_base_url (e.g. api.anthropic.com → Anthropic format). Response shapes are translated back to match the caller’s expected format on the buffered path.
Canary token injection
When a firewall policy has canary_enabled: true, the gateway injects a kc_canary_<16hex> token into every system prompt before forwarding upstream. If the model echoes the token verbatim in its response (a sign of system-prompt extraction via prompt injection), the gateway:
- Emits an
ai_gateway.canary_leak audit log entry at critical severity.
- Sets
firewall_outcome = "warn" on the usage record.
- Fires any compliance-pack alert rules listening on
audit_action = 'ai_gateway.canary_leak'.
Compliance packs
Install compliance packs via the Packs API or from the Admin UI. Each pack installs:
- PII recognizers — entity-type patterns for the detector stack.
- Alert rules —
ai_gateway_event alerts that fire when specific audit actions appear.
Currently available packs: hipaa-safe-harbor, gdpr, pci-dss, soc2.
OpenTelemetry export
Set OTEL_EXPORTER_OTLP_ENDPOINT to enable per-request span export with GenAI semantic conventions:
| Attribute | Value |
|---|
gen_ai.system | Provider (e.g. anthropic) |
gen_ai.request.model | Requested model |
gen_ai.usage.input_tokens | Input token count |
gen_ai.usage.output_tokens | Output token count |
knoxcall.ai_gateway.firewall_outcome | pass / warn / block / tag |
knoxcall.ai_gateway.pii_tokens_swapped | Count of PII redactions |
knoxcall.ai_gateway.cost_usd | Estimated cost in USD |