Rate Limiting

Protect your backend APIs from abuse, excessive load, and DDoS attacks with KnoxCall’s intelligent rate limiting system.

What is Rate Limiting?

Rate limiting controls how many requests a client can make within a specific time window. It prevents:

❌ API abuse - Malicious users making excessive requests
❌ Accidental overload - Buggy code creating infinite loops
❌ DDoS attacks - Distributed denial of service attempts
❌ Cost overruns - Preventing excessive API usage costs

How Rate Limiting Works

Client makes request
    ↓
Check request count for this client
    ↓
Within limit? → ✅ Forward request → Increment counter
    ↓
Exceeded limit? → ❌ Return 429 Too Many Requests

Rate limit counters reset based on your configured window:

Per minute: Resets every 60 seconds
Per hour: Resets every hour
Per day: Resets at midnight UTC

Configuration Levels

KnoxCall supports rate limiting at multiple levels:

1. Route-Level Limits

Apply to all clients using a route:

Route: stripe-webhooks
Limit: 10,000 requests/hour

ALL clients combined cannot exceed 10k requests/hour

Use case: Protect your backend from total overload

2. Client-Level Limits

Apply to individual clients:

Client: mobile-app-ios
Limit: 1,000 requests/hour

This specific client cannot exceed 1k requests/hour

Use case: Fair usage per client/application

3. Method-Specific Limits

Different limits per HTTP method:

GET requests: 5,000/hour
POST requests: 1,000/hour
DELETE requests: 100/hour

Use case: Restrict write operations more than reads

Setting Up Rate Limits

Route-Level Rate Limiting

Navigate to Routes → Select your route
Scroll to Rate Limiting section
Toggle Enable Rate Limiting to ON
Configure limits:

Request Limit:

Time Window:

hour

Options: minute, hour, day Burst Allowance (Optional):

Allows temporary spikes above the base limit.

Click Save

Client-Level Rate Limiting

Navigate to Clients → Select your client
Scroll to Rate Limiting section
Configure limits:

Per-Client Limit:

500 requests/hour

Applies to: All routes this client can access

Method-Specific Rate Limiting

Edit your route
Go to Method Configurations tab
For each HTTP method, set individual limits:

GET:
  Rate Limit: 5000/hour

POST:
  Rate Limit: 1000/hour

DELETE:
  Rate Limit: 100/hour

Rate Limit Response

When a client exceeds the limit, they receive: HTTP Status:

429 Too Many Requests

Response Headers:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640000000
Retry-After: 3600

Response Body:

{
  "error": "Rate limit exceeded",
  "message": "You have exceeded your rate limit of 1000 requests per hour",
  "limit": 1000,
  "remaining": 0,
  "reset_at": "2025-01-20T15:00:00Z",
  "retry_after_seconds": 3600
}

Checking Rate Limit Status

Clients can check their current status via response headers on every request:

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1640000000

Headers:

X-RateLimit-Limit: Total requests allowed in window
X-RateLimit-Remaining: Requests remaining in current window
X-RateLimit-Reset: Unix timestamp when limit resets

Burst Protection

Handle temporary traffic spikes without blocking legitimate users: Configuration:

Base Limit: 1,000 requests/hour
Burst Allowance: 200 requests

How it works:

Client can make up to 1,200 requests in a short burst
After burst, limited to 1,000 requests/hour average
Prevents legitimate spikes from being blocked

Example:

Minute 1: 200 requests ✅ (burst)
Minute 2: 200 requests ✅ (burst)
Minute 3: 200 requests ✅ (burst)
Minute 4: 200 requests ✅ (burst)
Minute 5: 200 requests ✅ (burst)
Minute 6: 200 requests ❌ (burst depleted)
Remaining hour: ~17 requests/minute average

Advanced Strategies

Per-User Rate Limiting

Use different limits based on user tiers:

Free Tier Client:
  Limit: 100 requests/hour

Pro Tier Client:
  Limit: 1,000 requests/hour

Enterprise Client:
  Limit: 10,000 requests/hour

Create separate clients for each tier.

Geographic Rate Limiting

Combine with IP whitelisting:

US Region: 5,000 requests/hour
EU Region: 3,000 requests/hour
APAC Region: 2,000 requests/hour

Create region-specific clients with different limits.

Time-Based Rate Limiting

Different limits for peak vs off-peak: Peak Hours (9 AM - 5 PM):

Limit: 500 requests/hour

Off-Peak:

Limit: 2,000 requests/hour

This requires creating separate routes or using API-based dynamic configuration.

Rate Limit Monitoring

View Rate Limit Events

Navigate to Logs → API Logs
Filter by status code: 429
See which clients are hitting limits

Set Up Alerts

Get notified when clients hit rate limits:

Navigate to Alerts → Add Alert
Select Rate Limit Exceeded
Configure:

Alert Type: Rate Limit Exceeded
Threshold: 10 violations/hour
Channels: Email, Slack

Analytics Dashboard

Monitor rate limit metrics:

Hit rate: % of requests that are rate-limited
Top offenders: Clients hitting limits most often
Trend analysis: Rate limit violations over time

Best Practices

1. Start Conservative

Begin with strict limits and relax based on usage:

Initial: 100 requests/hour
After monitoring: 500 requests/hour
Production stable: 1,000 requests/hour

2. Use Tiered Limits

Different limits for different client types:

Public API: 100/hour
Partner API: 1,000/hour
Internal Services: 10,000/hour

3. Enable Burst Protection

Allow temporary spikes:

Base: 1,000/hour
Burst: +20% (1,200 total)

4. Monitor and Adjust

Check rate limit logs weekly
Adjust limits based on legitimate usage
Set alerts for unusual patterns

5. Communicate Limits

Document your rate limits for API consumers:

## Rate Limits

- **Free Tier**: 100 requests/hour
- **Pro Tier**: 1,000 requests/hour
- **Enterprise**: Custom limits

Headers included in every response.

Common Configurations

Webhook Endpoint

Limit: 10,000 requests/hour
Burst: 500 requests
Reason: Webhooks can spike during events

Public API

Limit: 100 requests/hour per API key
Burst: 20 requests
Reason: Prevent abuse of public endpoints

Internal Microservices

Limit: 50,000 requests/hour
Burst: 5,000 requests
Reason: High-traffic internal communication

Payment Processing

POST /payments: 10 requests/minute
GET /payments: 100 requests/minute
Reason: Prevent duplicate payment charges

Handling Rate Limits (Client-Side)

Exponential Backoff

When receiving 429, implement retry logic:

async function makeRequestWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url);

    if (response.status === 429) {
      const retryAfter = response.headers.get('Retry-After');
      const waitTime = retryAfter ? parseInt(retryAfter) * 1000 : Math.pow(2, i) * 1000;

      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

Check Headers Proactively

const response = await fetch(url);

const limit = response.headers.get('X-RateLimit-Limit');
const remaining = response.headers.get('X-RateLimit-Remaining');
const reset = response.headers.get('X-RateLimit-Reset');

if (remaining < 10) {
  console.warn(`Low on rate limit: ${remaining}/${limit} remaining`);
}

Request Queuing

Prevent hitting limits by queuing requests:

class RateLimitedQueue {
  constructor(maxRequestsPerHour) {
    this.maxRequests = maxRequestsPerHour;
    this.queue = [];
    this.requestTimestamps = [];
  }

  async enqueue(requestFn) {
    // Remove timestamps older than 1 hour
    const oneHourAgo = Date.now() - 3600000;
    this.requestTimestamps = this.requestTimestamps.filter(t => t > oneHourAgo);

    // Wait if at limit
    while (this.requestTimestamps.length >= this.maxRequests) {
      const oldestRequest = this.requestTimestamps[0];
      const waitTime = oldestRequest + 3600000 - Date.now();
      await new Promise(resolve => setTimeout(resolve, waitTime));
      this.requestTimestamps.shift();
    }

    this.requestTimestamps.push(Date.now());
    return await requestFn();
  }
}

Troubleshooting

High False Positive Rate

Problem: Legitimate users hitting limits Solutions:

Increase burst allowance
Raise base limits
Use per-user instead of per-IP limits

DDoS Still Getting Through

Problem: Rate limits not preventing attacks Solutions:

Lower limits for unknown clients
Enable request signing
Use IP-based blocking
Contact support for enterprise DDoS protection

Inconsistent Limit Enforcement

Problem: Some requests bypass rate limits Check:

Rate limits enabled on all routes
No conflicting client configurations
Limits applied at correct level (route vs client)

Next Steps

Request Signing

Add cryptographic signatures for extra security

Alerts

Get notified of rate limit violations

Analytics

Monitor rate limit metrics

Client Management

Set up per-client limits

📊 Statistics

Level: intermediate
Time: 15 minutes

🏷️ Tags

rate-limiting, security, ddos, api-protection

Getting Started

Essentials: Routes

Essentials: Clients

Essentials: Secrets

Essentials: Environments

Webhooks

Workflows

Security

Monitoring

Advanced Features

Infrastructure

Account

Troubleshooting

Support

About

​Rate Limiting

​What is Rate Limiting?

​How Rate Limiting Works

​Configuration Levels

​1. Route-Level Limits

​2. Client-Level Limits

​3. Method-Specific Limits

​Setting Up Rate Limits

​Route-Level Rate Limiting

​Client-Level Rate Limiting

​Method-Specific Rate Limiting

​Rate Limit Response

​Checking Rate Limit Status

​Burst Protection

​Advanced Strategies

​Per-User Rate Limiting

​Geographic Rate Limiting

​Time-Based Rate Limiting

​Rate Limit Monitoring

​View Rate Limit Events

​Set Up Alerts

​Analytics Dashboard

​Best Practices

​1. Start Conservative

​2. Use Tiered Limits

​3. Enable Burst Protection

​4. Monitor and Adjust

​5. Communicate Limits

​Common Configurations

​Webhook Endpoint

​Public API

​Internal Microservices

​Payment Processing

​Handling Rate Limits (Client-Side)

​Exponential Backoff

​Check Headers Proactively

​Request Queuing

​Troubleshooting

​High False Positive Rate

​DDoS Still Getting Through

​Inconsistent Limit Enforcement

​Next Steps

Request Signing

Alerts

Analytics

Client Management

📊 Statistics

🏷️ Tags

Rate Limiting

What is Rate Limiting?

How Rate Limiting Works

Configuration Levels

1. Route-Level Limits

2. Client-Level Limits

3. Method-Specific Limits

Setting Up Rate Limits

Route-Level Rate Limiting

Client-Level Rate Limiting

Method-Specific Rate Limiting

Rate Limit Response

Checking Rate Limit Status

Burst Protection

Advanced Strategies

Per-User Rate Limiting

Geographic Rate Limiting

Time-Based Rate Limiting

Rate Limit Monitoring

View Rate Limit Events

Set Up Alerts

Analytics Dashboard

Best Practices

1. Start Conservative

2. Use Tiered Limits

3. Enable Burst Protection

4. Monitor and Adjust

5. Communicate Limits

Common Configurations

Webhook Endpoint

Public API

Internal Microservices

Payment Processing

Handling Rate Limits (Client-Side)

Exponential Backoff

Check Headers Proactively

Request Queuing

Troubleshooting

High False Positive Rate

DDoS Still Getting Through

Inconsistent Limit Enforcement

Next Steps