
Rate Limiting

Protect your backend APIs from abuse, excessive load, and DDoS attacks with KnoxCall’s intelligent rate limiting system.

What is Rate Limiting?

Rate limiting controls how many requests a client can make within a specific time window. It prevents:
  • API abuse - Malicious users making excessive requests
  • Accidental overload - Buggy code creating infinite loops
  • DDoS attacks - Distributed denial of service attempts
  • Cost overruns - Excessive API usage costs

How Rate Limiting Works

Client makes request

Check request count for this client

Within limit? → ✅ Forward request → Increment counter

Exceeded limit? → ❌ Return 429 Too Many Requests

Rate limit counters reset based on your configured window:
  • Per minute: Resets every 60 seconds
  • Per hour: Resets every hour
  • Per day: Resets at midnight UTC
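
Conceptually, a fixed-window counter like the one below implements this flow. This is an illustrative sketch (in-memory, per-client), not KnoxCall's actual implementation; the limit and window values are examples.

// Fixed-window counter sketch (illustrative only, not KnoxCall internals)
const WINDOW_MS = 60 * 1000; // per-minute window
const LIMIT = 100;           // allowed requests per window

const counters = new Map();  // clientId -> { windowStart, count }

function checkRateLimit(clientId, now = Date.now()) {
  const entry = counters.get(clientId);

  // Start a fresh window if none exists or the previous one has expired
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    counters.set(clientId, { windowStart: now, count: 1 });
    return { allowed: true, remaining: LIMIT - 1 };
  }

  // Within the limit: count the request and forward it
  if (entry.count < LIMIT) {
    entry.count += 1;
    return { allowed: true, remaining: LIMIT - entry.count };
  }

  // Limit exceeded: respond with 429 Too Many Requests
  return { allowed: false, remaining: 0 };
}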

Configuration Levels

KnoxCall supports rate limiting at multiple levels:

1. Route-Level Limits

Apply to all clients using a route:
Route: stripe-webhooks
Limit: 10,000 requests/hour

ALL clients combined cannot exceed 10k requests/hour
Use case: Protect your backend from total overload

2. Client-Level Limits

Apply to individual clients:
Client: mobile-app-ios
Limit: 1,000 requests/hour

This specific client cannot exceed 1k requests/hour
Use case: Fair usage per client/application

3. Method-Specific Limits

Different limits per HTTP method:
GET requests: 5,000/hour
POST requests: 1,000/hour
DELETE requests: 100/hour
Use case: Restrict write operations more than reads

Setting Up Rate Limits

Route-Level Rate Limiting

  1. Navigate to Routes → Select your route
  2. Scroll to Rate Limiting section
  3. Toggle Enable Rate Limiting to ON
  4. Configure limits:
Request Limit: 1000
Time Window: hour (options: minute, hour, day)
Burst Allowance (optional): 100
Allows temporary spikes above the base limit.
  5. Click Save

Client-Level Rate Limiting

  1. Navigate to Clients → Select your client
  2. Scroll to Rate Limiting section
  3. Configure limits:
Per-Client Limit: 500 requests/hour
Applies to: All routes this client can access

Method-Specific Rate Limiting

  1. Edit your route
  2. Go to Method Configurations tab
  3. For each HTTP method, set individual limits:
GET:
  Rate Limit: 5000/hour

POST:
  Rate Limit: 1000/hour

DELETE:
  Rate Limit: 100/hour

Rate Limit Response

When a client exceeds the limit, they receive the following response.

HTTP Status: 429 Too Many Requests

Response Headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640000000
Retry-After: 3600
Response Body:
{
  "error": "Rate limit exceeded",
  "message": "You have exceeded your rate limit of 1000 requests per hour",
  "limit": 1000,
  "remaining": 0,
  "reset_at": "2025-01-20T15:00:00Z",
  "retry_after_seconds": 3600
}
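
A client can use this body to decide how long to wait before retrying. A minimal sketch (assuming a fetch-based client; error handling omitted):

const response = await fetch(url);

if (response.status === 429) {
  const body = await response.json();
  // Wait for the server-suggested interval before retrying (fall back to 60s)
  const waitMs = (body.retry_after_seconds ?? 60) * 1000;
  console.warn(`${body.message} Retrying in ${waitMs / 1000}s...`);
  await new Promise(resolve => setTimeout(resolve, waitMs));
}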

Checking Rate Limit Status

Clients can check their current status via response headers on every request:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1640000000
Headers:
  • X-RateLimit-Limit: Total requests allowed in window
  • X-RateLimit-Remaining: Requests remaining in current window
  • X-RateLimit-Reset: Unix timestamp when limit resets

Burst Protection

Handle temporary traffic spikes without blocking legitimate users.

Configuration:
Base Limit: 1,000 requests/hour
Burst Allowance: 200 requests
How it works:
  • Client can make up to 1,200 requests in a short burst
  • After burst, limited to 1,000 requests/hour average
  • Prevents legitimate spikes from being blocked
Example:
Minute 1: 200 requests ✅
Minute 2: 200 requests ✅
Minute 3: 200 requests ✅
Minute 4: 200 requests ✅
Minute 5: 200 requests ✅ (base limit of 1,000 reached)
Minute 6: 200 requests ✅ (burst allowance used, 1,200 total)
Minute 7: further requests ❌ (allowance exhausted until the window resets)
Sustained rate afterwards: ~17 requests/minute average (1,000/hour)
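
The exact burst algorithm isn't specified here; a common way to model "base limit plus burst" is a token bucket, sketched below. The constants mirror the example above and are illustrative.

// Token-bucket sketch: 1,000/hour base rate plus a 200-request burst allowance
const BASE_PER_HOUR = 1000;
const BURST = 200;
const REFILL_PER_MS = BASE_PER_HOUR / 3600000; // refill at the base rate

let tokens = BASE_PER_HOUR + BURST; // bucket starts full (1,200 tokens)
let lastRefill = Date.now();

function allowRequest(now = Date.now()) {
  // Refill tokens at the base rate, never exceeding base + burst
  tokens = Math.min(BASE_PER_HOUR + BURST, tokens + (now - lastRefill) * REFILL_PER_MS);
  lastRefill = now;

  if (tokens >= 1) {
    tokens -= 1;  // spend one token for this request
    return true;  // forward the request
  }
  return false;   // reject with 429 Too Many Requests
}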

Advanced Strategies

Per-User Rate Limiting

Use different limits based on user tiers:
Free Tier Client:
  Limit: 100 requests/hour

Pro Tier Client:
  Limit: 1,000 requests/hour

Enterprise Client:
  Limit: 10,000 requests/hour
Create separate clients for each tier.

Geographic Rate Limiting

Combine with IP whitelisting:
US Region: 5,000 requests/hour
EU Region: 3,000 requests/hour
APAC Region: 2,000 requests/hour
Create region-specific clients with different limits.

Time-Based Rate Limiting

Different limits for peak vs. off-peak hours:

Peak Hours (9 AM - 5 PM):
  • Limit: 500 requests/hour
Off-Peak:
  • Limit: 2,000 requests/hour
This requires creating separate routes or using API-based dynamic configuration.
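
A sketch of the dynamic-configuration approach is below. The endpoint URL, auth header, and payload shape are hypothetical placeholders, not documented KnoxCall API calls; consult the API reference for the real ones.

// Hypothetical sketch: switch a route's limit between peak and off-peak values.
// The endpoint and payload below are assumptions for illustration only.
async function applyTimeBasedLimit(routeId, apiToken) {
  const hour = new Date().getUTCHours();
  const isPeak = hour >= 9 && hour < 17;   // treat 9 AM - 5 PM (UTC) as peak
  const limit = isPeak ? 500 : 2000;       // per-hour limits from the example above

  await fetch(`https://api.knoxcall.example/routes/${routeId}/rate-limit`, {
    method: 'PATCH',
    headers: {
      'Authorization': `Bearer ${apiToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ limit, window: 'hour' }),
  });
}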

Rate Limit Monitoring

View Rate Limit Events

  1. Navigate to Logs → API Logs
  2. Filter by status code: 429
  3. See which clients are hitting limits

Set Up Alerts

Get notified when clients hit rate limits:
  1. Navigate to Alerts → Add Alert
  2. Select Rate Limit Exceeded
  3. Configure:
Alert Type: Rate Limit Exceeded
Threshold: 10 violations/hour
Channels: Email, Slack

Analytics Dashboard

Monitor rate limit metrics:
  • Hit rate: % of requests that are rate-limited
  • Top offenders: Clients hitting limits most often
  • Trend analysis: Rate limit violations over time

Best Practices

1. Start Conservative

Begin with strict limits and relax based on usage:
Initial: 100 requests/hour
After monitoring: 500 requests/hour
Production stable: 1,000 requests/hour

2. Use Tiered Limits

Different limits for different client types:
Public API: 100/hour
Partner API: 1,000/hour
Internal Services: 10,000/hour

3. Enable Burst Protection

Allow temporary spikes:
Base: 1,000/hour
Burst: +20% (1,200 total)

4. Monitor and Adjust

  • Check rate limit logs weekly
  • Adjust limits based on legitimate usage
  • Set alerts for unusual patterns

5. Communicate Limits

Document your rate limits for API consumers:
## Rate Limits

- **Free Tier**: 100 requests/hour
- **Pro Tier**: 1,000 requests/hour
- **Enterprise**: Custom limits

Rate limit headers are included in every response.

Common Configurations

Webhook Endpoint

Limit: 10,000 requests/hour
Burst: 500 requests
Reason: Webhooks can spike during events

Public API

Limit: 100 requests/hour per API key
Burst: 20 requests
Reason: Prevent abuse of public endpoints

Internal Microservices

Limit: 50,000 requests/hour
Burst: 5,000 requests
Reason: High-traffic internal communication

Payment Processing

POST /payments: 10 requests/minute
GET /payments: 100 requests/minute
Reason: Prevent duplicate payment charges

Handling Rate Limits (Client-Side)

Exponential Backoff

When receiving 429, implement retry logic:
async function makeRequestWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url);

    if (response.status === 429) {
      // Prefer the server-provided Retry-After (seconds); otherwise back off exponentially
      const retryAfter = response.headers.get('Retry-After');
      const waitTime = retryAfter ? parseInt(retryAfter, 10) * 1000 : Math.pow(2, i) * 1000;

      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

Check Headers Proactively

const response = await fetch(url);

// Convert header values (strings) to numbers before comparing
const limit = Number(response.headers.get('X-RateLimit-Limit'));
const remaining = Number(response.headers.get('X-RateLimit-Remaining'));
const reset = Number(response.headers.get('X-RateLimit-Reset')); // Unix timestamp

if (remaining < 10) {
  console.warn(`Low on rate limit: ${remaining}/${limit} remaining`);
}

Request Queuing

Prevent hitting limits by queuing requests:
class RateLimitedQueue {
  constructor(maxRequestsPerHour) {
    this.maxRequests = maxRequestsPerHour;
    this.requestTimestamps = []; // timestamps of requests made in the last hour
  }

  async enqueue(requestFn) {
    // Drop timestamps older than 1 hour
    const oneHourAgo = Date.now() - 3600000;
    this.requestTimestamps = this.requestTimestamps.filter(t => t > oneHourAgo);

    // If at the limit, wait until the oldest request leaves the window
    while (this.requestTimestamps.length >= this.maxRequests) {
      const oldestRequest = this.requestTimestamps[0];
      const waitTime = Math.max(0, oldestRequest + 3600000 - Date.now());
      await new Promise(resolve => setTimeout(resolve, waitTime));
      this.requestTimestamps.shift();
    }

    // Record this request and execute it
    this.requestTimestamps.push(Date.now());
    return await requestFn();
  }
}
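
Usage might look like this, capping outgoing traffic at 1,000 requests/hour:

const queue = new RateLimitedQueue(1000);

const response = await queue.enqueue(() => fetch('https://api.example.com/data'));
console.log(response.status);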

Troubleshooting

High False Positive Rate

Problem: Legitimate users hitting limits

Solutions:
  • Increase burst allowance
  • Raise base limits
  • Use per-user instead of per-IP limits

DDoS Still Getting Through

Problem: Rate limits not preventing attacks

Solutions:
  • Lower limits for unknown clients
  • Enable request signing
  • Use IP-based blocking
  • Contact support for enterprise DDoS protection

Inconsistent Limit Enforcement

Problem: Some requests bypass rate limits

Check:
  • Rate limits enabled on all routes
  • No conflicting client configurations
  • Limits applied at correct level (route vs client)

📊 Statistics

  • Level: intermediate
  • Time: 15 minutes

🏷️ Tags

rate-limiting, security, ddos, api-protection