LEAPERone Docs

Rate Limits

Understanding usage tiers and rate limits.

LEAPERone uses a tier-based rate limiting system, similar to OpenAI. Your tier determines the maximum requests per minute (RPM) for each endpoint.

Usage Tiers

Your tier is determined by cumulative spend and account age. As you use the API more, you automatically qualify for higher tiers.

| Tier | Qualification | Chat RPM | Image RPM | Audio RPM | Video RPM |
|------|---------------|----------|-----------|-----------|-----------|
| 0 (free) | Default | 5 | 2 | 2 | 3 |
| 1 (starter) | $5 spent | 20 | 5 | 5 | 10 |
| 2 (standard) | $50 spent + 7 days | 40 | 10 | 10 | 20 |
| 3 (pro) | $100 spent + 7 days | 60 | 20 | 10 | 30 |
| 4 (business) | $250 spent + 14 days | 100 | 30 | 20 | 50 |
| 5 (enterprise) | $1000 spent + 30 days | 200 | 60 | 30 | 100 |

"Days" refers to days since your first payment, not account creation date. Tier upgrades are automatic and never downgrade.

You can view your current tier and limits in your dashboard.
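As a rough sketch, the qualification rules in the table above can be expressed as a lookup. This is illustrative only: `tierFor` is our own name, LEAPERone evaluates tiers server-side, and this sketch does not model the never-downgrade guarantee.

```javascript
// Returns the highest tier whose spend and age requirements are met.
// Thresholds are copied from the tier table above.
function tierFor(spendUsd, daysSinceFirstPayment) {
  if (spendUsd >= 1000 && daysSinceFirstPayment >= 30) return 5;
  if (spendUsd >= 250 && daysSinceFirstPayment >= 14) return 4;
  if (spendUsd >= 100 && daysSinceFirstPayment >= 7) return 3;
  if (spendUsd >= 50 && daysSinceFirstPayment >= 7) return 2;
  if (spendUsd >= 5) return 1;
  return 0;
}
```

For example, $60 of spend after 10 days qualifies for tier 2 but not tier 3, which requires $100.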

Custom API Key Limits

You can set a custom RPM limit on individual API keys that is lower than your tier's maximum. This is useful for controlling usage across different applications.

Set this in API Key settings by clicking the RPM badge on any key.

Rate Limit Headers

API responses include the following rate-limit headers so you can track your remaining budget; `Retry-After` is added only when a request is rejected with a 429:

| Header | Description |
|--------|-------------|
| `X-RateLimit-Limit` | Maximum number of requests allowed in the current window. |
| `X-RateLimit-Remaining` | Number of requests remaining in the current window. |
| `X-RateLimit-Reset` | Unix timestamp (seconds) when the current window resets. |
| `Retry-After` | Seconds until you can retry (only present on 429 responses). |
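A small helper can pull these headers into a plain object. The header names come from the table above; the shape of the returned object (and the name `readRateLimit`) is our own choice, not part of the API.

```javascript
// Extracts rate-limit state from a fetch Response's headers.
// Missing headers come back as null.
function readRateLimit(headers) {
  const num = (name) => {
    const v = headers.get(name);
    return v === null ? null : Number(v);
  };
  const retryAfter = num("Retry-After");
  return {
    limit: num("X-RateLimit-Limit"),
    remaining: num("X-RateLimit-Remaining"),
    resetAt: num("X-RateLimit-Reset"), // Unix seconds
    retryAfterMs: retryAfter === null ? null : retryAfter * 1000,
  };
}
```

Pass it `response.headers` from any fetch call, e.g. `readRateLimit(response.headers).remaining`.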

Handling 429 Responses

When you exceed the rate limit, the API returns a 429 status:

429 Response

```json
{
  "error": {
    "message": "Rate limit exceeded. Please slow down.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```

Use the Retry-After header to determine how long to wait before sending the next request.

Exponential Backoff

The recommended strategy for handling rate limits is exponential backoff with jitter:

Rate-limit-aware fetch

```javascript
async function fetchWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.ok) return response;

    if (response.status === 429) {
      // Prefer the server-provided Retry-After; otherwise back off
      // exponentially, capped at 30 seconds.
      const retryAfter = response.headers.get("Retry-After");
      const waitMs = retryAfter
        ? parseInt(retryAfter, 10) * 1000
        : Math.min(1000 * 2 ** attempt, 30000);
      // Random jitter prevents synchronized retries across clients.
      const jitter = Math.random() * 1000;

      console.warn(
        `Rate limited. Retrying in ${Math.round((waitMs + jitter) / 1000)}s...`
      );
      await new Promise((r) => setTimeout(r, waitMs + jitter));
      continue;
    }

    // Non-retryable error
    throw new Error(`Request failed with status ${response.status}`);
  }

  throw new Error("Max retries exceeded");
}
```

Best Practices

  • Monitor the headers. Check X-RateLimit-Remaining proactively and slow down before hitting zero.
  • Queue requests. If you are making many calls in a batch, use a queue to space them out evenly across the window.
  • Use jitter. Always add random jitter to backoff delays to prevent synchronized retries from multiple clients.
  • Cache responses. If you call the same endpoint with the same parameters, cache the result to avoid unnecessary requests.
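The "queue requests" practice can be sketched as a simple throttle that spaces calls evenly so a batch never exceeds a given RPM. The names `createThrottle` and `run` are our own; pass in whatever RPM your tier (or custom key limit) allows.

```javascript
// Returns a wrapper that runs tasks no closer together than 60000/rpm ms.
function createThrottle(rpm) {
  const minIntervalMs = 60000 / rpm;
  let nextSlot = 0; // earliest time (ms epoch) the next task may start
  return async function run(task) {
    const now = Date.now();
    const start = Math.max(now, nextSlot);
    nextSlot = start + minIntervalMs;
    await new Promise((r) => setTimeout(r, start - now));
    return task();
  };
}
```

Usage: `const throttled = createThrottle(40); await Promise.all(items.map((i) => throttled(() => callApi(i))));` — the batch completes in order without bursting past 40 RPM.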