# Rate Limits
Understanding usage tiers and rate limits.
LEAPERone uses a tier-based rate limiting system, similar to OpenAI's. Your tier determines the maximum requests per minute (RPM) for each endpoint.
## Usage Tiers
Your tier is determined by cumulative spend and account age. As you use the API more, you automatically qualify for higher tiers.
| Tier | Qualification | Chat RPM | Image RPM | Audio RPM | Video RPM |
|---|---|---|---|---|---|
| 0 (free) | Default | 5 | 2 | 2 | 3 |
| 1 (starter) | $5 spent | 20 | 5 | 5 | 10 |
| 2 (standard) | $50 spent + 7 days | 40 | 10 | 10 | 20 |
| 3 (pro) | $100 spent + 7 days | 60 | 20 | 10 | 30 |
| 4 (business) | $250 spent + 14 days | 100 | 30 | 20 | 50 |
| 5 (enterprise) | $1000 spent + 30 days | 200 | 60 | 30 | 100 |
"Days" refers to days since your first payment, not account creation date. Tier upgrades are automatic and never downgrade.
You can view your current tier and limits in your dashboard.
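The qualification rules in the table above can be expressed as a simple lookup. This is an illustrative sketch, not part of the LEAPERone SDK; the `tierFor` helper and its parameter names are ours, with thresholds taken directly from the table:

```javascript
// Tier thresholds from the table above, highest tier first so the
// first match is the highest tier the account qualifies for.
const TIERS = [
  { tier: 5, minSpend: 1000, minDays: 30 },
  { tier: 4, minSpend: 250, minDays: 14 },
  { tier: 3, minSpend: 100, minDays: 7 },
  { tier: 2, minSpend: 50, minDays: 7 },
  { tier: 1, minSpend: 5, minDays: 0 },
  { tier: 0, minSpend: 0, minDays: 0 },
];

// daysSinceFirstPayment, not days since account creation (see note above).
function tierFor(totalSpendUsd, daysSinceFirstPayment) {
  return TIERS.find(
    (t) => totalSpendUsd >= t.minSpend && daysSinceFirstPayment >= t.minDays
  ).tier;
}
```

For example, `tierFor(60, 3)` returns `1`, not `2`: the $50 spend threshold is met, but the 7-day requirement is not.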
## Custom API Key Limits
You can set a custom RPM limit on individual API keys that is lower than your tier's maximum. This is useful for controlling usage across different applications.
Set this in API Key settings by clicking the RPM badge on any key.
## Rate Limit Headers
Every API response includes headers describing your current rate limit status:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum number of requests allowed in the current window. |
| `X-RateLimit-Remaining` | Number of requests remaining in the current window. |
| `X-RateLimit-Reset` | Unix timestamp (seconds) when the current window resets. |
| `Retry-After` | Seconds until you can retry (only present on 429 responses). |
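As a sketch, the headers in the table above can be read off a `fetch()` `Response` like this (the `parseRateLimitHeaders` helper is illustrative, not an SDK function):

```javascript
// Parse the rate limit headers from a fetch() Response's Headers object.
// Header names are the documented X-RateLimit-* headers; missing headers
// produce NaN / Invalid Date, so callers should check before relying on them.
function parseRateLimitHeaders(headers) {
  return {
    limit: Number(headers.get("X-RateLimit-Limit")),
    remaining: Number(headers.get("X-RateLimit-Remaining")),
    // X-RateLimit-Reset is a Unix timestamp in seconds; Date wants ms.
    resetAt: new Date(Number(headers.get("X-RateLimit-Reset")) * 1000),
  };
}
```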
## Handling 429 Responses
When you exceed the rate limit, the API returns a 429 status:
```json
{
  "error": {
    "message": "Rate limit exceeded. Please slow down.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```

Use the `Retry-After` header to determine how long to wait before sending the next request.
## Exponential Backoff
The recommended strategy for handling rate limits is exponential backoff with jitter:
```javascript
async function fetchWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.ok) return response;
    if (response.status === 429) {
      // Prefer the server's Retry-After hint; fall back to capped
      // exponential backoff if the header is missing or unparsable.
      const retryAfter = Number(response.headers.get("Retry-After"));
      const waitMs =
        Number.isFinite(retryAfter) && retryAfter > 0
          ? retryAfter * 1000
          : Math.min(1000 * 2 ** attempt, 30000);
      const jitter = Math.random() * 1000;
      console.warn(
        `Rate limited. Retrying in ${Math.round((waitMs + jitter) / 1000)}s...`
      );
      await new Promise((r) => setTimeout(r, waitMs + jitter));
      continue;
    }
    // Non-retryable error
    throw new Error(`Request failed with status ${response.status}`);
  }
  throw new Error("Max retries exceeded");
}
```

## Best Practices
- Monitor the headers. Check `X-RateLimit-Remaining` proactively and slow down before hitting zero.
- Queue requests. If you are making many calls in a batch, use a queue to space them out evenly across the window.
- Use jitter. Always add random jitter to backoff delays to prevent synchronized retries from multiple clients.
- Cache responses. If you call the same endpoint with the same parameters, cache the result to avoid unnecessary requests.
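The "queue requests" advice above can be sketched as a minimal throttle that spaces calls evenly across the window (e.g. 60 RPM becomes one call per second). This is an assumption-laden example, not an SDK feature; `makeThrottledQueue` is a hypothetical helper:

```javascript
// Minimal client-side throttle: space calls evenly across the
// rate limit window. Illustrative only; not part of any SDK.
function makeThrottledQueue(rpm) {
  const intervalMs = 60000 / rpm;
  let lastScheduled = 0; // timestamp of the most recently scheduled call
  return async function enqueue(fn) {
    const now = Date.now();
    // Schedule this call at least intervalMs after the previous one.
    const startAt = Math.max(now, lastScheduled + intervalMs);
    lastScheduled = startAt;
    await new Promise((r) => setTimeout(r, startAt - now));
    return fn();
  };
}
```

Usage: `const enqueue = makeThrottledQueue(40); const res = await enqueue(() => fetch(url));`. Because `lastScheduled` is updated synchronously before the first `await`, concurrent `enqueue` calls are serialized into evenly spaced slots rather than all firing at once.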