Rate Limits
Limits by plan (Starter/Pro/Enterprise), rate limit headers, handling 429 responses.
NFYio enforces rate limits to ensure fair usage and system stability. Limits vary by plan. When a limit is exceeded, the API returns 429 Too Many Requests. Use the response headers to implement backoff and retry logic.
Limits by Plan
Starter
| Resource | Limit | Window |
|---|---|---|
| API requests | 1,000 requests | per minute |
| Storage API | 500 requests | per minute |
| Agent chat | 100 messages | per minute |
| Agent queries | 50 queries | per minute |
| Buckets | 10 buckets | total |
| Objects | 10,000 objects | total |
Pro
| Resource | Limit | Window |
|---|---|---|
| API requests | 10,000 requests | per minute |
| Storage API | 5,000 requests | per minute |
| Agent chat | 500 messages | per minute |
| Agent queries | 200 queries | per minute |
| Buckets | 100 buckets | total |
| Objects | 1,000,000 objects | total |
Enterprise
| Resource | Limit | Window |
|---|---|---|
| API requests | Custom (contact sales) | per minute |
| Storage API | Custom | per minute |
| Agent chat | Custom | per minute |
| Agent queries | Custom | per minute |
| Buckets | Unlimited | — |
| Objects | Unlimited | — |
Enterprise plans can configure custom limits per organization.
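The per-minute figures in the plan tables can be encoded as a small lookup for capacity checks. The following is a minimal sketch: `PLAN_LIMITS` and `smallestPlanFor` are hypothetical names, not part of any NFYio SDK, and the numbers are copied from the tables above. Enterprise limits are custom, so it is treated as the fallback.

```javascript
// Per-minute limits copied from the plan tables above.
// PLAN_LIMITS and smallestPlanFor are illustrative names, not an NFYio API.
const PLAN_LIMITS = {
  starter: { apiRequests: 1000, storageApi: 500, agentChat: 100, agentQueries: 50 },
  pro: { apiRequests: 10000, storageApi: 5000, agentChat: 500, agentQueries: 200 },
};

// Returns the smallest plan whose limits cover the expected per-minute load.
// Falls back to 'enterprise' (custom limits) when Pro is not enough.
function smallestPlanFor(expectedPerMinute) {
  for (const plan of ['starter', 'pro']) {
    const limits = PLAN_LIMITS[plan];
    const fits = Object.entries(expectedPerMinute).every(
      ([resource, load]) => load <= limits[resource]
    );
    if (fits) return plan;
  }
  return 'enterprise';
}

console.log(smallestPlanFor({ apiRequests: 800, agentChat: 150 })); // pro
```

Here 800 API requests per minute fits Starter, but 150 chat messages per minute exceeds its limit of 100, so the function moves on to Pro.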
Rate Limit Headers
Every API response includes headers indicating current usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retry (on 429) |
Example Response Headers
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1709304660
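Header values arrive as strings, so a client will usually convert them before using them. A minimal sketch, assuming only the header names listed above; `parseRateLimitHeaders` is a hypothetical helper, and a `Map` stands in for a real response's headers so the example runs on its own.

```javascript
// Reads the three X-RateLimit-* headers into numbers.
// `headers` is anything with a get(name) method, such as a fetch Response's headers.
function parseRateLimitHeaders(headers) {
  const toInt = (name) => {
    const value = parseInt(headers.get(name), 10);
    return Number.isNaN(value) ? null : value; // null when the header is absent
  };
  const reset = toInt('X-RateLimit-Reset');
  return {
    limit: toInt('X-RateLimit-Limit'),
    remaining: toInt('X-RateLimit-Remaining'),
    // X-RateLimit-Reset is a Unix timestamp in seconds; Date expects milliseconds
    resetAt: reset === null ? null : new Date(reset * 1000),
  };
}

// Stand-in for response.headers, using the example values above
const exampleHeaders = new Map([
  ['X-RateLimit-Limit', '1000'],
  ['X-RateLimit-Remaining', '847'],
  ['X-RateLimit-Reset', '1709304660'],
]);
const usage = parseRateLimitHeaders(exampleHeaders);
console.log(usage.limit, usage.remaining, usage.resetAt.toISOString());
```

With a real `fetch` response, pass `response.headers` directly; its `get` method matches header names case-insensitively, which the `Map` stand-in does not.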
When you hit the limit:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1709304720
Retry-After: 60
{
  "error": {
    "code": "RateLimitExceeded",
    "message": "Rate limit exceeded. Retry after 60 seconds.",
    "details": {
      "retry_after": 60,
      "limit": 1000,
      "window": "1m"
    }
  }
}
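The 429 response above carries the wait time in three places: the Retry-After header, `details.retry_after` in the body, and X-RateLimit-Reset. A minimal sketch of one way to combine them follows. The order of preference and the 60-second default are assumptions, not documented behavior, and `retryDelayMs` is a hypothetical helper.

```javascript
// Picks a wait time in milliseconds for a 429 response.
// Assumed order of preference: Retry-After header, details.retry_after in the body,
// then X-RateLimit-Reset. The 60-second default is also an assumption.
function retryDelayMs(headers, body, nowMs = Date.now()) {
  const headerSeconds = parseInt(headers.get('Retry-After'), 10);
  if (!Number.isNaN(headerSeconds)) return headerSeconds * 1000;

  const bodySeconds = body?.error?.details?.retry_after;
  if (typeof bodySeconds === 'number') return bodySeconds * 1000;

  const resetSeconds = parseInt(headers.get('X-RateLimit-Reset'), 10);
  if (!Number.isNaN(resetSeconds)) return Math.max(0, resetSeconds * 1000 - nowMs);

  return 60000;
}

// Stand-ins for response.headers and the parsed JSON body shown above
const limitedHeaders = new Map([['Retry-After', '60'], ['X-RateLimit-Reset', '1709304720']]);
const limitedBody = { error: { code: 'RateLimitExceeded', details: { retry_after: 60 } } };
console.log(retryDelayMs(limitedHeaders, limitedBody)); // 60000
```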
Handling 429 Responses
1. Respect Retry-After
The Retry-After header (or details.retry_after in the JSON body) tells you how long to wait:
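async function fetchWithRateLimitHandling(url, options, maxRetries = 3) {
  const response = await fetch(url, options);
  if (response.status === 429 && maxRetries > 0) {
    // Retry-After arrives as a string; fall back to 60 seconds if it is missing or not a number
    const parsed = parseInt(response.headers.get('Retry-After'), 10);
    const retryAfter = Number.isNaN(parsed) ? 60 : parsed;
    console.warn(`Rate limited. Retrying after ${retryAfter}s`);
    await new Promise((r) => setTimeout(r, retryAfter * 1000));
    return fetchWithRateLimitHandling(url, options, maxRetries - 1); // Retry with one fewer attempt left
  }
  return response; // May still be a 429 once retries are exhausted
}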
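When Retry-After is missing, or repeated retries keep returning 429, a common general-purpose fallback is exponential backoff with jitter. This is a widely used technique rather than something NFYio prescribes; the sketch below uses hypothetical helper names, and `baseMs` and `capMs` are illustrative defaults.

```javascript
// Exponential backoff with "full jitter": the delay ceiling doubles on each attempt,
// and the actual delay is a random value below that ceiling.
// baseMs and capMs are illustrative defaults, not values documented by NFYio.
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 60000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// A server-provided Retry-After (in seconds) takes priority over the computed delay.
function nextDelayMs(attempt, retryAfterSeconds) {
  if (Number.isFinite(retryAfterSeconds)) return retryAfterSeconds * 1000;
  return backoffDelayMs(attempt);
}

console.log(nextDelayMs(2, 60)); // 60000, because the server value wins
console.log(nextDelayMs(2, undefined)); // a random delay below 4000
```

The random component keeps many clients that were limited at the same moment from retrying in lockstep.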
2. Proactive Throttling
Track X-RateLimit-Remaining and slow down before hitting the limit:
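let remaining = 1000;
let resetAt = 0; // Unix timestamp (seconds) from the most recent response
async function throttledFetch(url, options) {
  if (remaining < 10) {
    // Nearly out of quota: wait until the current window resets
    const waitMs = (resetAt * 1000) - Date.now();
    await new Promise((r) => setTimeout(r, Math.max(0, waitMs)));
  }
  const response = await fetch(url, options);
  // parseInt returns NaN (not null) for a missing header, so check explicitly
  const nextRemaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const nextReset = parseInt(response.headers.get('X-RateLimit-Reset'), 10);
  if (!Number.isNaN(nextRemaining)) remaining = nextRemaining;
  if (!Number.isNaN(nextReset)) resetAt = nextReset;
  return response;
}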
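An alternative to pausing only near the limit is to pace every request so the remaining quota lasts until the window resets. A minimal sketch, assuming X-RateLimit-Reset is a Unix timestamp in seconds as described above; `pacingDelayMs` is a hypothetical helper.

```javascript
// Spreads the remaining quota evenly over the time left in the window.
// Returns the number of milliseconds to pause before the next request.
function pacingDelayMs(remaining, resetUnixSeconds, nowMs = Date.now()) {
  const msUntilReset = Math.max(0, resetUnixSeconds * 1000 - nowMs);
  if (remaining <= 0) return msUntilReset; // quota spent: wait for the reset
  return Math.floor(msUntilReset / remaining);
}

// 10 requests left and 10 seconds until the reset: one request per second
console.log(pacingDelayMs(10, 100, 90000)); // 1000
```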
3. Batch Requests
Reduce request count by batching where possible:
// Instead of 100 separate GETs
const objects = await Promise.all(
  keys.map((key) => s3.getObject({ Bucket, Key: key }))
);
// Use ListObjects with a prefix, or batch APIs if available.
// A listing returns keys and metadata in one request, not the object bodies.
const { Contents } = await s3.listObjectsV2({ Bucket, Prefix: 'folder/' });
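When requests cannot be merged into a single call, capping how many run at once keeps bursts from exhausting the window. This does not reduce the total request count, only the burst size. A minimal sketch with a hypothetical `mapInChunks` helper that is not part of any SDK:

```javascript
// Runs `task` over `items` in chunks so that at most `size` requests are in flight at once.
// mapInChunks is an illustrative helper, not part of any SDK.
async function mapInChunks(items, size, task) {
  const results = [];
  for (let i = 0; i < items.length; i += size) {
    const chunk = items.slice(i, i + size);
    // Wait for the whole chunk to finish before starting the next one
    results.push(...(await Promise.all(chunk.map(task))));
  }
  return results;
}

// Demo with a stand-in task; results keep the order of the input
mapInChunks([1, 2, 3, 4, 5], 2, async (n) => n * 2).then((doubled) => {
  console.log(doubled); // [ 2, 4, 6, 8, 10 ]
});
```

In the example above, the task could be `(key) => s3.getObject({ Bucket, Key: key })` with a chunk size chosen to suit the plan's Storage API limit.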
Limits by Endpoint Type
Some endpoints have stricter limits:
| Endpoint Type | Starter | Pro | Enterprise |
|---|---|---|---|
| Auth (login/token) | 20/min | 100/min | Custom |
| Embedding trigger | 5/min | 20/min | Custom |
| VPC operations | 50/min | 200/min | Custom |
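For these stricter endpoints, a client can derive the minimum spacing between calls from the per-minute limit. A minimal sketch: `ENDPOINT_LIMITS` and `minIntervalMs` are hypothetical names, the numbers are copied from the table above, and Enterprise is left out because its limits are custom.

```javascript
// Per-minute limits copied from the endpoint table above; Enterprise is custom.
// ENDPOINT_LIMITS and minIntervalMs are illustrative names, not an NFYio API.
const ENDPOINT_LIMITS = {
  auth: { starter: 20, pro: 100 },
  embeddingTrigger: { starter: 5, pro: 20 },
  vpc: { starter: 50, pro: 200 },
};

// Minimum spacing between calls that keeps a steady stream under the limit.
// Returns null when no fixed limit is known (for example, Enterprise).
function minIntervalMs(endpointType, plan) {
  const perMinute = ENDPOINT_LIMITS[endpointType]?.[plan];
  return perMinute ? Math.ceil(60000 / perMinute) : null;
}

console.log(minIntervalMs('embeddingTrigger', 'starter')); // 12000
console.log(minIntervalMs('auth', 'pro')); // 600
```

On Starter, for instance, embedding triggers spaced at least 12 seconds apart stay within the limit of 5 per minute.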
Best Practices
- Cache responses: Reduce redundant API calls with client-side caching
- Use webhooks: Prefer webhooks over polling where supported
- Implement backoff: Always respect Retry-After on 429
- Monitor usage: Track X-RateLimit-Remaining in logs or dashboards
- Upgrade if needed: Pro/Enterprise plans offer higher limits
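As an illustration of the client-side caching practice listed above, here is a minimal sketch of an in-memory cache with a time-to-live. `createTtlCache` is a hypothetical helper, and a suitable TTL depends on how fresh the data needs to be.

```javascript
// A tiny in-memory cache with a time-to-live.
// createTtlCache is an illustrative helper; `now` is injectable to make it testable.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    // Returns the cached value, or calls load() and stores the result
    async get(key, load) {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > now()) return hit.value;
      const value = await load();
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
  };
}

// Demo with a stand-in loader: the second call is served from the cache
const cache = createTtlCache(30000);
let loads = 0;
const load = async () => { loads += 1; return { id: 1 }; };
cache.get('item-1', load)
  .then(() => cache.get('item-1', load))
  .then(() => console.log(loads)); // 1
```

In practice the loader would wrap the API call, for example `() => fetch(url).then((r) => r.json())`, so repeated reads within the TTL do not count against the rate limit.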
Next Steps
- Error Handling — Retry strategies and error codes
- API Authentication — Auth and scopes
- Storage API Reference — Optimizing storage requests