Types of rate limits
There are three independent rate limits, all enforced on generation requests only:

| Limit | Applies to | Window | Description |
|---|---|---|---|
| Image RPM | POST /v2/images/generations | 60 seconds | Maximum image generation submissions per minute |
| Video RPM | POST /v2/videos/generations | 60 seconds | Maximum video generation submissions per minute |
| Concurrent requests | Both generation endpoints | Rolling | Maximum in-flight generation requests at any time |
Your current rate limits are shown in the developer dashboard. Limits vary by account setup.
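To stay under the per-minute limits without waiting for a 429, you can throttle submissions on the client. The following is a minimal sketch, not an official client: the `MinuteWindow` class is hypothetical, and the limit of 60 in the usage note is an assumed value that you would replace with the number shown in your dashboard. It keeps a log of recent submission times, which is conservative whether the server uses a fixed or a sliding window.

```python
import time
from collections import deque


class MinuteWindow:
    """Client-side tracker for one per-minute limit (hypothetical helper)."""

    def __init__(self, max_per_window, window_seconds=60.0, clock=time.monotonic):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of recent submissions, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before another submission fits in the window."""
        now = self._clock()
        # Forget submissions that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_per_window:
            return 0.0
        # The window is full: wait until the oldest submission expires.
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every generation request you submit."""
        self._sent.append(self._clock())
```

Because the image and video limits are independent, you would keep one tracker per endpoint, for example `image_window = MinuteWindow(60)` and `video_window = MinuteWindow(60)`, sleeping for `wait_time()` seconds before each submission and calling `record()` after it.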
How this affects image and video generation
- Image generations count when you submit the generation.
- Video generations count when you submit the generation.
- Polling GET /v2/videos/generations/{id} is a read operation. It does not create a new generation.
- Pending video generations can still count against your concurrent limit until they finish.
Rate limit headers
Every generation response includes headers so you can track your usage:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp in seconds when the window resets |
When you exceed a limit, the API returns HTTP 429 with an additional header:

| Header | Description |
|---|---|
| Retry-After | Seconds to wait before retrying |
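The headers above can be read into a small structure for logging or pacing. This is an illustrative sketch only: `RateLimitStatus`, `parse_rate_limit_headers`, and `seconds_until_reset` are hypothetical names, and the code assumes the header values are plain numbers as described in the tables. Missing or malformed headers are returned as `None` instead of raising.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: Optional[int]          # X-RateLimit-Limit
    remaining: Optional[int]      # X-RateLimit-Remaining
    reset_at: Optional[int]       # X-RateLimit-Reset, Unix seconds
    retry_after: Optional[float]  # Retry-After, only sent with HTTP 429


def _parse(value, cast):
    """Return cast(value), or None when the header is missing or malformed."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=_parse(lowered.get("x-ratelimit-limit"), int),
        remaining=_parse(lowered.get("x-ratelimit-remaining"), int),
        reset_at=_parse(lowered.get("x-ratelimit-reset"), int),
        retry_after=_parse(lowered.get("retry-after"), float),
    )


def seconds_until_reset(status: RateLimitStatus, now: Optional[float] = None) -> float:
    """Seconds until the current window resets, never negative."""
    if status.reset_at is None:
        return 0.0
    current = time.time() if now is None else now
    return max(0.0, float(status.reset_at - current))
```

A caller might check `status.remaining == 0` after each generation response and pause for `seconds_until_reset(status)` before submitting again.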
Concurrent request limit
The concurrent limit caps in-flight generation requests across both endpoints. Pending video generations count against this limit. When it is exceeded, the API returns HTTP 429.
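One way to avoid hitting the concurrent limit is to cap in-flight generations on the client with a semaphore. The sketch below is an assumption-laden illustration: `GenerationSlots` is a hypothetical helper, and the value passed as `max_in_flight` should come from your dashboard. Since pending video generations keep counting until they finish, the slot should be held for the whole lifecycle (submit plus polling), not just for the initial POST.

```python
import threading
from contextlib import contextmanager


class GenerationSlots:
    """Client-side cap on in-flight generations (hypothetical helper)."""

    def __init__(self, max_in_flight: int):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    @contextmanager
    def slot(self, timeout=None):
        """Hold one slot for the duration of the `with` block.

        Raises TimeoutError if no slot frees up within `timeout` seconds.
        With timeout=None the call blocks until a slot is available.
        """
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free generation slot")
        try:
            yield
        finally:
            # Release only after the generation has finished or failed.
            self._slots.release()
```

In use, each worker would wrap its work as `with slots.slot(): ...`, submitting the generation and polling it to completion inside the block. A single shared `GenerationSlots` instance covers both endpoints, matching the way the limit is described above.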
Handling rate limits
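When you receive a 429, wait at least the number of seconds given in the Retry-After header before retrying, and use exponential backoff if requests keep being rejected.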
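A retry loop that combines both ideas might look like the following. Treat it as a sketch under stated assumptions: `send_with_retry` and `backoff_delay` are hypothetical names, `send` is any callable you supply that performs one request and returns `(status_code, headers, body)`, and the base delay, cap, and attempt count are arbitrary defaults, not documented values.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], object]


def _retry_after_seconds(headers: Mapping[str, str]) -> float:
    """Read Retry-After as seconds; 0.0 when missing or not a number."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except (TypeError, ValueError):
                return 0.0
    return 0.0


def backoff_delay(attempt: int, headers: Mapping[str, str],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before the next try; `attempt` is 0 for the first failure."""
    exponential = min(cap, base * (2 ** attempt))
    # Never retry sooner than the server asked.
    return max(exponential, _retry_after_seconds(headers))


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep,
                    jitter: Callable[[], float] = random.random) -> Response:
    """Call `send` until it returns something other than HTTP 429."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            # Up to one second of jitter keeps parallel clients from
            # retrying in lockstep.
            sleep(backoff_delay(attempt, headers) + jitter())
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The `sleep` and `jitter` parameters exist so the loop can be tested without real waiting. Only 429 responses are retried here; other error statuses are returned to the caller unchanged.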