Rate Limits
To ensure consistent service quality, the Nomic Platform API enforces rate limits per user account. All API keys owned by the same user share a single quota. These limits help maintain platform stability for all users.
Platform API (v0)
All v0 endpoints are rate-limited per user. Limits are enforced per tier — all endpoints in the same tier share a single bucket. Requests are paced evenly over time, so clients that respect the Retry-After header will naturally spread their traffic at a steady rate.
| Tier | Limit | Applies to |
|---|---|---|
| Standard | 300 req/min | Read endpoints (list/get users, audit logs, API keys, parse status) |
| Write | 60 req/min | Mutation endpoints (disable/enable user, revoke API key) |
| Heavy | 30 req/min | Resource-intensive endpoints (file upload, submit parse) |
| Search | 60 req/min | Search endpoints (codes search) |
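Because requests are paced evenly, each tier's limit translates into a simple per-request interval on the client side. A minimal sketch (the tier names here are informal labels taken from the table above, not API parameters):

```python
# Client-side pacing intervals derived from the per-tier limits above.
# Requests are paced evenly, so the safe gap between requests is 60s / limit.
TIER_LIMITS = {"standard": 300, "write": 60, "heavy": 30, "search": 60}

def pacing_interval(tier: str) -> float:
    """Seconds to wait between requests to stay within a tier's budget."""
    return 60.0 / TIER_LIMITS[tier]
```

For example, write-tier calls (60 req/min) are safe at one request per second, while heavy-tier calls (30 req/min) need a two-second gap.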
Rate Limit Headers
Every API response includes the following headers so you can track your usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed per window |
| X-RateLimit-Remaining | Approximate remaining requests before throttling |
| X-RateLimit-Reset | Unix timestamp (seconds) when the quota resets |
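Since these headers arrive on every response, a client can keep a running picture of its quota. A minimal sketch, assuming `headers` is any mapping of header name to string value (such as `response.headers` from the Python requests library):

```python
import time

def quota_snapshot(headers) -> dict:
    """Summarize rate-limit state from a response's headers.

    `headers` is any mapping of header name -> string value,
    e.g. `response.headers` from the requests library.
    """
    reset = int(headers["X-RateLimit-Reset"])
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "seconds_until_reset": max(0, reset - int(time.time())),
    }
```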
When a limit is exceeded, the API returns 429 Too Many Requests along with a Retry-After header indicating the exact number of seconds until your next request will be accepted. Clients that sleep for this duration before retrying will be paced at a steady, sustainable rate.
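This sleep-and-retry behavior can be sketched as a small wrapper. `send` here is a placeholder for whatever function issues the actual HTTP request; the helper is illustrative and not part of any Nomic SDK:

```python
import time

def request_with_retry(send, max_retries: int = 5):
    """Call `send()` (any function returning a response-like object with
    `.status_code` and `.headers`) and, on 429, sleep for the
    server-specified Retry-After duration before trying again.
    """
    for _ in range(max_retries):
        response = send()
        if response.status_code != 429:
            return response
        # Honor the server's pacing hint; fall back to 1s if absent.
        delay = float(response.headers.get("Retry-After", 1))
        time.sleep(delay)
    raise RuntimeError("rate limited: retries exhausted")
```

Because Retry-After reflects the server's even pacing, sleeping for exactly that duration keeps your traffic at the maximum sustainable rate without further 429s.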
Embedding Inference API
We enforce a rate limit of 1200 requests per 5-minute rolling window per IP address for the Embedding Inference API.
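To stay under a rolling-window limit like this on the client side, you can track your own request timestamps and wait when the window is full. A sketch of one approach (the class name and structure are illustrative; the server remains the source of truth):

```python
import collections
import time

class RollingWindowLimiter:
    """Client-side guard mirroring a rolling-window rate limit,
    e.g. 1200 requests per 300 seconds for the Embedding Inference API."""

    def __init__(self, max_requests: int = 1200, window_seconds: float = 300.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = collections.deque()

    def acquire(self) -> None:
        """Block until a request can be sent without exceeding the window."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            # Wait until the oldest request leaves the window, then recheck.
            time.sleep(self.window_seconds - (now - self._timestamps[0]))
            return self.acquire()
        self._timestamps.append(time.monotonic())
```

Call `limiter.acquire()` before each embedding request; it returns immediately while you are under the limit and blocks only when the window is exhausted.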
If your use case requires higher throughput than our standard rate limits allow, we offer two recommended solutions:
Atlas Datasets
You can upload your dataset into Atlas and then download the embeddings we generate.
The Atlas Platform prioritizes embedding generation for Atlas Datasets and does not apply rate limits. This solution offers you an efficient way to generate embeddings at scale, coupled with the ability to visualize and explore your data in the Atlas UI in your browser.
You can get started generating embeddings for your data via Atlas by visiting our Upload a Dataset guide.
Amazon SageMaker
Our text embedding and image models are available for inference on Amazon SageMaker, giving you the ability to configure settings for throughput and processing power. This solution is well-suited for teams already working within the AWS ecosystem who need high-volume embedding generation.
You can get started generating text and image embeddings via Amazon SageMaker with these example notebooks from our GitHub repository.