Models

The Assistant model picker lets you choose which foundation model drives a new Assistant session. By default, Assistant uses Auto, which means Nomic selects the current backend default model for that session. Nomic currently supports five concrete models. All prices below are passed through at the provider's API rate and are quoted in USD per 1M tokens.

Picking a model

  • Auto — Recommended default. Nomic picks the current backend default model for each new session.
  • Sonnet 4.6 — Best balance of speed, cost, and quality for daily Assistant use and most workflows.
  • Opus 4.8 — Use for the hardest reasoning, ambiguous specs, and high-stakes reviews. Roughly 1.7× Sonnet's cost.
  • Haiku 4.5 — Use for quick lookups, exploration, and high-volume automation. Roughly 3× cheaper output than Sonnet.
  • Gemini 3.5 Flash — Use for fast Google-based agentic and coding tasks with strong native thinking.
  • Gemini 3.1 Pro — Use when you need very large context windows, or when your team prefers a Google-based inference path.

The foundation model is fixed for the entire Assistant session — you cannot switch models after the session has started. To use a different model, start a new Assistant session. Whichever model you pick, Assistant uses the same tools, citations, and file context.

Pricing

Model              Provider    Input    Output    Cache write    Cache read
Haiku 4.5          Anthropic   $1.00    $5.00     $1.25          $0.10
Sonnet 4.6         Anthropic   $3.00    $15.00    $3.75          $0.30
Opus 4.8           Anthropic   $5.00    $25.00    $6.25          $0.50
Gemini 3.5 Flash   Google      $1.50    $9.00     n/a            $0.15
Gemini 3.1 Pro     Google      $2.00    $12.00    n/a            $0.20

Prices are USD per 1M tokens. Anthropic models charge separately for cache writes; Gemini does not. On later turns in the same Assistant session, cache reads are typically 10× cheaper than fresh input — Assistant takes advantage of this automatically as the conversation continues.
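
To see how these rates combine, the following Python sketch estimates the cost of a single turn from token counts. It is an illustration only, not part of Assistant or any Nomic API: the names `Price`, `PRICES`, and `estimate_cost` are hypothetical, the rates are copied from the pricing table, and billing Gemini cache writes at the ordinary input rate is an assumption based on Gemini having no separate cache-write charge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, copied from the pricing table."""
    input: float
    output: float
    cache_read: float
    cache_write: Optional[float] = None  # None: no separate cache-write rate listed


# Hypothetical lookup table; keys match the model names in the picker.
PRICES = {
    "Haiku 4.5": Price(1.00, 5.00, 0.10, 1.25),
    "Sonnet 4.6": Price(3.00, 15.00, 0.30, 3.75),
    "Opus 4.8": Price(5.00, 25.00, 0.50, 6.25),
    "Gemini 3.5 Flash": Price(1.50, 9.00, 0.15),
    "Gemini 3.1 Pro": Price(2.00, 12.00, 0.20),
}


def estimate_cost(model, fresh_input=0, output=0, cache_read=0, cache_write=0):
    """Return the estimated USD cost for the given token counts."""
    p = PRICES[model]
    # Assumption: with no separate cache-write rate, cached tokens are
    # billed like ordinary input when they are first sent.
    write_rate = p.cache_write if p.cache_write is not None else p.input
    total = (
        fresh_input * p.input
        + output * p.output
        + cache_read * p.cache_read
        + cache_write * write_rate
    )
    return total / 1_000_000  # rates are quoted per 1M tokens


# A 50,000-token context sent fresh versus re-read from cache on Sonnet 4.6.
fresh = estimate_cost("Sonnet 4.6", fresh_input=50_000)
cached = estimate_cost("Sonnet 4.6", cache_read=50_000)
print(f"fresh: ${fresh:.4f}, cached: ${cached:.4f}")  # fresh: $0.1500, cached: $0.0150
```

The same 50,000 tokens cost about $0.15 as fresh input and about $0.015 as a cache read, which is the 10× difference described above. Actual billing depends on how the provider counts tokens in each category, so treat the result as a rough estimate.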