
Embeddings

The embed module provides embedding functionality using the Nomic Embedding API.

Text Embeddings

embed.text generates embeddings for a list of texts.

from nomic import embed

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
)

print(output)

Output:

{'embeddings': [
    [0.008766174, 0.014785767, -0.13134766, ...],
    [0.017822266, 0.018585205, -0.12683105, ...]],
 'inference_mode': 'remote',
 'model': 'nomic-embed-text-v1.5',
 'usage': {'prompt_tokens': 10, 'total_tokens': 10}}
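
The embeddings are returned as plain Python lists. As a small usage sketch (an illustrative addition, not part of the docs above), you can convert them to a NumPy array for downstream use, assuming numpy is installed:

import numpy as np

# Each row is the embedding of one input text
embeddings = np.array(output['embeddings'])
print(embeddings.shape)  # e.g. (2, 768) for two texts at the model's full size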

Resizable Dimensionality

Some models support a resizable output dimension for a small trade-off in performance.

Specify a dimensionality to control the embedding size.

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    dimensionality=512,
)
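
As a quick sanity check (an illustrative addition, not from the original docs), the returned vectors now have the requested number of components:

# Each embedding is resized to the requested dimensionality
print(len(output['embeddings'][0]))  # 512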

Local Inference

Use inference_mode='local' to generate embeddings directly on the local machine with GPT4All.

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
)
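
As an illustrative sketch (not part of the official docs), local embeddings can be paired with the search_query task type to rank documents against a query; this assumes numpy is installed:

import numpy as np
from nomic import embed

docs = ['Nomic Embedding API', '#keepAIOpen']

# Embed the documents and the query locally with matching task types
doc_out = embed.text(
    texts=docs,
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
)
query_out = embed.text(
    texts=['open source AI'],
    model='nomic-embed-text-v1.5',
    task_type='search_query',
    inference_mode='local',
)

doc_vecs = np.array(doc_out['embeddings'])
query_vec = np.array(query_out['embeddings'][0])

# Cosine similarity between the query and each document
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
print(docs[int(scores.argmax())])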

Dynamic Local Inference

Use inference_mode='dynamic' to automatically select an inference mode based on the size of the input. Larger inputs will be sent to the Nomic Embedding API.

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='dynamic',
)
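
The inference_mode field of the returned dict reports which mode was actually used, so you can check whether a request was served locally or sent to the API:

# 'local' or 'remote', depending on the size of the input
print(output['inference_mode'])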

Selecting a Device

The device parameter of embed.text allows you to use a different device for local inference. By default, the CPU is used on Windows and Linux, while the Metal backend is used on Apple Silicon. See the GPT4All documentation for more information.

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
    device='gpu',
)
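
To target a specific GPU, pass a device name reported by GPT4All.list_gpus(). A minimal sketch, assuming the gpt4all package is installed and at least one GPU is available:

from gpt4all import GPT4All

gpus = GPT4All.list_gpus()
print(gpus)  # e.g. ['NVIDIA GeForce RTX 3080', ...]

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
    device=gpus[0],  # a specific device name
)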

API Reference

embed.text

def text(texts: list[str],
         *,
         model: str = "nomic-embed-text-v1",
         task_type: str = "search_document",
         dimensionality: int | None = None,
         long_text_mode: str = "truncate",
         inference_mode: str = "remote",
         device: str | None = None,
         **kwargs: Any) -> dict[str, Any]

Generates embeddings for the given text.

Arguments:

  • texts - The texts to embed.
  • model - The model to use when embedding.
  • task_type - The task type to use when embedding. One of search_query, search_document, classification, clustering.
  • dimensionality - The embedding dimension, for use with Matryoshka-capable models. Defaults to full-size.
  • long_text_mode - How to handle texts longer than the model can accept. One of mean or truncate.
  • inference_mode - How to generate embeddings. One of remote, local (Embed4All), or dynamic (automatic). Defaults to remote.
  • device - The device to use for local embeddings. Defaults to CPU, or Metal on Apple Silicon. It can be set to:
    • "gpu": Use the best available GPU.
    • "amd", "nvidia": Use the best available GPU from the specified vendor.
    • A specific device name from the output of GPT4All.list_gpus()
  • kwargs - Remaining arguments are passed to the Embed4All constructor.

Returns:

A dict containing your embeddings and request metadata.
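
For example, texts that exceed the model's context can be mean-pooled instead of truncated via the long_text_mode argument described above (a minimal sketch with a placeholder long input):

# 'mean' handles the overflow by averaging rather than cutting the text off
long_document = 'all work and no play ' * 2000  # placeholder long input
output = embed.text(
    texts=[long_document],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    long_text_mode='mean',
)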

embed.free_embedding_model

def free_embedding_model() -> None

Free the current Embed4All instance and its associated system resources.
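
A short usage sketch (illustrative, not from the original docs): the Embed4All instance created for local inference stays loaded until you free it.

from nomic import embed

output = embed.text(
    texts=['Nomic Embedding API'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
)

# Release the locally loaded model and its memory
embed.free_embedding_model()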