
Embeddings

The embed module provides embedding functionality using the Nomic Embedding API.

See the embeddings user guide for more information on usage & capabilities.

Text Embeddings

embed.text generates embeddings for a list of texts.

from nomic import embed

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
)

print(output)

Output:

{'embeddings': [
[0.008766174, 0.014785767, -0.13134766, ...],
[0.017822266, 0.018585205, -0.12683105, ...]],
'inference_mode': 'remote',
'model': 'nomic-embed-text-v1.5',
'usage': {'prompt_tokens': 10, 'total_tokens': 10}}
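The returned vectors can be compared directly, for example with cosine similarity, to rank documents against a query embedded with task_type='search_query'. A minimal sketch in plain Python (the short vectors below are illustrative stand-ins, not real API output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative vectors standing in for output['embeddings'].
query_vec = [0.1, 0.3, -0.2]
doc_vecs = [[0.1, 0.29, -0.21], [-0.3, 0.1, 0.4]]

# Rank documents by similarity to the query, best match first.
ranked = sorted(range(len(doc_vecs)),
                key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
                reverse=True)
print(ranked)  # index of the closest document comes first
```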

Image Embeddings

embed.image generates embeddings for a list of images.

from nomic import embed

output = embed.image(
    images=['/path/to/image1.jpg', '/path/to/image2.jpg'],
    model='nomic-embed-vision-v1.5',
)
print(output)

Output:

{'embeddings': [
[0.008766174, 0.014785767, -0.13134766, ...],
[0.017822266, 0.018585205, -0.12683105, ...]],
'model': 'nomic-embed-vision-v1.5',
'usage': {'prompt_tokens': 10, 'total_tokens': 10}}

API Reference

embed.text

def text(
    texts: list[str],
    *,
    model: str = "nomic-embed-text-v1",
    task_type: str = "search_document",
    dimensionality: int | None = None,
    long_text_mode: str = "truncate",
    inference_mode: str = "remote",
    device: str | None = None,
    **kwargs: Any,
) -> dict[str, Any]

Generates embeddings for the given texts.

Arguments:

  • texts - The texts to embed.
  • model - The model to use when embedding.
  • task_type - The task type to use when embedding. One of search_query, search_document, classification, clustering.
  • dimensionality - The embedding dimension, for use with Matryoshka-capable models. Defaults to full-size.
  • long_text_mode - How to handle texts longer than the model can accept. One of mean or truncate.
  • inference_mode - How to generate embeddings. One of remote, local (Embed4All), or dynamic (automatic). Defaults to remote.
  • device - The device to use for local embeddings. Defaults to CPU, or Metal on Apple Silicon. It can be set to:
    • "gpu": Use the best available GPU.
    • "amd", "nvidia": Use the best available GPU from the specified vendor.
    • A specific device name from the output of GPT4All.list_gpus()
  • kwargs - Remaining arguments are passed to the Embed4All constructor.

Returns:

A dict containing your embeddings and request metadata.
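For Matryoshka-capable models, setting dimensionality requests a smaller vector; conceptually this is equivalent to keeping the leading components of the full-size embedding and renormalizing (the exact server-side behavior is an assumption here). A rough sketch of that operation in plain Python, with illustrative values:

```python
import math

def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and renormalize to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.6, 0.8, 0.0, 0.0]          # illustrative full-size embedding
small = truncate_embedding(full, 2)  # analogous to passing dimensionality=2
print(small)  # [0.6, 0.8] -- already unit length in this example
```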

embed.image

def image(
    images: Sequence[Union[str, PIL.Image.Image]],
    model: str = "nomic-embed-vision-v1.5",
)

Generates embeddings for the given images.

Arguments:

  • images - The images to embed. These can be file paths, image-file bytes, or Pillow Image objects.
  • model - The model to use when embedding.

Returns:

An object containing your embeddings and request metadata.

embed.free_embedding_model

def free_embedding_model() -> None

Free the current Embed4All instance and its associated system resources.
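When inference_mode='local' is used, embed.text caches an Embed4All instance between calls; free_embedding_model releases it. A sketch of that lifecycle (assumes the nomic package is installed and a local model can be loaded):

```python
from nomic import embed

# Embed locally via Embed4All instead of the remote API.
output = embed.text(
    texts=['Nomic Embedding API'],
    model='nomic-embed-text-v1.5',
    inference_mode='local',
)
print(len(output['embeddings']))

# Release the cached Embed4All instance and its memory when finished.
embed.free_embedding_model()
```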