
Embeddings

Nomic Atlas allows anyone to access the power of embeddings.

An embedding is a vector representation of an unstructured datapoint that enables computers to manipulate the data based on semantics and meaning.
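To make "manipulating data based on semantics" concrete, here is a minimal sketch that compares embeddings with cosine similarity. The three-dimensional vectors are made up for illustration and are not the output of any Nomic model; real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

sim_related = cosine_similarity(cat, kitten)
sim_unrelated = cosine_similarity(cat, invoice)

# Semantically close items should score higher than unrelated ones.
print(sim_related > sim_unrelated)  # True
```

Datapoints with similar meaning end up close together in the embedding space, so a simple geometric comparison like this can stand in for a semantic one.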

Learn more about Nomic Embed Text, Nomic Embed Vision, and the Embedding Inference API.

New Release: Nomic Embedding Vision and API

We've launched Nomic Embed Vision, a vision model aligned to Nomic Embed Text! All existing Nomic Embed Text embeddings are now multimodal: Nomic Embed Text embeddings can be used to query the new Nomic Embed Vision embeddings out of the box, and vice versa. Together, Nomic Embed Text and Nomic Embed Vision project data into the only unified embedding space that achieves state-of-the-art performance on vision, language, and multimodal tasks.

Nomic Embed Vision can serve as the image embedding model powering your AtlasDataset, and it is also available through the Nomic Embedding API.

Read more in our official blog post and learn how to use it in the API Reference.
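As a rough sketch of what querying image embeddings with a text embedding could look like, the example below embeds a text query and a list of images, then picks the closest image. It assumes the `embed.text` and `embed.image` functions of the Nomic Python client, the model names `nomic-embed-text-v1.5` and `nomic-embed-vision-v1.5`, and an `embeddings` key in the returned dictionary. None of these are spelled out on this page, so check the API Reference before relying on them. The `best_match` and `search_images_with_text` helpers are hypothetical names introduced for illustration.

```python
import math


def best_match(query_vec, candidate_vecs):
    """Return the index of the candidate most similar to the query (cosine)."""

    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    q = unit(query_vec)
    scores = [sum(a * b for a, b in zip(q, unit(c))) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


def search_images_with_text(query, image_paths):
    """Embed a text query and some images; return the best-matching image path.

    Assumption: the nomic client is installed and authenticated, and the
    model names below are available to your account.
    """
    from nomic import embed  # imported lazily so the helper above runs without it

    text_out = embed.text(
        texts=[query],
        model="nomic-embed-text-v1.5",  # assumed model name
        task_type="search_query",
    )
    image_out = embed.image(
        images=image_paths,
        model="nomic-embed-vision-v1.5",  # assumed model name
    )
    idx = best_match(text_out["embeddings"][0], image_out["embeddings"])
    return image_paths[idx]
```

Because both models project into the same space, no translation step is needed between the text vector and the image vectors: the comparison is the same one you would use within a single modality.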

Embeddings in Atlas

When an unstructured dataset is uploaded to Atlas, an embedding is associated with each datapoint using a Nomic Embedding Model.

Nomic Atlas operates over embeddings to enable its unstructured data capabilities.

2D Embeddings: All embeddings stored in Atlas have a corresponding 2D, human-interpretable representation. These 2D embeddings power the layout of the Atlas Map. They are generated with a Nomic Dimensionality Reduction model.

Accessing embeddings

You can use the Nomic Python client to access and download low-dimensional (2D) and high-dimensional embeddings of your dataset.

  • Low-dimensional (2D): These are the embeddings used to visualize your dataset in the Atlas Map.
  • High-dimensional (latent): These are either produced by a Nomic Embedding Model or are embeddings you uploaded yourself.

Your dataset's embeddings are available through the embeddings attribute of a map in your AtlasDataset, referred to here as map.embeddings:

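from nomic import AtlasDataset

# Load the dataset by name and take its first map
map = AtlasDataset('my-dataset').maps[0]

# Entry point for the map's 2D and latent embeddings
map.embeddings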

Latent and 2D embeddings

The map.embeddings.projected attribute contains a Pandas DataFrame of your 2D embeddings.

The map.embeddings.latent attribute contains the high-dimensional embeddings, either produced by a Nomic Embedding Model or uploaded by you.

# 2D
projected_embeddings = map.embeddings.projected

# Latent high dimensional
latent_embeddings = map.embeddings.latent
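One way the two representations can be used together is to pick a region of the 2D map and then summarize the matching datapoints in latent space. The sketch below uses small toy lists in place of real Atlas data. The helper names `rows_in_box` and `mean_vector` are hypothetical, and the sketch assumes that row i of the 2D embeddings and row i of the latent embeddings describe the same datapoint, which you should verify for your own dataset.

```python
def rows_in_box(points_2d, x_range, y_range):
    """Return the row indices whose (x, y) position lies inside the box."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    return [
        i
        for i, (x, y) in enumerate(points_2d)
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]


def mean_vector(vectors):
    """Component-wise mean of a non-empty collection of equal-length vectors."""
    vectors = list(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Toy stand-ins for the 2D and latent embeddings (invented values).
points_2d = [(0.1, 0.2), (0.3, 0.1), (5.0, 5.0)]
latent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

selected = rows_in_box(points_2d, x_range=(0.0, 1.0), y_range=(0.0, 1.0))
centroid = mean_vector(latent[i] for i in selected)

print(selected)  # [0, 1]
print(centroid)  # [0.5, 0.5, 0.0]
```

With real data, the (x, y) pairs would come from the projected DataFrame and the vectors from the latent embeddings. The exact column names of the DataFrame are not given here, so inspect its columns before selecting the coordinates.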