
Embeddings and Retrieval

Nomic Atlas provides embedding and retrieval capabilities for working with unstructured data. These capabilities power semantic search, retrieval-augmented generation (RAG) applications, clustering analysis, and multimodal data processing.

What are Embeddings?

Embeddings are dense numeric vectors that represent data such as text or images. Items with similar meaning map to nearby vectors, so semantic similarity can be measured as a distance or angle between vectors.
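To make the idea of "nearby vectors" concrete, here is a minimal sketch that compares vectors with cosine similarity. The three-dimensional vectors are hypothetical values chosen for illustration; real embedding models produce vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, not output from a real model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

sim_related = cosine_similarity(cat, kitten)     # close in meaning, high score
sim_unrelated = cosine_similarity(cat, invoice)  # unrelated, low score
print(sim_related > sim_unrelated)
```

Cosine similarity ignores vector length and compares direction only, which is one common choice for comparing embeddings.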

Embeddings in Atlas

When you upload an unstructured dataset to Atlas, a Nomic Embedding Model generates an embedding for each datapoint.

In Atlas, embeddings serve two key purposes:

  1. Data Maps: Embeddings determine the layout of Atlas maps, as detailed in our technical report. The embeddings are projected to two dimensions in a way that preserves semantic relationships, so similar items appear close together in the map interface.

  2. Vector Search: The vector search bar in Atlas uses embeddings to find semantically related content. It goes beyond keyword matching and returns data on the map that matches the meaning of your query.
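The vector search idea can be sketched as a nearest-neighbor lookup: score every stored embedding against the query embedding and return the best matches. This is an illustrative brute-force version under assumed toy data, not a description of how Atlas implements search. The function name `top_k`, the document ids, and the vectors are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def top_k(query_vec, index, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    q = normalize(query_vec)
    scored = []
    for doc_id, vec in index.items():
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((doc_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy index: document id -> embedding.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.7, 0.3, 0.1],
    "doc_taxes": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]  # stands in for the embedding of a query about pets

results = top_k(query, index, k=2)
print([doc_id for doc_id, _ in results])
```

A brute-force scan like this is fine for small collections; larger collections typically use an approximate nearest-neighbor index instead.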

Beyond powering Atlas itself, embeddings are also available for inference through the Nomic API.

2D Embeddings: All embeddings stored in Atlas have a corresponding 2D, human-interpretable representation. These 2D embeddings power the layout of the Atlas Map. They are generated with a Nomic Dimensionality Reduction model.
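As a rough illustration of reducing embeddings to two dimensions, the sketch below uses principal component analysis (PCA) computed with power iteration. PCA is a stand-in chosen for simplicity; it is not the Nomic Dimensionality Reduction model, and the function names and four-dimensional toy points are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(rows):
    """Subtract the per-dimension mean from every row."""
    dim = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dim = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply and renormalize a start vector."""
    v = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project_2d(rows):
    """Project rows onto their first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    v1 = top_eigenvector(cov)
    lam1 = dot(v1, [dot(row, v1) for row in cov])
    dim = len(cov)
    # Deflation: remove the first component so the next one can be found.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(dim)]
                for i in range(dim)]
    v2 = top_eigenvector(deflated)
    return [[dot(r, v1), dot(r, v2)] for r in centered]

# Hypothetical 4-dimensional embeddings forming two groups.
points = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.0],
]
coords = project_2d(points)
for xy in coords:
    print([round(c, 3) for c in xy])
```

In this toy data the two groups land on opposite sides of the first axis, which is the property a map layout relies on: points that are close in the original space stay close in 2D.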

Getting Started

The simplest way to start using embeddings in Atlas is through our Python client:

from nomic import embed

# Generate text embeddings
text_output = embed.text(
    texts=['Your text here'],
    model='nomic-embed-text-v1.5',
    task_type='search_document'
)

# Generate image embeddings
image_output = embed.image(
    images=['path/to/image.jpg'],
    model='nomic-embed-vision-v1.5'
)
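Once you have embeddings, a typical next step is ranking documents against a query. The sketch below separates the ranking logic from the embedding call so it can run offline. `rank_documents`, `toy_embed`, and `nomic_embed` are hypothetical helper names; `toy_embed` is a word-count stand-in, not a real model. The adapter assumes the client returns a dict with an 'embeddings' list and that queries are embedded with task_type='search_query'; check both against the client documentation before relying on them.

```python
import math

def nomic_embed(texts, task_type):
    """Adapter around the Nomic client. Defined but not called below."""
    from nomic import embed  # requires the nomic package and an API key
    output = embed.text(texts=texts, model='nomic-embed-text-v1.5',
                        task_type=task_type)
    return output['embeddings']  # assumption: response dict has this key

def toy_embed(texts, task_type):
    """Offline stand-in: counts of a few vocabulary words per text."""
    vocab = ['cat', 'dog', 'tax', 'invoice']
    return [[float(t.lower().split().count(w)) for w in vocab] for t in texts]

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def rank_documents(query, documents, embed_fn):
    """Return (document, score) pairs sorted from most to least similar."""
    doc_vecs = embed_fn(documents, 'search_document')
    query_vec = embed_fn([query], 'search_query')[0]
    scored = [(doc, cosine(query_vec, vec)) for doc, vec in zip(documents, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

documents = [
    'the cat sat with another cat',
    'file your tax invoice',
    'dog and cat',
]
ranked = rank_documents('cat', documents, toy_embed)
for doc, score in ranked:
    print(round(score, 3), doc)
```

To use real embeddings, pass `nomic_embed` instead of `toy_embed` as the last argument.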

Supported Tasks

  • Vector Search: Build efficient semantic search systems
  • RAG Applications: Power retrieval for AI applications
  • Clustering: Organize and understand data relationships
  • Classification: Train models for categorization tasks
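As one example of the tasks listed above, classification on top of embeddings can be as simple as a nearest-centroid rule: average the embeddings of each labeled class, then assign a new embedding to the class with the closest average. This is a minimal sketch with hypothetical two-dimensional vectors and labels, not a recommended production classifier.

```python
import math
from collections import defaultdict

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]

def fit_centroids(embeddings, labels):
    """Compute one centroid per label from labeled embeddings."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {label: centroid(vecs) for label, vecs in grouped.items()}

def predict(vec, centroids):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))

# Hypothetical labeled embeddings.
train_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ['animal', 'animal', 'finance', 'finance']

centroids = fit_centroids(train_embeddings, train_labels)
print(predict([0.7, 0.3], centroids))
print(predict([0.1, 0.7], centroids))
```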

Multimodal Support

Atlas supports both text and image embeddings through the following models:

  • Text embeddings via nomic-embed-text-v1.5
  • Image embeddings via nomic-embed-vision-v1.5
