Embeddings
The embed
module provides embedding functionality using the Nomic Embedding API.
See the embeddings user guide for more information on usage & capabilities.
Text Embeddings
embed.text
generates embeddings for a list of texts.
from nomic import embed
output = embed.text(
texts=['Nomic Embedding API', '#keepAIOpen'],
model='nomic-embed-text-v1.5',
task_type='search_document',
)
print(output)
Output:
{'embeddings': [
[0.008766174, 0.014785767, -0.13134766, ...],
[0.017822266, 0.018585205, -0.12683105, ...]],
'inference_mode': 'remote',
'model': 'nomic-embed-text-v1.5',
'usage': {'prompt_tokens': 10, 'total_tokens': 10}}
Image Embeddings
embed.image
generates embeddings for a list of images.
from nomic import embed
output = embed.image(
images=['/path/to/image1.jpg', '/path/to/image2.jpg'],
model='nomic-embed-vision-v1.5',
)
print(output)
Output:
{'embeddings': [
[0.008766174, 0.014785767, -0.13134766, ...],
[0.017822266, 0.018585205, -0.12683105, ...]],
'model': 'nomic-embed-vision-v1.5',
'usage': {'prompt_tokens': 10, 'total_tokens': 10}}
API Reference
embed.text
def text(texts: list[str],
*,
model: str = "nomic-embed-text-v1",
task_type: str = "search_document",
dimensionality: int | None = None,
long_text_mode: str = "truncate",
inference_mode: str = "remote",
device: str | None = None,
**kwargs: Any) -> dict[str, Any]
Generates embeddings for the given text.
Arguments:
texts
- The text to embed.model
- The model to use when embedding.task_type
- The task type to use when embedding. One ofsearch_query
,search_document
,classification
,clustering
.dimensionality
- The embedding dimension, for use with Matryoshka-capable models. Defaults to full-size.long_text_mode
- How to handle texts longer than the model can accept. One ofmean
ortruncate
.inference_mode
- How to generate embeddings. One ofremote
,local
(Embed4All), ordynamic
(automatic). Defaults toremote
.device
- The device to use for local embeddings. Defaults to CPU, or Metal on Apple Silicon. It can be set to:- "gpu": Use the best available GPU.
- "amd", "nvidia": Use the best available GPU from the specified vendor.
- A specific device name from the output of
GPT4All.list_gpus()
kwargs
- Remaining arguments are passed to the Embed4All contructor.
Returns:
A dict containing your embeddings and request metadata
embed.image
def image(images: Sequence[Union[str, PIL.Image.Image]],
model: str = "nomic-embed-vision-v1.5")
Generates embeddings for the given images.
Arguments:
images
- the images to embed. Can be file paths to images, image-file bytes or Pillow objectsmodel
- the model to use when embedding
Returns:
An object containing your embeddings and request metadata
embed.free_embedding_model
def free_embedding_model() -> None
Free the current Embed4All instance and its associated system resources.