# Embeddings
The `embed` module provides embedding functionality using the Nomic Embedding API.
## Text Embeddings
`embed.text` generates embeddings for a list of texts.
```python
from nomic import embed

output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
)

print(output)
```
Output:
```
{'embeddings': [
    [0.008766174, 0.014785767, -0.13134766, ...],
    [0.017822266, 0.018585205, -0.12683105, ...]],
 'inference_mode': 'remote',
 'model': 'nomic-embed-text-v1.5',
 'usage': {'prompt_tokens': 10, 'total_tokens': 10}}
```
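Each entry of `embeddings` is a plain list of floats, so the result converts directly to an array. As a minimal sketch (assuming `numpy` is installed, which the `embed` module itself does not require), here is one way to compare the two embeddings above:

```python
import numpy as np

# Convert the list-of-lists result into a (2, dim) array.
vectors = np.array(output['embeddings'])

# Cosine similarity between the two example texts.
a, b = vectors
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f'cosine similarity: {similarity:.4f}')
```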
## Resizable Dimensionality
Some models support a resizable output dimension for a small trade-off in performance. Specify a `dimensionality` to control the embedding size.
```python
output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    dimensionality=512,
)
```
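As a quick sanity check, you can confirm that the returned vectors have the requested size. This sketch assumes the model's full output size is 768; adjust the list for other models:

```python
# Request progressively smaller Matryoshka embeddings and check their size.
for dim in (768, 512, 256):
    output = embed.text(
        texts=['Nomic Embedding API'],
        model='nomic-embed-text-v1.5',
        task_type='search_document',
        dimensionality=dim,
    )
    assert len(output['embeddings'][0]) == dim
```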
## Local Inference
Use `inference_mode='local'` to generate embeddings directly on the local machine with GPT4All.
```python
output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
)
```
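Local inference loads an Embed4All model into memory. When you are done embedding, you can release it with `embed.free_embedding_model` (documented in the API reference below):

```python
from nomic import embed

output = embed.text(
    texts=['Nomic Embedding API'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
)

# Release the Embed4All instance and its associated system resources.
embed.free_embedding_model()
```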
## Dynamic Local Inference
Use `inference_mode='dynamic'` to automatically select an inference mode based on the size of the input. Smaller inputs are embedded locally, while larger inputs are sent to the Nomic Embedding API.
```python
output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='dynamic',
)
```
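The output dict reports which mode was actually used (see the `inference_mode` key in the example output above), so you can inspect the routing decision. A small sketch; the exact routing threshold is internal to the client:

```python
for texts in (['a short input'], ['a longer batch of inputs'] * 100):
    output = embed.text(
        texts=texts,
        model='nomic-embed-text-v1.5',
        task_type='search_document',
        inference_mode='dynamic',
    )
    # Prints 'local' or 'remote', depending on how the input was routed.
    print(len(texts), output['inference_mode'])
```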
## Selecting a Device
The `device` parameter of `embed.text` allows you to use a different device for local inference. By default, the CPU is used on Windows and Linux, while the Metal backend is used on Apple Silicon. See the GPT4All documentation for more information.
```python
output = embed.text(
    texts=['Nomic Embedding API', '#keepAIOpen'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    inference_mode='local',
    device='gpu',
)
```
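Beyond `'gpu'`, you can target a specific device by name. As the API reference below notes, valid names come from `GPT4All.list_gpus()` in the `gpt4all` package. A sketch, assuming at least one GPU is available:

```python
from gpt4all import GPT4All
from nomic import embed

gpus = GPT4All.list_gpus()
print(gpus)  # names of the available GPU devices

if gpus:
    output = embed.text(
        texts=['Nomic Embedding API'],
        model='nomic-embed-text-v1.5',
        task_type='search_document',
        inference_mode='local',
        device=gpus[0],  # a specific device name
    )
```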
## API Reference
### embed.text
```python
def text(texts: list[str],
         *,
         model: str = "nomic-embed-text-v1",
         task_type: str = "search_document",
         dimensionality: int | None = None,
         long_text_mode: str = "truncate",
         inference_mode: str = "remote",
         device: str | None = None,
         **kwargs: Any) -> dict[str, Any]
```
Generates embeddings for the given text.
Arguments:
- `texts` - The text to embed.
- `model` - The model to use when embedding.
- `task_type` - The task type to use when embedding. One of `search_query`, `search_document`, `classification`, `clustering`.
- `dimensionality` - The embedding dimension, for use with Matryoshka-capable models. Defaults to full-size.
- `long_text_mode` - How to handle texts longer than the model can accept. One of `mean` or `truncate`.
- `inference_mode` - How to generate embeddings. One of `remote`, `local` (Embed4All), or `dynamic` (automatic). Defaults to `remote`.
- `device` - The device to use for local embeddings. Defaults to CPU, or Metal on Apple Silicon. It can be set to:
  - `"gpu"`: Use the best available GPU.
  - `"amd"`, `"nvidia"`: Use the best available GPU from the specified vendor.
  - A specific device name from the output of `GPT4All.list_gpus()`.
- `kwargs` - Remaining arguments are passed to the Embed4All constructor.
Returns:
A dict containing your embeddings and request metadata.
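The `long_text_mode` parameter only matters for inputs longer than the model's context window. A sketch of embedding a long document with chunk-mean pooling (`'mean'`) instead of the default truncation; the repeated string is just a stand-in for a genuinely long document:

```python
from nomic import embed

long_document = 'Nomic Embedding API ' * 5000  # longer than the model accepts

output = embed.text(
    texts=[long_document],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
    long_text_mode='mean',  # average over chunks instead of truncating
)
```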
### embed.free_embedding_model
```python
def free_embedding_model() -> None
```
Free the current Embed4All instance and its associated system resources.