Data upload
See the API Reference for details and additional options.
Creating an embedding dataset
The following minimal example uploads a set of embeddings to Atlas and creates a dataset you can explore there.
from nomic import atlas
import numpy as np
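# Random 512-dimensional vectors stand in for real embeddings
num_embeddings = 10000
embeddings = np.random.rand(num_embeddings, 512)
# Upload the embeddings and build the map
dataset = atlas.map_data(embeddings=embeddings)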
print(dataset)
This dataset will contain 10,000 random embeddings. You can explore it in Nomic Atlas, where the points are laid out by the nomic-project-v1 model, by opening the dataset's link in your browser.
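With real embeddings, it can help to check the array before uploading it. The sketch below is one way to do that; `prepare_embeddings` is a hypothetical helper, not part of the nomic client, and the commented-out upload call assumes that `atlas.map_data` accepts a `data` list of per-row metadata alongside `embeddings` (see the API Reference to confirm).

```python
import numpy as np


def prepare_embeddings(embeddings, metadata=None):
    """Hypothetical helper (not part of the nomic client).

    Checks that the embeddings form a finite 2D float array and that any
    metadata list has exactly one entry per embedding.
    """
    arr = np.asarray(embeddings, dtype=float)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2D array, got shape {arr.shape}")
    if not np.isfinite(arr).all():
        raise ValueError("embeddings contain NaN or infinite values")
    if metadata is not None and len(metadata) != arr.shape[0]:
        raise ValueError("metadata must have one entry per embedding")
    return arr


# Illustrative data: 100 random vectors with one metadata record each
rng = np.random.default_rng(0)
raw = rng.random((100, 512))
metadata = [{"row": i} for i in range(len(raw))]

checked = prepare_embeddings(raw, metadata)
print(checked.shape)

# Assumed upload call, mirroring the example above:
# dataset = atlas.map_data(embeddings=checked, data=metadata)
```

Running the checks locally surfaces shape or NaN problems before any data leaves your machine.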
Creating a text dataset
from nomic import atlas
import pandas
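# Load 25,000 news articles into a DataFrame
news_articles = pandas.read_csv('https://raw.githubusercontent.com/nomic-ai/maps/main/data/ag_news_25k.csv')
# indexed_field names the column whose text is embedded
dataset = atlas.map_data(data=news_articles, indexed_field='text')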
print(dataset)
This dataset will contain 25,000 news articles embedded with the default Nomic Embed Text model.
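Because the text in `indexed_field` is what gets embedded, rows where that field is missing or empty are worth catching before upload. The following is a minimal sketch of such a check; `check_text_records` is a hypothetical helper and the sample records are made up for illustration, not taken from the news dataset.

```python
def check_text_records(records, indexed_field):
    """Hypothetical helper (not part of the nomic client).

    Splits records into those with a non-empty string in `indexed_field`
    and those without one.
    """
    usable, skipped = [], []
    for record in records:
        value = record.get(indexed_field)
        if isinstance(value, str) and value.strip():
            usable.append(record)
        else:
            skipped.append(record)
    return usable, skipped


# Made-up records for illustration only
records = [
    {"text": "Stocks rallied after the earnings report.", "label": "Business"},
    {"text": "   ", "label": "Sports"},   # whitespace only
    {"label": "World"},                   # field missing
]

usable, skipped = check_text_records(records, indexed_field="text")
print(len(usable), "usable,", len(skipped), "skipped")

# A pandas DataFrame can be converted first with df.to_dict("records").
```

Whether to drop or repair the skipped rows depends on your data; the helper only reports them.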
Creating an image dataset
from nomic import atlas
from datasets import load_dataset
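# Load the CIFAR-10 training split from Hugging Face Datasets
cifar = load_dataset('cifar10', split="train")
images = cifar["img"]
# One metadata record per image, carrying its integer class label
data = [{"label": label} for label in cifar["label"]]
dataset = atlas.map_data(blobs=images, data=data)
print(dataset)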
This dataset will contain the CIFAR-10 training images, embedded with the default Nomic Embed Vision model.
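The metadata in the example above stores each label as an integer. If readable class names would be more useful when browsing the map, a sketch like the following can build richer records. `build_image_metadata` and the `label_name` field are hypothetical, and the class list assumes the standard CIFAR-10 label order.

```python
# Assumed: the standard CIFAR-10 class order (index 0 through 9)
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def build_image_metadata(labels, class_names=CIFAR10_CLASSES):
    """Hypothetical helper: one metadata record per image label."""
    data = []
    for label in labels:
        if not 0 <= label < len(class_names):
            raise ValueError(f"label {label} is outside the known classes")
        data.append({"label": label, "label_name": class_names[label]})
    return data


# Illustrative integer labels, standing in for the dataset's label column
sample_labels = [6, 9, 9, 4, 1]
data = build_image_metadata(sample_labels)
print(data[0])

# The records keep the same order as the labels, so they still line up
# with the images passed as blobs.
```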