Parse

Parse converts documents into structured, machine-readable blocks and chunks. Submit a parse job, then poll until it completes.

Try it without code

The Developer Console at /developer on your Nomic instance lets you upload a file and inspect the full parse output — blocks, chunks, bounding boxes — directly in your browser. Three pre-loaded examples (Oil & Gas Permits, Scientific Paper, Construction Bid Drawings) are available to get started quickly.

Submit a parse job

POST /parse

Submit a document for parsing. Only files uploaded through the upload endpoint are supported — integration files (SharePoint, Egnyte, etc.) cannot be parsed via the API.

Scope: developer:parse · Rate limit: Heavy (30 req / min)
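With a Heavy limit of 30 requests per minute, a client that submits many files in a loop should leave roughly 60 / 30 = 2 seconds between POST /parse calls. The following is a minimal sketch of that client-side pacing. The function names (min_interval, submit_parse, submit_all) are hypothetical helpers, not part of the API, and how the server responds when the limit is exceeded is not described on this page, so the sketch only spaces requests out.

```shell
# Seconds to wait between requests for a given per-minute limit, rounded up.
min_interval() {
  echo $(( (60 + $1 - 1) / $1 ))
}

# Hypothetical wrapper around POST /parse; prints the raw JSON response.
submit_parse() {
  curl -s -X POST "https://<your-domain>.nomic.ai/api/v0/parse" \
    -H "Authorization: Bearer $NOMIC_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"fileVersionId\": \"$1\"}"
}

# Submit every fileVersionId passed as an argument, paced to the Heavy limit.
submit_all() {
  local interval id
  interval=$(min_interval 30)
  for id in "$@"; do
    submit_parse "$id"
    echo
    sleep "$interval"
  done
}
```

Calling submit_all with a list of fileVersionIds submits them one at a time, two seconds apart.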

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| fileVersionId | string (uuid) | Yes | The fileVersionId returned by the upload endpoint. |
| options | object | No | Parse options (see below). |

Options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| ocrLanguage | string | "en" | OCR language. One of en, latin, zh_ja_en. |
| figureSummary | { enabled: boolean } | disabled | Generate text descriptions for detected figures. |
| tableSummary | { enabled: boolean } | disabled | Generate text descriptions for detected tables. |
| summaryWordLimit | number (10–100) | 100 | Maximum word count for generated summaries. |
| embedChunks | boolean | | Embed chunks after parsing. |
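Because ocrLanguage accepts only three values and summaryWordLimit must fall between 10 and 100, it can help to check both locally before spending a rate-limited request. The sketch below does that and prints a request body. The helper names (valid_ocr_language, valid_word_limit, build_parse_body) are hypothetical, and enabling figureSummary in the body is just an illustrative choice.

```shell
# Return 0 if the OCR language is one of the documented values.
valid_ocr_language() {
  case "$1" in
    en|latin|zh_ja_en) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if the word limit is a plain integer in the documented 10-100 range.
valid_word_limit() {
  case "$1" in
    ''|0*|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 10 ] && [ "$1" -le 100 ]
}

# Print a request body. Arguments: fileVersionId, ocrLanguage, summaryWordLimit.
build_parse_body() {
  valid_ocr_language "$2" || { echo "unsupported ocrLanguage: $2" >&2; return 1; }
  valid_word_limit "$3" || { echo "summaryWordLimit must be 10-100: $3" >&2; return 1; }
  printf '{"fileVersionId": "%s", "options": {"ocrLanguage": "%s", "figureSummary": {"enabled": true}, "summaryWordLimit": %s}}\n' "$1" "$2" "$3"
}
```

The printed body can be passed to curl with -d "$(build_parse_body "$FILE_VERSION_ID" latin 50)".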

Response

{
  "taskId": "019abc12-3456-7890-abcd-ef1234567892"
}

Errors

| Status | Cause |
| --- | --- |
| 400 | Invalid body, or the file is an integration file (not an uploaded file). |
| 401 | Missing or invalid API key. |
| 403 | API key lacks developer:parse scope. |
| 404 | File version not found or not accessible. |
| 422 | Schema validation failed. |
| 501 | Document parsing unavailable on this instance. |
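A script that submits jobs can capture the HTTP status and turn it into one of the causes listed above instead of passing an error body on to jq. The following is a sketch under a few assumptions: the helper names (explain_parse_status, submit_checked) are hypothetical, and because this page does not state the exact success code, any 2xx status is treated as success.

```shell
# Map an HTTP status from POST /parse to the documented cause.
explain_parse_status() {
  case "$1" in
    2??) echo "ok" ;;
    400) echo "invalid body, or the file is an integration file" ;;
    401) echo "missing or invalid API key" ;;
    403) echo "API key lacks developer:parse scope" ;;
    404) echo "file version not found or not accessible" ;;
    422) echo "schema validation failed" ;;
    501) echo "document parsing unavailable on this instance" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Submit a job for the given fileVersionId. Prints the response body on
# success; prints an explanation to stderr and returns 1 otherwise.
submit_checked() {
  local body_file code
  body_file=$(mktemp)
  # -o saves the body to a file, -w prints only the status code to stdout.
  code=$(curl -s -o "$body_file" -w '%{http_code}' -X POST \
    "https://<your-domain>.nomic.ai/api/v0/parse" \
    -H "Authorization: Bearer $NOMIC_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"fileVersionId\": \"$1\"}")
  case "$code" in
    2??) cat "$body_file"; rm -f "$body_file" ;;
    *)   echo "POST /parse failed ($code): $(explain_parse_status "$code")" >&2
         rm -f "$body_file"; return 1 ;;
  esac
}
```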

Example

curl -X POST "https://<your-domain>.nomic.ai/api/v0/parse" \
  -H "Authorization: Bearer $NOMIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "fileVersionId": "019abc12-3456-7890-abcd-ef1234567891",
    "options": {
      "figureSummary": { "enabled": true },
      "tableSummary": { "enabled": true }
    }
  }'

Get parse task status

GET /parse/{taskId}

Poll the status of a parse task. The resultUrl returned on completion is a presigned URL valid for 1 hour.

Scope: developer:parse · Rate limit: Standard (300 req / min)

Path parameters

| Parameter | Type | Description |
| --- | --- | --- |
| taskId | string (uuid) | The taskId returned by the submit endpoint. |

Response

The response shape depends on the task status:

Pending:

{
  "status": "pending"
}

Completed:

{
  "status": "completed",
  "resultUrl": "https://..."
}

The resultUrl points to the parsed output JSON. Download it before the URL expires (1 hour).

Failed:

{
  "status": "failed",
  "error": "Description of what went wrong"
}
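A client that branches on these three statuses should also decide what to do when the task never leaves pending or the request returns something else entirely. The sketch below caps the number of polling attempts and treats any unrecognised status as an error. The function names (fetch_state, wait_for_parse), the default of 60 attempts, and the 5-second delay are assumptions for illustration; if your instance reports other in-progress statuses, handle them like pending.

```shell
# Fetch the task's status field using curl and jq.
fetch_state() {
  curl -s "https://<your-domain>.nomic.ai/api/v0/parse/$1" \
    -H "Authorization: Bearer $NOMIC_API_KEY" | jq -r '.status'
}

# Poll until the task leaves "pending", or give up after max_attempts.
# Arguments: taskId, max_attempts (default 60), delay in seconds (default 5).
# Returns 0 on completed, 1 on failed, 2 on timeout or an unrecognised status.
wait_for_parse() {
  local task_id max_attempts delay attempt state
  task_id=$1
  max_attempts=${2:-60}
  delay=${3:-5}
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    state=$(fetch_state "$task_id")
    case "$state" in
      completed) return 0 ;;
      failed)    return 1 ;;
      pending)   ;;
      *)         echo "unexpected status: $state" >&2; return 2 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "gave up after $max_attempts attempts" >&2
  return 2
}
```

With the defaults this waits about five minutes in total and issues 12 requests per minute, well under the Standard limit of 300.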

Errors

| Status | Cause |
| --- | --- |
| 401 | Missing or invalid API key. |
| 403 | API key lacks developer:parse scope. |
| 404 | Task not found or not accessible. |
| 501 | Document parsing unavailable on this instance. |

Example

curl "https://<your-domain>.nomic.ai/api/v0/parse/019abc12-3456-7890-abcd-ef1234567892" \
  -H "Authorization: Bearer $NOMIC_API_KEY"

End-to-end example

Upload a file, parse it, and download the result:

# 1. Upload
UPLOAD=$(curl -s -X POST "https://<your-domain>.nomic.ai/api/v0/files/upload" \
  -H "Authorization: Bearer $NOMIC_API_KEY" \
  -F "file=@document.pdf")

FILE_VERSION_ID=$(echo "$UPLOAD" | jq -r '.fileVersionId')

# 2. Submit parse job
PARSE=$(curl -s -X POST "https://<your-domain>.nomic.ai/api/v0/parse" \
  -H "Authorization: Bearer $NOMIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"fileVersionId\": \"$FILE_VERSION_ID\"}")

TASK_ID=$(echo "$PARSE" | jq -r '.taskId')

# 3. Poll until complete
while true; do
  STATUS=$(curl -s "https://<your-domain>.nomic.ai/api/v0/parse/$TASK_ID" \
    -H "Authorization: Bearer $NOMIC_API_KEY")

  STATE=$(echo "$STATUS" | jq -r '.status')

  if [ "$STATE" = "completed" ]; then
    RESULT_URL=$(echo "$STATUS" | jq -r '.resultUrl')
    curl -o result.json "$RESULT_URL"
    echo "Parse complete, saved to result.json"
    break
  elif [ "$STATE" = "failed" ]; then
    echo "Parse failed: $(echo "$STATUS" | jq -r '.error')"
    exit 1
  fi

  sleep 5
done