Parse files into structured information

POST /v1/parse

Parse a file into structured information.

Supports PDF files with configurable chunking strategies and optional embedding generation.

Request

Body

required

    file_url File Url (string)

    URL of the file to process. Two URL types are supported:

    1. Public URLs - accessible from the internet

    2. `nomic://` prefixed URLs - obtained from the `/upload` endpoint

    result_url Result Url (string)

    Custom storage URL to which parse results are uploaded via a PUT request. If omitted, results are stored temporarily and can be retrieved through the task status endpoint.

    result_put_headers

    object

    HTTP headers to include when uploading parse results to the result_url.

    property name* string

    options

    object

    chunker ChunkerType (string)

    Possible values: [hybrid, hierarchical]

    Default value: hybrid

    Chunking strategy. hybrid splits documents using both content and structure; hierarchical splits at natural document sections such as headings and chapters.

    embed_chunks Embed Chunks (boolean)

    When enabled, generates vector embeddings for each chunk after parsing is complete. Embedding generation runs as a separate background job and won't slow down the main parsing process.

    file_id File Id (string)

    The id of an uploaded file to parse. (deprecated)
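
The request body described above can be assembled as in the following minimal sketch. It rests on several assumptions that this page does not confirm: that chunker and embed_chunks are nested under options, that a "public URL" means an http or https URL, and that the API is served from the placeholder base URL passed in by the caller. The helper names build_parse_payload and build_parse_request are hypothetical, and authentication is left out because this section does not describe it.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Values listed for the chunker field above.
ALLOWED_CHUNKERS = {"hybrid", "hierarchical"}
# Assumption: "public URLs" are http(s); nomic:// comes from the /upload endpoint.
ALLOWED_SCHEMES = {"http", "https", "nomic"}


def build_parse_payload(file_url, chunker="hybrid", embed_chunks=None,
                        result_url=None, result_put_headers=None):
    """Assemble a JSON-serializable body for POST /v1/parse (hypothetical helper)."""
    scheme = urlparse(file_url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported file_url scheme: {scheme!r}")
    if chunker not in ALLOWED_CHUNKERS:
        raise ValueError(f"chunker must be one of {sorted(ALLOWED_CHUNKERS)}")

    # Assumption: chunker and embed_chunks live inside the options object.
    options = {"chunker": chunker}
    if embed_chunks is not None:
        options["embed_chunks"] = bool(embed_chunks)

    payload = {"file_url": file_url, "options": options}
    if result_url is not None:
        payload["result_url"] = result_url
    if result_put_headers is not None:
        payload["result_put_headers"] = dict(result_put_headers)
    return payload


def build_parse_request(base_url, payload):
    """Wrap the payload in a POST request object without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/parse",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    body = build_parse_payload("nomic://abc123", chunker="hierarchical",
                               embed_chunks=True)
    # https://api.example.com is a placeholder, not the real host.
    request = build_parse_request("https://api.example.com", body)
    print(request.get_method(), request.full_url)
    print(json.dumps(body, indent=2))
```

Validating the URL scheme and chunker value on the client is a design choice that surfaces mistakes before a task is queued; the server remains the authority on what it accepts.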

Responses

The task id of the parsing task.

Schema

    task_id Task Id (string) required

    The id of the task.

    options_ids

    object

    The ids of the options used for the parsing task, keyed by option name (e.g., embed_chunks).

    property name* string
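
A response matching the schema above could be read as in the sketch below. The ParseResponse class and parse_response function are hypothetical names, the sample ids are invented for illustration, and treating options_ids as a mapping from option name to id string is an assumption drawn from the "property name* string" notation.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ParseResponse:
    task_id: str                                      # required by the schema
    options_ids: dict = field(default_factory=dict)   # optional; assumed name -> id


def parse_response(body):
    """Decode a /v1/parse response body and check the required task_id field."""
    data = json.loads(body)
    task_id = data.get("task_id")
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response is missing required field 'task_id'")
    options_ids = data.get("options_ids") or {}
    return ParseResponse(task_id=task_id, options_ids=dict(options_ids))


if __name__ == "__main__":
    # Invented sample values, not real API output.
    sample = '{"task_id": "task-123", "options_ids": {"embed_chunks": "opt-456"}}'
    result = parse_response(sample)
    print(result.task_id, result.options_ids)
```

The returned task_id is what a client would pass to the task status endpoint mentioned under result_url.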