Create embeddings

curl --request POST \
  --url https://api.studio.nebius.com/v1/embeddings \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "BAAI/bge-en-icl",
  "input": "What'\''s a nice vector, Victor?",
  "encoding_format": "<string>",
  "user": "<string>",
  "service_tier": "auto"
}'

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        -0.0028842222
      ],
      "index": 0
    }
  ],
  "model": "BAAI/bge-en-icl",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}

POST https://api.studio.nebius.com/v1/embeddings

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
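
As a minimal sketch of the header format described above, the following Python helper builds the two headers used in the curl example. The function name `build_headers` is hypothetical and not part of any SDK.

```python
from typing import Dict


def build_headers(token: str) -> Dict[str, str]:
    """Return headers for a JSON request with Bearer authentication.

    `build_headers` is an illustrative helper, not an official API.
    """
    if not token:
        raise ValueError("an auth token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```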

Query Parameters

ai_project_id

string | null

ID of the current project.
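
Because `ai_project_id` is a query parameter rather than a body field, it belongs in the URL. The sketch below shows one way to append it only when it is set. The helper name `embeddings_url` is an assumption made for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.studio.nebius.com/v1/embeddings"


def embeddings_url(ai_project_id: Optional[str] = None) -> str:
    """Return the endpoint URL, adding ai_project_id only when provided."""
    if ai_project_id is None:
        return BASE_URL
    # urlencode escapes characters that are not safe in a query string.
    return f"{BASE_URL}?{urlencode({'ai_project_id': ai_project_id})}"
```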

Body

application/json

model

string

required

ID of the model to use.

Examples:

"BAAI/bge-en-icl"

input

required

Input text to embed, encoded as a string or array of tokens.

Examples:

"What's a nice vector, Victor?"

encoding_format

string | null

default: float

The format to return the embeddings in. Can be either float or base64.
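
This page does not specify the binary layout of a base64-encoded embedding. The sketch below assumes the common convention of packed little-endian 32-bit floats. Treat that layout as an assumption and verify it against a `float` response for the same input before relying on it. The helper name `decode_embedding` is hypothetical.

```python
import base64
import struct
from typing import List


def decode_embedding(b64: str) -> List[float]:
    """Decode a base64 string into floats.

    Assumes packed little-endian float32 values (not confirmed by this page).
    """
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))
```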

user

string | null

A unique identifier representing your end-user.

service_tier

enum<string> | null

default: auto

The service tier to use for the request.

Attributes:

Auto: Automatically choose the best available tier for the request (Default or OverLimit). Inspect the response to determine which tier was used.

Default: Return 429 errors when the rate limit is hit; do not spill over to the OverLimit tier.

OverLimit: Indicates that the request exceeded the user limit. This tier cannot be set by the user in the request, but is used in the response when the request tier is Auto.

Flex: Do not consume rate-limit credits, but run at lower priority. May still result in 429 errors if no resources are available to process the request.

Available options:

auto,

default,

over-limit,

flex

Examples:

"auto"

"flex"
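
The body fields above can be assembled and checked on the client before sending. The following sketch rejects `over-limit` because, per the description above, that tier cannot be set in a request. The helper `build_payload` and its defaults are illustrative assumptions, and it accepts a single string as input for simplicity.

```python
from typing import Any, Dict, Optional

# Tiers a client may request; "over-limit" only appears in responses.
REQUEST_TIERS = {"auto", "default", "flex"}
ENCODING_FORMATS = {"float", "base64"}


def build_payload(
    model: str,
    text: str,
    encoding_format: str = "float",
    service_tier: str = "auto",
    user: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body for an embeddings request (illustrative helper)."""
    if encoding_format not in ENCODING_FORMATS:
        raise ValueError(f"unsupported encoding_format: {encoding_format!r}")
    if service_tier not in REQUEST_TIERS:
        raise ValueError(f"service_tier cannot be set to {service_tier!r}")
    payload: Dict[str, Any] = {
        "model": model,
        "input": text,
        "encoding_format": encoding_format,
        "service_tier": service_tier,
    }
    if user is not None:
        # Optional end-user identifier; omitted when not supplied.
        payload["user"] = user
    return payload
```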

Response

object

string

required

Always 'list'.

model

string

required

The model used for the embedding.

usage

object

required

Token usage stats.

data

Embedding · object[]

required

List of Embedding objects

service_tier

enum<string>

required

The service tier used for the request.

Available options:

auto,

default,

over-limit,

flex
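
A response with the fields described above can be unpacked as follows. This is a sketch based on the example response on this page. Treating each item's `index` as its position in the result is an assumption, and the helper name `extract_embeddings` is hypothetical.

```python
import json
from typing import List


def extract_embeddings(response_body: str) -> List[List[float]]:
    """Return the embedding vectors from a JSON response, ordered by index."""
    doc = json.loads(response_body)
    if doc.get("object") != "list":
        raise ValueError("unexpected response object type")
    # Sort defensively so the output order follows each item's "index".
    items = sorted(doc["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```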
