- Your own custom models. In this case, prepare an archive with model files or a link to these model files hosted on Hugging Face.
- Models fine-tuned in Nebius AI Studio. In this case, you deploy the results of fine-tuning.
Prerequisites
If you want to use a Python script or cURL commands, meet the requirements below. If you want to work in the web interface, no prerequisites are needed.

- Python
- cURL
- Create an API key for authentication.
- Save the API key to an environment variable.
- Install the `openai` package.
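The prerequisite steps above can be sketched as shell commands. The environment variable name `NEBIUS_API_KEY` is an assumption; use whatever name your own scripts expect.

```shell
# Save the API key to an environment variable so that scripts and cURL
# commands can read it without hard-coding the secret.
# NEBIUS_API_KEY is an assumed variable name.
export NEBIUS_API_KEY="<your_api_key>"

# Install the OpenAI-compatible Python client.
pip install openai
```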
How to deploy a custom model
If you have your own custom model that is not already hosted, you can deploy it in Nebius AI Studio:

- Python
- UI
Run the following Python script. The script stages are the following:

1. Uploads an archive with the LoRA adapter weights and configuration (`adapter_model.safetensors` and `adapter_config.json`). Alternatively, you can use a link to a Hugging Face repository with the LoRA adapter model files; in this case, you don't need to call the `upload_file` method.
2. Deploys the custom model. In the `create_lora_from_file` method, specify the following:
   - LoRA adapter name, for example, `test-adapter`.
   - Hugging Face link or the ID of the uploaded archive.
   - Base model name. Select it from the list of available models.
3. Waits for the custom model to be validated. The model first receives the `validating` status. When the model is validated, the status changes to `active`. If the uploaded data is invalid, the status changes to `error`, and `status_reason` contains the error message.
4. Gets the custom model name. The name of a deployed model is composed of the following components:
   - Base model name.
   - LoRA adapter name.
   - Random suffix that is used to make the model name unique.

   For example, with the `meta-llama/Llama-3.1-8B-Instruct` base model and the `test-adapter` name for a LoRA adapter, the name of the custom model is `meta-llama/Llama-3.1-8B-Instruct-LoRa:test-adapter-AbCd`.
5. Creates a multi-message request by using the custom model name.
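The stages above can be sketched in Python. Only the method names (`upload_file`, `create_lora_from_file`) and status values come from the text; the base URL, the client namespace, and the response field names are assumptions, so treat this as a sketch rather than the exact SDK surface.

```python
import os
import time


def wait_for_validation(get_status, timeout_s=600, poll_s=5):
    """Poll until the model leaves the `validating` status.

    `get_status` must return a (status, status_reason) tuple.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status, reason = get_status()
        if status == "active":
            return
        if status == "error":
            raise RuntimeError(f"Model validation failed: {reason}")
        time.sleep(poll_s)  # still `validating`
    raise TimeoutError("Model did not become active in time")


def deploy_custom_model(archive_path="lora_adapter.zip"):
    # Deferred import so the polling helper works without the package installed.
    from openai import OpenAI

    # Hypothetical client setup: base URL and env variable name are assumptions.
    client = OpenAI(
        base_url="https://api.studio.nebius.com/v1/",
        api_key=os.environ["NEBIUS_API_KEY"],
    )
    # Stage 1: upload the archive with adapter_model.safetensors and
    # adapter_config.json. Skip this call if you deploy from a Hugging Face link.
    with open(archive_path, "rb") as f:
        archive = client.upload_file(file=f)
    # Stage 2: deploy the custom model from the uploaded archive.
    model = client.create_lora_from_file(
        name="test-adapter",
        file_id=archive.id,
        base_model="meta-llama/Llama-3.1-8B-Instruct",
    )
    # Stage 3: wait until the status changes from `validating` to `active`.
    # In a real script, re-fetch the model status inside the callback.
    wait_for_validation(
        lambda: (model.status, getattr(model, "status_reason", None))
    )
    # Stage 4: the deployed model name, e.g.
    # meta-llama/Llama-3.1-8B-Instruct-LoRa:test-adapter-AbCd
    return model.name
```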
- In the web interface, go to the Models section and switch to the Custom tab.
- Click Deploy your LoRA.
- In the window that opens, specify the deployment settings:
- Select a base model.
- Enter a LoRA adapter name.
- If you have uploaded the LoRA adapter model to Hugging Face, select Add by link and enter the link to the model.
- If you have an archive with the LoRA adapter model weights and configuration (`adapter_model.safetensors` and `adapter_config.json`), select Upload. Next, drag and drop the archive into the window. The archive size should not exceed 500 MB.
- Click Start deployment.
The name of a deployed model is composed of the following components:

- Base model name.
- LoRA adapter name.
- Random suffix that is used to make the model name unique.

For example, with the `meta-llama/Llama-3.1-8B-Instruct` base model and the `test-adapter` name for a LoRA adapter, the name of the custom model is `meta-llama/Llama-3.1-8B-Instruct-LoRa:test-adapter-AbCd`.
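The naming pattern above can be expressed as a small helper; the composed name is then passed as the `model` parameter of a regular chat completion request. The base URL and environment variable name are assumptions here.

```python
import os


def custom_model_name(base_model: str, adapter_name: str, suffix: str) -> str:
    """Compose a deployed model name: <base model>-LoRa:<adapter name>-<suffix>."""
    return f"{base_model}-LoRa:{adapter_name}-{suffix}"


def ask(model_name: str, prompt: str) -> str:
    # Deferred import so the naming helper works without the package installed.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.studio.nebius.com/v1/",  # assumed endpoint
        api_key=os.environ["NEBIUS_API_KEY"],
    )
    completion = client.chat.completions.create(
        # e.g. meta-llama/Llama-3.1-8B-Instruct-LoRa:test-adapter-AbCd
        model=model_name,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    return completion.choices[0].message.content
```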
How to deploy a model fine-tuned in Nebius AI Studio
After you fine-tune a model in Nebius AI Studio, you can deploy the resulting model.

- Python
- UI
Run the following Python script. The script stages are the following:

1. Deploys a fine-tuned model from a specific fine-tuning job. In the `create_lora_from_job` method, specify the following:
   - Name for the LoRA adapter, for example, `test-adapter`.
   - Fine-tuning job ID. You can find this ID in the response when you create a fine-tuning job.
   - Checkpoint ID. You can get this ID from a successful fine-tuning job.
   - Base model name that you set when you created the fine-tuning job.
2. Waits for the custom model to be validated. The model first receives the `validating` status. When the model is validated, the status changes to `active`. If the uploaded data is invalid, the status changes to `error`, and `status_reason` contains the error message.
3. Gets the custom model name. The name of a deployed model is composed of the following components:
   - Base model name.
   - LoRA adapter name.
   - Random suffix that is used to make the model name unique.

   For example, with the `meta-llama/Llama-3.1-8B-Instruct` base model and the `test-adapter` name for a LoRA adapter, the name of the custom model is `meta-llama/Llama-3.1-8B-Instruct-LoRa:test-adapter-AbCd`.
4. Creates a multi-message request by using the custom model name.
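The fine-tuned deployment flow can be sketched the same way. `create_lora_from_job` and its inputs follow the stage list above, but its exact namespace and parameter names are assumptions; the checkpoint lookup uses the standard fine-tuning API shape of the OpenAI-compatible client.

```python
def deploy_from_job(client, job_id: str,
                    base_model: str = "meta-llama/Llama-3.1-8B-Instruct",
                    adapter_name: str = "test-adapter") -> str:
    """Deploy a LoRA adapter from a successful fine-tuning job (sketch)."""
    # Checkpoint ID from the fine-tuning job; the last checkpoint is taken here.
    checkpoints = client.fine_tuning.jobs.checkpoints.list(job_id)
    checkpoint_id = checkpoints.data[-1].id
    # Deploy using the method named in the text; parameter names are assumptions.
    model = client.create_lora_from_job(
        name=adapter_name,
        job_id=job_id,
        checkpoint_id=checkpoint_id,
        base_model=base_model,
    )
    return model.name
```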
How to delete a deployed custom model
- Python
- UI