Prerequisites

  1. Choose one of the models supported for fine-tuning.
  2. **Create a dataset** for training. You can optionally create an additional dataset for validation. Split the data between the two datasets so that 80–90% goes to training and 10–20% to validation; see the splitting sketch after these steps. Requirements for validation datasets are the same as for training datasets.
  3. Create an API key for authentication.
  4. Save the API key to an environment variable:
    export NEBIUS_API_KEY=<API_key>
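
    The following is a minimal sketch of one way to split a single JSONL dataset into training and validation files with a 90/10 ratio. The file names and the split ratio are placeholders; adjust them to your data.

    import random

    # Hypothetical file names; replace them with your own
    SOURCE_FILE = "dataset.jsonl"
    TRAIN_FILE = "train.jsonl"
    VALIDATION_FILE = "validation.jsonl"

    with open(SOURCE_FILE, "r", encoding="utf-8") as f:
        samples = f.readlines()

    # Shuffle before splitting so that both files cover the data evenly
    random.seed(42)
    random.shuffle(samples)

    # Keep 90% of the samples for training and 10% for validation
    split_index = int(len(samples) * 0.9)

    with open(TRAIN_FILE, "w", encoding="utf-8") as f:
        f.writelines(samples[:split_index])

    with open(VALIDATION_FILE, "w", encoding="utf-8") as f:
        f.writelines(samples[split_index:])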
    

How to fine-tune a model

  1. Using Python: install the openai package:
    pip3 install openai
    
  2. Import the essential libraries:
    import os
    from openai import OpenAI
    import time
    
  3. Set up the client with the Nebius API key:
    client = OpenAI(
        base_url="https://api.studio.nebius.com/v1/",
        api_key=os.environ.get("NEBIUS_API_KEY"),
    )
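
    To make sure that the key and the base URL are configured correctly, you can optionally list the models available through the API. This is a quick sanity check, not a required step; it assumes the endpoint supports the OpenAI-compatible model list call.

    # Optional sanity check: print the IDs of the models available through the API
    for model in client.models.list().data:
        print(model.id)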
    
  4. Upload a training and a validation dataset. The validation dataset is optional.
    # Upload a training dataset
    training_dataset = client.files.create(
        file=open("<dataset_name>.jsonl", "rb"), # Specify the dataset name
        purpose="fine-tune"
    )
    
    # Upload a validation dataset
    validation_dataset = client.files.create(
        file=open("<dataset_name>.jsonl", "rb"), # Specify the dataset name
        purpose="fine-tune"
    )
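
    The create calls return file objects whose id values are passed to the fine-tuning job in the next step. You can print them to confirm the uploads:

    # These IDs are used as training_file and validation_file below
    print("Training file ID:", training_dataset.id)
    print("Validation file ID:", validation_dataset.id)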
    
    
  5. Configure the fine-tuning parameters. For more information about the tuning job parameters, see the specification of the fine-tuning job object below.
    # Fine-tuning job parameters
    job_request = {
        "model": "<...>",
        "training_file": "training_dataset.id",
        "validation_file": "validation_dataset.id",
        "hyperparameters": {
            "batch_size": "<...>",
            "learning_rate_multiplier": "<...>",
            "n_epochs": "<...>",
            "warmup_ratio": "<...>",
            "weight_decay": "<...>",
            "lora": "<True|False>",
            "lora_r": "<...>",
            "lora_alpha": "<...>",
            "lora_dropout": "<...>",
            "packing": "<True|False>",
            "max_grad_norm": "<...>",
        },
        "integrations": [{
                "type": "wandb",
                "wandb": {
                    "api_key": "<...>",
                    "project": "<...>"
                }
        }]
    }
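
    For illustration, the sketch below fills in the template with plausible values. The model name and the hyperparameter values are assumptions made for this example only; replace them with a model from the supported list and with values suited to your dataset.

    # Example values only; adjust the model name and the hyperparameters to your case
    job_request = {
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed to be in the supported list
        "training_file": training_dataset.id,
        "validation_file": validation_dataset.id,
        "hyperparameters": {
            "n_epochs": 3,
            "batch_size": 8,
            "lora": True,
            "lora_r": 8,
            "lora_alpha": 16,
            "lora_dropout": 0.05,
        },
    }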
    
  6. Create and run the fine-tuning job:
    # Create and run the fine-tuning job
    job = client.fine_tuning.jobs.create(**job_request)
    
  7. Check the job status:
    # Check for the job status
    active_statuses = ["validating_files", "queued", "running"]
    while job.status in active_statuses:
        time.sleep(15)
        job = client.fine_tuning.jobs.retrieve(job.id)
        print("current status is", job.status)
    
    print("Job ID:", job.id)
    
    A freshly started job is in one of the active statuses: validating_files, queued, or running. The script polls the status periodically until the job leaves the active statuses; keep at least 15 seconds between subsequent polls. If the final status is failed, examine the output: it describes the error and how to fix it. If the error code is 500, resubmit the job. To confirm that the training has been successful, check the job events. Events are created when the job status changes. You can consider the training finished if the response contains either the Dataset processed successfully or the Training completed successfully message.
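
    As a minimal sketch, you can also scan the event messages for these strings explicitly (this assumes that the event objects expose a message field, as in the OpenAI-compatible API):

    # Minimal sketch: look for the success messages in the job events
    events = client.fine_tuning.jobs.list_events(job.id)
    success_markers = ("Dataset processed successfully", "Training completed successfully")
    finished = any(
        marker in event.message for event in events.data for marker in success_markers
    )
    print("Training finished:", finished)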
  8. Retrieve the files with the fine-tuned model:
    if job.status == "succeeded":
        # Check the job events
        events = client.fine_tuning.jobs.list_events(job.id)
        print(events)
    
        for checkpoint in client.fine_tuning.jobs.checkpoints.list(job.id).data:
            print("Checkpoint ID:", checkpoint.id)
    
            # Create a directory for every checkpoint
            os.makedirs(checkpoint.id, exist_ok=True)
    
            for model_file_id in checkpoint.result_files:
                # Get the name of a model file
                filename = client.files.retrieve(model_file_id).filename
    
                # Retrieve the contents of the file
                file_content = client.files.content(model_file_id)
    
                # Save the contents into a local file inside the checkpoint directory
                file_content.write_to_file(os.path.join(checkpoint.id, filename))
    
    You get the files for every fine-tuning checkpoint. A checkpoint is created after every epoch of training, so you also get the intermediate results of the training. If you only need the final results, use the files from the last checkpoint. The script creates a directory per checkpoint and saves the files into these directories.
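
    If you are only interested in the final results, the following minimal sketch downloads the files of the last listed checkpoint only; it assumes that the checkpoints are returned in training order.

    # Minimal sketch: download the files of the last listed checkpoint only
    checkpoints = client.fine_tuning.jobs.checkpoints.list(job.id).data
    if checkpoints:
        last_checkpoint = checkpoints[-1]
        os.makedirs(last_checkpoint.id, exist_ok=True)
        for model_file_id in last_checkpoint.result_files:
            filename = client.files.retrieve(model_file_id).filename
            client.files.content(model_file_id).write_to_file(
                os.path.join(last_checkpoint.id, filename)
            )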

API specification for a fine-tuning job

The object below represents the fine-tuning job specification used in the API.
{
    "model": "<...>",
    "suffix": "<...>",
    "training_file": "<...>",
    "validation_file": "<...>",
    "hyperparameters": {
        "batch_size": "<...>",
        "learning_rate_multiplier": "<...>",
        "n_epochs": "<...>",
        "warmup_ratio": "<...>",
        "weight_decay": "<...>",
        "lora": "<true|false>",
        "lora_r": "<...>",
        "lora_alpha": "<...>",
        "lora_dropout": "<...>",
        "packing": "<true|false>",
        "max_grad_norm": "<...>"
    },
    "seed": "<...>",
    "integrations": [
        {
            "type": "wandb",
            "wandb": {
                "api_key": "<...>",
                "project": "<...>"
            }
        }
    ]
}
  • model (string, required): Model to fine-tune.
  • suffix (string, optional): Suffix added to the model name (for example, my-model or my-experiment). It helps you differentiate between fine-tuned models in the model list.
  • training_file (string, required): ID of the file with the training dataset. For more information about how to prepare and upload datasets and how to get their IDs, see the dataset preparation instructions.
  • validation_file (string, optional): ID of the file with the validation dataset.
  • hyperparameters (object, optional): Fine-tuning parameters:
    • batch_size (integer, optional): Number of training examples used in a batch for fine-tuning. A bigger batch size works better with bigger datasets. From 8 to 32. Default: 8.
    • learning_rate (float, optional): Learning rate for training. If you train a model in a domain in which the model has not been trained before, you may need a higher learning rate. Greater or equal to 0. Default: 0.00001.
    • n_epochs (integer, optional): Number of epochs to train on the dataset. An epoch is a cycle of going through the whole dataset for training. For example, if the number of epochs is 10, the model is trained on a given dataset 10 times. From 1 to 20. Default: 3.
    • warmup_ratio (float, optional): Fraction of the training steps during which the learning rate increases linearly from zero to its target value at the beginning of training. From 0 to 1. Default: 0.
    • weight_decay (float, optional): Weight decay value. Weight decay is a regularization technique that adds a penalty to the loss function and keeps fine-tuning weights small. This approach prevents overfitting and preserves generalization, so it is better suited for larger models or more complex tasks. Greater or equal to 0. Default: 0.
    • lora (boolean, optional): Whether to enable LoRA (Low-Rank Adaptation) for training. The LoRA method inserts low-rank matrices into a pre-trained model. These matrices capture task-specific data during the training. As a result, you only train these matrices; you do not need to retrain the whole model or modify its pre-trained weights. If false, full fine-tuning is performed. Default: false.
    • lora_r (integer, optional): Rank of the LoRA adapter weights. A larger rank lets the adapters capture more of the pre-existing model behavior during training, so the model usually trains better, especially on a task it has not been trained for before. However, a rank that is too high can cause overfitting. From 8 to 128. Default: 8.
    • lora_alpha (integer, optional): Alpha value for training LoRA adapters. This parameter balances the influence of low-rank LoRA matrices on pre-existing model weights. If only a slight adjustment of a model is required, use a lower value. Greater or equal to 8. Default: 8.
    • lora_dropout (float, optional): LoRA dropout rate. LoRA dropout is a regularization technique that randomly omits a fraction of the model’s LoRA parameters during training. As a result, this technique helps avoid overfitting on the dataset, especially in cases when the dataset is small and the model should suit more general tasks. From 0 to 1. Default: 0.
    • packing (boolean, optional): Whether to use packing for training. With packing enabled, you can combine multiple small samples in a batch instead of having one sample per batch. This increases training efficiency. Default: true.
    • max_grad_norm (float, optional): Maximum gradient norm value used for gradient clipping (a conceptual sketch of gradient clipping follows this specification). Make sure that the value is neither too small nor too large:
      • A value that is too small clips gradients too aggressively, so weight updates become very small during backpropagation and the model cannot learn quickly enough.
      • A value that is too large fails to prevent the exploding gradient problem: weight gradients grow very large, which leads to unstable and unoptimized training.
      Greater or equal to 0. Default: 1.
  • seed (integer, optional): Random seed that controls the reproducibility of the fine-tuning job. If you run jobs with the same seed and the same parameters, you get approximately the same results. If you use the same seed but different values of other parameters, the results might differ.
  • integrations (array, optional): Integrations that Nebius AI Studio supports for fine-tuning:
    • type (string, optional): Integration type. The supported value is wandb: you can export the model training metrics to a project in Weights & Biases. Nebius AI Studio exports the metrics after you create a fine-tuning job. The service does not export system metrics or logs.
    • wandb (object, optional): Settings for the export to a project in Weights & Biases:
      • api_key (string, optional): API key from Weights & Biases. The key should be 40 characters long.
      • project (string, optional): Name of the project in Weights & Biases.
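
To illustrate what max_grad_norm controls, the following is a minimal, framework-agnostic sketch of gradient clipping by global norm. It is a conceptual illustration only, not part of the API.

# Conceptual sketch of gradient clipping by global norm
def clip_by_global_norm(gradients, max_grad_norm=1.0):
    # Global L2 norm over all gradient values
    global_norm = sum(g * g for g in gradients) ** 0.5
    if global_norm > max_grad_norm:
        # Scale all gradients down so that their global norm equals max_grad_norm
        scale = max_grad_norm / global_norm
        gradients = [g * scale for g in gradients]
    return gradients

print(clip_by_global_norm([3.0, 4.0], max_grad_norm=1.0))  # prints [0.6, 0.8]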