# Fine-Tuning an LLM

{% hint style="info" %}
A training dataset is required to fine-tune an LLM. See [Creating Training Datasets for LLMs](/docs/managing-llm-workflows/creating-training-datasets-for-llms.md).
{% endhint %}

To fine-tune an LLM, follow these steps:

1. Log in to the Aizen Jupyter console. See [Using the Aizen Jupyter Console](/docs/getting-started/using-the-aizen-jupyter-console.md).
2. Set the current working project.

   ```
   set project <project name>
   ```
3. Configure a training experiment by running the `configure training` command:

   ```
   configure training
   ```
4. The notebook guides you through a template form with boxes and drop-down lists that you complete to define the experiment. Select `Deep Learning`, and check the **LLM Fine-Tuning** box. Select the input and output features for the model.
   * If the input to the LLM is a single column in the dataset, that column can contain the entire input text, including the prompt, or you can configure a prompt template. If you supply a prompt template, you must include the name of the input column within curly brackets to indicate where in the prompt text the column contents will be inserted. For example: `"Translate the input into Spanish, where input is {eng_para}"`. In this prompt, `eng_para` is the name of a column in the dataset.
   * If the input to the LLM is two or more columns from the dataset, you must configure a prompt template. In the prompt template, you must include each input column name within curly brackets, indicating where in the prompt text the respective column contents will be inserted. For example: `"Answer the query using the additional context: {addl_context} query: {user_question}"`. In this prompt, `addl_context` and `user_question` are the names of columns in the dataset.
   * Select the base model for the LLM. You can use any Hugging Face model from the Hugging Face Hub as the base model for fine-tuning.
   * You can select the adapter type and quantization bits. The **Advanced Settings** option allows you to set additional parameters, such as RoPE scaling and sampling methods.
5. Schedule a training job by running the `start training` command. Optionally, configure resources for the job first by running the `configure resource` command. If you do not configure resources, default resource settings are applied. Default resource settings do not include GPUs; if you require GPUs, you must run the `configure resource` command.

   ```
   configure resource
   start training <experiment name>
   ```
6. While the job is running, you can check the job status with the `status training` command. To follow training progress on TensorBoard or MLflow, use the URL shown in the `status training` output, or list the TensorBoard or MLflow URL for the training run.

   ```
   status training <experiment name>
   list tensorboard <ML model name>,<run id>
   list mlflow <ML model name>,<run id>
   ```
7. Wait for the job to complete, and then check your training results:

   ```
   status training <experiment name>
   ```
8. List the trained models and display training results:

   ```
   list trained-models <ML model name>
   list trained-model <ML model name>
   ```
9. You can compare the results of various training runs of a given ML model using the TensorBoard or MLflow URL for the ML model:

   ```
   list tensorboard <ML model name>
   list mlflow <ML model name>
   ```
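The prompt-template rule in step 4 (each input column name appears within curly brackets at the point where its contents are inserted) can be illustrated with a short Python sketch. This is not Aizen's implementation: it uses Python's built-in `str.format` as a stand-in for the substitution, and the helper names `template_columns` and `render_prompt`, as well as the sample row, are hypothetical and exist only for illustration. A check like this may help confirm that every placeholder in a template matches a dataset column before you configure the experiment.

```python
import string


def template_columns(template: str) -> list[str]:
    """Return the column names referenced within curly brackets, in order."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text with no placeholder.
    return [field for _, field, _, _ in string.Formatter().parse(template) if field]


def render_prompt(template: str, row: dict) -> str:
    """Insert each column's contents at its placeholder (illustrative only)."""
    missing = [col for col in template_columns(template) if col not in row]
    if missing:
        raise KeyError(f"template references columns not in the dataset: {missing}")
    return template.format(**row)


# Template taken from the two-column example in step 4.
template = (
    "Answer the query using the additional context: "
    "{addl_context} query: {user_question}"
)

# Hypothetical dataset row, made up for this sketch.
row = {
    "addl_context": "The store opens at 9 am.",
    "user_question": "When does the store open?",
}

prompt = render_prompt(template, row)
print(prompt)
```

Validating placeholders up front, rather than letting the substitution fail, gives a clearer error when a template names a column that the dataset does not contain.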

