Fine-Tuning an LLM

A training dataset is required to fine-tune an LLM. See Creating Training Datasets for LLMs.

To fine-tune an LLM, follow these steps:

Log in to the Aizen Jupyter console. See Using the Aizen Jupyter Console.
Set the current working project.
```
set project <project name>
```
Configure a training experiment by running the configure training command:
```
configure training
```
In the notebook, you will be guided through a template form with boxes and drop-down lists that you can complete to define the experiment. Select Deep Learning, and check the LLM Fine-Tuning box. Select the input and output features for the model.
- If the input to the LLM is a single column in the dataset, that column can contain the entire input text, including the prompt, or you can configure a prompt template. If you supply a prompt template, you must include the name of the input column within curly brackets at the point in the prompt text where the column contents will be inserted. For example: "Translate the input into Spanish, where input is {eng_para}". In this prompt, eng_para is the name of a column in the dataset.
- If the input to the LLM is two or more columns from the dataset, you must configure a prompt template. In the prompt template, you must include each input column name within curly brackets, indicating where in the prompt text the respective column contents will be inserted. For example: "Answer the query using the additional context: {addl_context} query: {user_question}". In this prompt, addl_context and user_question are the names of columns in the dataset.
- Select the base model for the LLM. You can use any Hugging Face model from the Hugging Face Hub as the base model for fine-tuning.
- You can select the adapter type and quantization bits. The Advanced Settings option allows you to set additional parameters, such as RoPE scaling and sampling methods.
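The placeholder rules for prompt templates described above can be illustrated with a short Python sketch. This is not Aizen's implementation; it only assumes that the curly-bracket placeholders behave like Python format fields. The helper names `template_fields` and `render_prompt` and the sample row values are hypothetical, chosen for illustration.

```python
import string


def template_fields(template: str) -> list[str]:
    """Return the column names referenced as {name} in a prompt template."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None or empty when a segment has no named placeholder.
    return [field for _, field, _, _ in string.Formatter().parse(template) if field]


def render_prompt(template: str, row: dict[str, str]) -> str:
    """Fill each {column} placeholder with the value from one dataset row."""
    missing = [name for name in template_fields(template) if name not in row]
    if missing:
        raise KeyError(f"columns not found in dataset row: {missing}")
    return template.format(**row)


# Template taken from the two-column example above; the row values are made up.
template = (
    "Answer the query using the additional context: {addl_context} "
    "query: {user_question}"
)
row = {
    "addl_context": "Paris is the capital of France.",
    "user_question": "What is the capital of France?",
}

print(template_fields(template))
print(render_prompt(template, row))
```

Checking the placeholder names against your dataset columns in this way before you configure the experiment can catch a misspelled column name early.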
Schedule a training job for the experiment by running the start training command. Optionally, you can configure resources for the job by running the configure resource command. If you do not configure resources, default resource settings are applied. The default resource settings do not include GPU resources, so if you require GPUs, you must run the configure resource command.
```
configure resource
start training <experiment name>
```
While the job is running, you can check its status by running the status training command. To monitor training progress on TensorBoard or MLflow, use the URL shown in the output of the status training command, or list the TensorBoard or MLflow URL for the training run.
```
status training <experiment name>
list tensorboard <ML model name>,<run id>
list mlflow <ML model name>,<run id>
```
Wait for the job to complete, and then check your training results:
```
status training <experiment name>
```

List the trained models and display training results:

```
list trained-models <ML model name>
list trained-model <ML model name>
```

You can compare the results of various training runs of a given ML model using the TensorBoard or MLflow URL for the ML model:
```
list tensorboard <ML model name>
list mlflow <ML model name>
```

