Training an ML Model

A training dataset is required to train an ML model. See Creating Training Datasets.

To train an ML model, follow these steps:

  1. Log in to the Aizen Jupyter console. See Using the Aizen Jupyter Console.

  2. Set the current working project:

    set project <project name>

  3. Configure a training experiment by running the configure training command:

    configure training

  4. The notebook displays a template form with input boxes and drop-down lists. Complete the form to define the experiment. At a minimum, choose either Machine Learning or Deep Learning, and specify the input features and the output (label) features from the dataset. Optionally, specify the feature types and set any of the options under the advanced settings.

  5. Optionally, configure resources for the job by running the configure resource command. If you skip this, default resource settings are applied. Then run the start training command to schedule the training job.

    configure resource
    start training <experiment name>

  6. While the job is running, you may check the job status with the status training command and follow the training progress in TensorBoard or MLflow. To open TensorBoard or MLflow, use the URL shown in the status training output, or list the TensorBoard or MLflow URL for the training run:

    status training <experiment name>
    list tensorboard <ML model name>,<run id>
    list mlflow <ML model name>,<run id>

  7. Wait for the job to complete, and then check your training results:

    status training <experiment name>

  8. List the trained models and display training results:

    list trained-models <ML model name>
    list trained-model <ML model name>

  9. To compare the results of different training runs of an ML model, use the TensorBoard or MLflow URL for that model:

    list tensorboard <ML model name>
    list mlflow <ML model name>
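
For quick reference, the commands from the steps above can be read as one notebook session. This is only a recap sketch: it assumes each command is run on its own, in the order shown, and that the form from step 4 is completed interactively after configure training. The angle-bracket placeholders are the same ones used in the steps and must be replaced with your own names. The configure resource command remains optional.

    set project <project name>
    configure training
    configure resource
    start training <experiment name>
    status training <experiment name>
    list tensorboard <ML model name>,<run id>
    list mlflow <ML model name>,<run id>
    list trained-models <ML model name>
    list trained-model <ML model name>
    list tensorboard <ML model name>
    list mlflow <ML model name>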
