Training and Serving Custom ML Models

Aizen recommends auto-ML for model training and serving because it handles most scenarios well and is simple to run. Use the configure training and configure prediction commands to build an auto-ML pipeline.

If you need specialized ML models that auto-ML does not support, you can create custom ML models. Aizen provides a configure runtime command that supports custom ML model training and serving. This command deploys a runtime service with a JupyterLab notebook interface in which you train a custom ML model, a mechanism very similar to Google’s Colab service.

Training a Custom ML Model

To train a custom ML model, follow these steps:

  1. Log in to the Aizen Jupyter console. See Using the Aizen Jupyter Console.

  2. Set the current working project.

    set project <project name>
  3. Configure a runtime using the configure runtime command. In the notebook, you will be guided through a template form with text boxes and drop-down lists that you complete to define the runtime. Specify the runtime name and select Model Training. Specify any pip requirements file that should be installed during initialization. Specify any datasets that need to be made available in the runtime. If required, specify a previously trained model name and the run that needs to be made available in the runtime.

    configure runtime
  4. Configure a resource for the runtime and start the runtime:

    configure resource
    start runtime <runtime name>
  5. Check the status of the runtime. The command output provides URLs for accessing a JupyterLab notebook connected to the runtime:

    status runtime <runtime name>
  6. Connect to the JupyterLab notebook URL, which is displayed in the status runtime output. This is a separate JupyterLab notebook from the Aizen Jupyter console. You may perform pip installs, data analysis, and any custom ML model training in the runtime JupyterLab notebook.

    Aizen provides a Python module with helper functions to list datasets, load datasets, save a trained custom ML model to the Aizen platform, and load a saved model from the Aizen platform. Import the aizen module to access these functions in the runtime JupyterLab notebook (an example session is sketched after these steps):

    import aizen
    aizen.help()
    aizen.help("save_model")
    aizen.help("list_datasets")
  7. After training your custom ML model, save the model using the Aizen Python module:

    import aizen
    aizen.save_model(…)
  8. If you plan to serve the trained model, create a serving Python module containing pydantic functions that serve the model for inference requests; a sketch of such a module appears at the end of the Serving a Custom ML Model section below. The serving module will be needed later when deploying the model. Each function is advertised as a FastAPI endpoint for REST requests, so the functions must be fully type-annotated. The serving Python module must load the model by calling the Aizen read_model function.

    import aizen
    aizen.read_model(…)

    Test the model serving module using the aizen start_model_serve and stop_model_serve functions:

    import aizen
    aizen.start_model_serve(…)
    aizen.stop_model_serve(…)

    After you have tested the serving Python module, download it to your laptop and upload it to the Aizen Jupyter console notebook.

  9. Stop the runtime when it is no longer needed. Stopping terminates the runtime and deletes any data associated with it; the runtime does not provide permanent storage. Any models not saved with the Aizen save_model function will be lost. Any data downloaded into the runtime, other than via Aizen datasets, will not be retained. Any code developed in the runtime, such as serving Python modules, will not be retained; you must download serving Python modules to your laptop and upload them to the console notebook. The runtime JupyterLab notebook that trained the custom ML model will also not be retained, so save it to GitHub, cloud storage, or some other permanent storage.

    stop runtime <runtime name>
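
The following is a minimal sketch of what a training session in the runtime JupyterLab notebook might look like. The load_dataset and save_model argument forms, the dataset name, and the model name below are assumptions for illustration, not the definitive Aizen API; run aizen.help() in your runtime to confirm the actual function names and signatures.

    import aizen
    from sklearn.ensemble import RandomForestClassifier

    # List the datasets that were made available to this runtime.
    print(aizen.list_datasets())

    # Load a dataset into a DataFrame. The dataset name and the exact
    # load_dataset signature are assumptions; check aizen.help().
    df = aizen.load_dataset("my_training_dataset")

    # Train any custom model; scikit-learn is used here only as an example.
    X, y = df.drop(columns=["label"]), df["label"]
    model = RandomForestClassifier().fit(X, y)

    # Persist the model to the Aizen platform so that it survives runtime
    # shutdown. The argument form is an assumption; check aizen.help("save_model").
    aizen.save_model(model, "my_custom_model")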

Serving a Custom ML Model

To serve a custom ML model, follow these steps:

  1. Log in to the Aizen Jupyter console. See Using the Aizen Jupyter Console.

  2. Set the current working project.

    set project <project name>
  3. Configure a runtime using the configure runtime command. In the notebook, you will be guided through a template form with text boxes and drop-down lists that you complete to define the runtime. Specify the runtime name and select Model Serving. Specify any pip requirements file that should be installed during initialization. Specify the trained model name and the run that needs to be made available in the runtime. Select the serving Python module and the pydantic functions that serve the trained model; a sketch of such a module appears after these steps. Each function is advertised as a FastAPI endpoint for REST requests, so the functions must be fully type-annotated. The serving Python module must load the model by calling the Aizen read_model function. During development, the serving module can be coded and tested in a runtime that was configured for training.

    configure runtime
  4. Configure a resource for the runtime and start the runtime:

    configure resource
    start runtime <runtime name>
  5. Check the status of the runtime. The command output provides the REST endpoint URLs for sending inference requests to the serving module:

    status runtime <runtime name>
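
The following is a minimal sketch of a serving Python module, assuming the model was saved as my_custom_model in the training steps above. The read_model argument form, the request and response models, and the predict function are illustrative assumptions, not the definitive Aizen API; run aizen.help() in a runtime to confirm the actual signatures.

    # serve_model.py -- hypothetical serving module for a custom ML model
    import aizen
    from pydantic import BaseModel

    # Load the trained model once, when the module is loaded. The argument
    # form is an assumption; confirm with aizen.help() in a runtime.
    model = aizen.read_model("my_custom_model")

    class PredictRequest(BaseModel):
        features: list[float]

    class PredictResponse(BaseModel):
        label: int

    # A fully type-annotated function that Aizen can advertise as a
    # FastAPI endpoint for REST inference requests.
    def predict(request: PredictRequest) -> PredictResponse:
        prediction = model.predict([request.features])[0]
        return PredictResponse(label=int(prediction))

During development, you can exercise a module like this in a training runtime with the aizen start_model_serve and stop_model_serve functions, then send test REST requests to the endpoint URLs reported by status runtime.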