
Typical ML Workflow


Last updated 3 months ago

© 2025 Aizen Corporation

This diagram shows a typical real-time machine learning (ML) workflow in Aizen.

ML Workflow Steps

The ML workflow consists of these steps:

Step 1: Configure Data Sources

  • Define data sources and connect them to the Aizen platform. These data sources contain the historical data that will be used to train your ML model; they are typically database tables or CSV files external to the Aizen platform.
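In notebook-command terms, this step uses the Data Source Commands. The sequence below is a sketch only; the exact arguments and any interactive prompts are documented in the "configure datasource" reference:

```
-- Sketch: connect a historical data source (arguments omitted).
configure datasource       -- define the source (e.g. a database table or CSV file)
describe datasource        -- review the configured source
listconfig datasources     -- list all configured sources
```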

Step 2: Configure Data Sinks

  • Configure data sinks in Aizen. A data sink is a table in Aizen storage. Each data sink connects to a data source.

  • Define constraints and metrics on the data sink. Constraints specify the data checks to perform when pulling data from a data source into a data sink. Metrics specify the data analytics to perform on data while it is being placed in the data sink.
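This step maps onto the Data Sink Commands. The sketch below is illustrative only; how constraints and metrics are declared is covered in the "configure datasink" reference:

```
-- Sketch: define a data sink and begin pulling historical data into it.
configure datasink     -- define the sink table, its data source,
                       -- and any constraints and metrics
start datasink         -- begin loading data from the source
status datasink        -- check the load job
```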

Step 3: Configure and Create Training Datasets

  • Configure a dataset and define ML features for the dataset.

  • Create the training dataset. This action materializes the ML features that were defined for that dataset. It is achieved by scheduling a job and then waiting for the job to complete.

  • Explore training datasets. You may explore and analyze the training dataset to check metrics such as minimum, maximum, average, missing values, or the correlation of features and labels.
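These steps correspond to the Dataset Commands and Data Analysis Commands. The sequence below is a sketch, not exact syntax:

```
-- Sketch: define ML features, then materialize them as a training dataset.
configure dataset      -- define the dataset and its ML features
start dataset          -- schedule the materialization job
status dataset         -- wait for the job to complete
display dataset        -- inspect the materialized data
show stats             -- explore metrics such as min, max, average
```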

Step 4: Train Model

  • Configure a training experiment. Aizen is an auto-ML platform: at a minimum, you specify only the input feature names, the output (or label) feature names, and whether to use machine learning (ML) or deep learning (DL) algorithms.

  • Train an ML model. This action runs the training experiment that was configured. It is achieved by scheduling a job and then waiting for the job to complete.

  • View training results. You may view the trained model results and metrics using tools such as TensorBoard and MLflow.

  • If the training results are satisfactory, proceed to Step 5. Otherwise, repeat Steps 1 to 4.
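This step uses the Training Commands. The sequence below is a sketch only; exact arguments are in the "configure training" reference:

```
-- Sketch: configure and run a training experiment, then inspect results.
configure training     -- specify input features, label features, ML vs. DL
start training         -- schedule the training job
status training        -- wait for the job to complete
start tensorboard      -- view training metrics in TensorBoard
list mlflow            -- view experiment tracking in MLflow
```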

Step 5: Add Real-Time Data Sources

You can skip this step if you do not have real-time data.

  • Configure real-time data sources to connect to data sinks. The real-time data sources, such as a Kafka stream, must provide real-time data that corresponds to the historical data sources that you used in Step 2.

  • Add the real-time data source to the data sink. This action runs a job that periodically and continuously fetches data from the real-time data source and stores it in the data sink. Any ML features that were defined from the data sink can now be materialized with real-time data.
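A real-time source is configured like any other data source and then attached to its data sink. The sketch below is illustrative; which command performs the attach (shown here as `alter datasink`) is an assumption to verify against the Data Sink Commands reference:

```
-- Sketch: attach a real-time source (e.g. a Kafka stream) to a data sink.
configure datasource   -- define the real-time source
alter datasink         -- attach the real-time source to the sink (assumed)
status datasink        -- verify the continuous fetch job is running
```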

Step 6: Serve Model

  • Register a trained ML model. Models must be registered for deployment.

  • Configure a prediction deployment. This defines the name of the deployment, along with the registered ML model name and version. The configured prediction is deployed by scheduling a job. The job status will provide the URL that external prediction applications can use to make prediction requests.

  • Explore the prediction log. You may explore the prediction log for data drift analysis and prediction accuracy.
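The serving flow corresponds to the Prediction Commands; arguments are omitted in this sketch:

```
-- Sketch: register a trained model and deploy it for predictions.
register model            -- register the trained model for deployment
configure prediction      -- name the deployment; set model name and version
start prediction          -- schedule the deployment job
status prediction         -- reports the prediction request URL
list prediction-logs      -- review logs for drift and accuracy analysis
display prediction-log
```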

[Diagram: Aizen Real-Time ML Workflow]