LLM Operations
The Aizen platform supports fine-tuning and serving large language models (LLMs). You can also configure and deploy AI agents on Aizen; agents can use LLMs hosted on the Aizen platform or LLMs hosted by external service providers, such as OpenAI.
This diagram shows an Aizen LLM operations (LLMOps) deployment:
Fine-tune LLMs by configuring and starting a training experiment. You can use any model from the Hugging Face Hub as the base model for fine-tuning.
You can specify a prompt template when configuring the experiment. Alternatively, you can include the prompt in the dataset itself.
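As a minimal sketch of these two options, the snippet below shows a placeholder-style template alongside a dataset row that carries the full prompt itself. The `{instruction}`/`{response}` placeholder syntax is an assumption for illustration, not Aizen's documented template format:

```python
# Option 1: a prompt template supplied when configuring the experiment.
# The placeholder syntax below is an illustrative assumption.
prompt_template = (
    "### Instruction:\n{instruction}\n\n### Response:\n{response}"
)

# Option 2: include the fully rendered prompt in the dataset itself.
dataset_row = {
    "text": "### Instruction:\nSummarize the attached report.\n\n### Response:\n..."
}
```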
You can specify fine-tuning parameters, such as the adapter type, quantization bits, scaling, and sampling.
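For context, the adapter type, quantization bits, and scaling settings map naturally onto LoRA-style fine-tuning. The sketch below shows one common way such knobs are expressed, using the Hugging Face PEFT and Transformers libraries; Aizen's own configuration keys may differ, so treat the names as illustrative:

```python
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# Quantization bits: load the base model's weights in 4-bit precision.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
)

# Adapter type and scaling: a LoRA adapter whose effective scaling
# factor is lora_alpha / r.
adapter_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```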
LLMs can be served for inference on Aizen. The served model can be an LLM that was fine-tuned on Aizen or a pretrained LLM taken directly from the Hugging Face Hub. Embedding models from the Hugging Face Hub can also be served on Aizen.
If GPU hosting is required, you can assign GPU resources for the LLM deployment.
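Once deployed, a served model is typically reached over HTTP. The sketch below assumes an OpenAI-style chat endpoint; the URL, path, and payload shape are illustrative assumptions, not Aizen's documented serving API:

```python
import requests

# Hypothetical inference endpoint; URL and payload are assumptions.
ENDPOINT = "https://aizen.example.com/serving/my-llm/v1/chat/completions"

resp = requests.post(
    ENDPOINT,
    json={
        "model": "my-fine-tuned-llm",
        "messages": [{"role": "user", "content": "Summarize our Q3 results."}],
        "temperature": 0.2,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```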
Aizen provides vector store services. You can create vector stores and then create Store IDs within a vector store. A Store ID is a repository of documents. In database terms, a vector store is analogous to a database, and a Store ID to a table.
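A minimal sketch of that structure follows; the client class and method names are hypothetical, chosen only to illustrate the database/table relationship, and are not Aizen's documented SDK:

```python
# Hypothetical vector store client; all names are illustrative assumptions.
from aizen_client import VectorStoreService  # assumed module name

svc = VectorStoreService(api_key="...")

# A vector store plays the role of a database...
store = svc.create_vector_store(name="product_docs")

# ...and a Store ID plays the role of a table within it.
store_id = store.create_store_id(name="user_manuals")
```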
The vector store service supports operations such as uploading documents to a Store ID, scraping websites, uploading from S3 cloud buckets, and retrieving semantically similar texts from a Store ID.
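Continuing the hypothetical client sketch above, these operations might look like the following (again, the method names are illustrative assumptions):

```python
# Ingest documents into the Store ID from several sources.
store_id.upload_documents(["./docs/manual_v1.pdf", "./docs/manual_v2.pdf"])
store_id.scrape_website("https://example.com/support")
store_id.upload_from_s3(bucket="acme-docs", prefix="manuals/")

# Retrieve semantically similar text for a query.
hits = store_id.retrieve("How do I reset the device?", top_k=5)
for hit in hits:
    print(hit.score, hit.text[:80])
```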
Aizen supports deploying LLM applications, such as AI agents. AI agents can interact with LLMs served by Aizen or by an external provider, such as OpenAI, and with vector stores for retrieval-augmented generation (RAG) applications.
Aizen-deployed AI agents support a variety of tools, which you can easily configure using tool templates. Supported tool templates are RAG Query, REST Query, SQL Query, Web Search, and Custom Functions.
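As a sketch, an agent configuration might tie these pieces together as shown below. The schema and field names are illustrative assumptions, not a documented format:

```python
# Hypothetical agent configuration; keys and values are assumptions.
agent_config = {
    "name": "support-agent",
    # Either an Aizen-served model endpoint or an external provider.
    "llm": {
        "provider": "aizen",
        "endpoint": "https://aizen.example.com/serving/my-llm",
    },
    "tools": [
        # Tool templates as listed above.
        {"template": "RAG Query", "vector_store": "product_docs",
         "store_id": "user_manuals"},
        {"template": "Web Search"},
        {"template": "SQL Query", "connection": "warehouse"},
    ],
}
```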