Installing Lamini Platform on Kubernetes
Note
The Lamini installer is only available when self-managing Lamini Platform. Contact us to learn more.
Lamini Platform on Kubernetes enables multi-node, multi-GPU inference and training running on your own GPUs, in the environment of your choice.
Prerequisites
Lamini Self-Managed license
Contact us for access to the Kubernetes installer to host Lamini Platform on your own GPUs or in your cloud VPC.
Hardware system requirements
- 64 GB CPU memory
- 1 TB disk
- CPU memory: 2x the HBM of each GPU
  - Example: AMD MI250 has 64 GB of HBM, so Lamini requires 128 GB of RAM per GPU.
  - Example: AMD MI300 has 192 GB of HBM, so Lamini requires 384 GB of RAM per GPU.
NFS Provisioner
Lamini requires an NFS provisioner that supports ReadWriteMany (RWX) volumes. For example, you can set up a simple provisioner using nfs-subdir-external-provisioner:

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=<NFS_IP> \
    --set nfs.path=<NFS_SUBFOLDER_PATH>
GPU Operator
Install the GPU operator that matches your hardware:
- For AMD: install the AMD GPU Operator.
- For NVIDIA: install the NVIDIA GPU Operator.
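As one illustration, the NVIDIA GPU Operator is commonly installed from NVIDIA's public Helm repository (a sketch based on NVIDIA's published instructions; check their documentation for the current procedure and chart options):

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait gpu-operator -n gpu-operator --create-namespace nvidia/gpu-operator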
Docker
We recommend using Docker to generate the Helm charts as described below.
Make sure to switch to a user with sufficient RBAC privileges to deploy the Helm charts in the Kubernetes cluster. Typically that's root (sudo su).
Installation Steps
1. Update helm_config.yaml
- Set the name of the PVC provisioner being used for the Lamini cluster. If the PVC has been created beforehand, ensure the name is correct, that it is in the lamini namespace, and set create to False:
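A sketch of what this section of helm_config.yaml could look like (the key names here are assumptions, not the authoritative schema; refer to the helm_config.yaml shipped with your installer):

lamini-volume:
  name: lamini-volume      # must match the PVC name if it already exists
  namespace: lamini
  create: true             # set to False if the PVC was created beforehand
  size: 200Gi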
We recommend at least 200Gi (and the more, the better!) for lamini-volume. Base models, trained weights, and datasets will all be stored on this volume.
- Update the PVC provisioner classname by changing the pvc_provisioner field.
- Confirm the top-level platform type (one of: amd, nvidia, or cpu) matches your hardware.
- Update the distribution of inference pods.
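A sketch of what such a configuration might look like in helm_config.yaml (the field names here are assumptions, not the authoritative schema; check the file shipped with your installer):

inference:
  gpus_per_pod: 1    # hypothetical key: GPUs allocated to each inference pod
  pods:
    batch: 1         # preloaded models, batch requests
    streaming: 1     # streaming requests only
    embedding: 1     # embeddings (also used in classification)
    catchall: 1      # models not preloaded on the batch pod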
The example above would create 4 pods using 4 GPUs in total, with 1 GPU per pod: 1 inference pod allocated to batch inference, 1 pod dedicated only to streaming inference, 1 dedicated only to embedding inference (also used in classification), and 1 catchall pod, which is intended to handle requests for models that have not been preloaded on the batch pod. See Model Management for more details.
- Update the number of training pods and number of GPUs per pod:
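For instance (a sketch with assumed key names):

training:
  pods: 1
  gpus_per_pod: 8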
We recommend minimizing the number of pods per node. For example, on an 8-GPU node, instead of 2 pods with 4 GPUs each, it's better to create 1 pod with all 8 GPUs.
- Update the node affinity for the Lamini deployment. These are the nodes where Lamini pods will be deployed:
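A sketch of a node affinity entry (the key names are assumptions; the label and values must match your cluster's actual node labels, and the node names below are hypothetical):

node_affinity:
  key: kubernetes.io/hostname
  values:
    - gpu-node-1
    - gpu-node-2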
- (Optional) If you want to use a custom ingress pathway, update the ingress field:
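For example (a sketch; the exact keys depend on your helm_config.yaml schema and your cluster's ingress controller, and the hostname below is hypothetical):

ingress:
  host: lamini.example.com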
2. Generate and Deploy Helm Charts
The Lamini Platform Kubernetes deployment consists of 2 Helm charts:
- lamini: Most Lamini services. Will be upgraded when updating to a new version of Lamini Platform.
- persistent-lamini: Components that are meant to be persistent and unchanging across Lamini Platform upgrades. These include the Lamini database, the PVC, and the Kubernetes secret used to download new Lamini images.
First-time install
Run ./install.sh. This takes helm_config.yaml and dynamically creates the Helm charts to deploy lamini and persistent-lamini.
Upgrade
Run ./upgrade.sh. This recreates the Helm charts and redeploys lamini without touching persistent-lamini.
That's it! You're up and running with Lamini Platform on Kubernetes.
Get Lamini version
Run the following command to find the tag of the container image:
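The exact command depends on your setup; one way to list image tags, assuming the default lamini namespace, is kubectl with custom columns:

kubectl get pods -n lamini -o custom-columns='NAME:.metadata.name,IMAGES:.spec.containers[*].image'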
Look for the tag of the images listed in the IMAGES column (the part after the : in each image reference).