Kubeflow Pipelines

Kubeflow Pipelines is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. Deploying ML workflows has often been challenging, and Kubeflow Pipelines addresses this by letting you create and run them on Kubernetes clusters. It provides a UI where you can create, run, and manage jobs, which in turn run as pods on the Kubernetes cluster. You can view the pipeline graphs and the whole execution workflow from the UI itself.

According to the Kubeflow Pipelines documentation:

The Kubeflow Pipelines platform consists of:

  1. A user interface (UI) for managing and tracking experiments, jobs, and runs.
  2. An engine for scheduling multi-step ML workflows.
  3. An SDK for defining and manipulating pipelines and components.
  4. Notebooks for interacting with the system using the SDK.

The following are the goals of Kubeflow Pipelines:

  1. End-to-end orchestration: enabling and simplifying the orchestration of machine learning pipelines.
  2. Easy experimentation: making it easy for you to try numerous ideas and techniques and manage your various trials/experiments.
  3. Easy re-use: enabling you to re-use components and pipelines to quickly create end-to-end solutions without having to rebuild each time.

In this Learn guide, I will walk you through a standalone Kubeflow Pipelines installation on a Civo managed k3s cluster.

Cluster creation

First, we will create a Civo cluster using the Civo CLI, which you need to download and install beforehand.

civo kubernetes create kubeflow-pipeline
The cluster kubeflow-pipeline (2aa555eb-9016-4fd4-9e48-027fa767efb7) has been created
# Save the kubeconfig for the new cluster
civo kubernetes config kubeflow-pipeline -s
export KUBECONFIG=~/.kube/config

kubectl get nodes  
NAME               STATUS   ROLES    AGE     VERSION
kube-master-e06e   Ready    master   9m48s   v1.18.6+k3s1
kube-node-b69c     Ready    <none>   9m11s   v1.18.6+k3s1
kube-node-0e18     Ready    <none>   9m1s    v1.18.6+k3s1

Kubeflow Pipeline Installation

export PIPELINE_VERSION=1.0.1

# Install the namespace, CRDs, and other cluster-scoped resources
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"

namespace/kubeflow created
customresourcedefinition.apiextensions.k8s.io/applications.app.k8s.io created
customresourcedefinition.apiextensions.k8s.io/clusterworkflowtemplates.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/cronworkflows.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/scheduledworkflows.kubeflow.org created
customresourcedefinition.apiextensions.k8s.io/viewers.kubeflow.org created
customresourcedefinition.apiextensions.k8s.io/workflows.argoproj.io created
customresourcedefinition.apiextensions.k8s.io/workflowtemplates.argoproj.io created
serviceaccount/kubeflow-pipelines-cache-deployer-sa created
clusterrole.rbac.authorization.k8s.io/kubeflow-pipelines-cache-deployer-clusterrole created
clusterrolebinding.rbac.authorization.k8s.io/kubeflow-pipelines-cache-deployer-clusterrolebinding created
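
The upstream standalone instructions pause between the two kubectl apply steps until the Application CRD is registered, because the second step creates an Application object. A minimal sketch of that wait, wrapped in a hypothetical helper (the name wait_for_crd and the 60s default timeout are assumptions, not part of the original commands):

```shell
# Hypothetical helper: block until a CRD reports the "established" condition.
# $1: CRD name, $2: optional timeout (defaults to 60s)
wait_for_crd() {
  kubectl wait --for condition=established --timeout="${2:-60s}" "crd/$1"
}

# Usage: wait_for_crd applications.app.k8s.io
```

Running it before the next apply avoids a race where the Application object is submitted before the API server knows its type.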

# Install the Kubeflow Pipelines components into the kubeflow namespace
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
serviceaccount/application created
serviceaccount/argo created
serviceaccount/kubeflow-pipelines-cache created
serviceaccount/kubeflow-pipelines-container-builder created
serviceaccount/kubeflow-pipelines-metadata-writer created
serviceaccount/kubeflow-pipelines-viewer created
serviceaccount/ml-pipeline-persistenceagent created
serviceaccount/ml-pipeline-scheduledworkflow created
serviceaccount/ml-pipeline-ui created
serviceaccount/ml-pipeline-viewer-crd-service-account created
serviceaccount/ml-pipeline-visualizationserver created
serviceaccount/ml-pipeline created
serviceaccount/pipeline-runner created
role.rbac.authorization.k8s.io/application-manager-role created
role.rbac.authorization.k8s.io/argo-role created
role.rbac.authorization.k8s.io/kubeflow-pipelines-cache-deployer-role created
role.rbac.authorization.k8s.io/kubeflow-pipelines-cache-role created
role.rbac.authorization.k8s.io/kubeflow-pipelines-metadata-writer-role created
role.rbac.authorization.k8s.io/ml-pipeline-persistenceagent-role created
role.rbac.authorization.k8s.io/ml-pipeline-scheduledworkflow-role created
role.rbac.authorization.k8s.io/ml-pipeline-ui created
role.rbac.authorization.k8s.io/ml-pipeline-viewer-controller-role created
role.rbac.authorization.k8s.io/ml-pipeline created
role.rbac.authorization.k8s.io/pipeline-runner created
rolebinding.rbac.authorization.k8s.io/application-manager-rolebinding created
rolebinding.rbac.authorization.k8s.io/argo-binding created
rolebinding.rbac.authorization.k8s.io/kubeflow-pipelines-cache-binding created
rolebinding.rbac.authorization.k8s.io/kubeflow-pipelines-cache-deployer-rolebinding created
rolebinding.rbac.authorization.k8s.io/kubeflow-pipelines-metadata-writer-binding created
rolebinding.rbac.authorization.k8s.io/ml-pipeline-persistenceagent-binding created
rolebinding.rbac.authorization.k8s.io/ml-pipeline-scheduledworkflow-binding created
rolebinding.rbac.authorization.k8s.io/ml-pipeline-ui created
rolebinding.rbac.authorization.k8s.io/ml-pipeline-viewer-crd-binding created
rolebinding.rbac.authorization.k8s.io/ml-pipeline created
rolebinding.rbac.authorization.k8s.io/pipeline-runner-binding created
configmap/metadata-grpc-configmap created
configmap/ml-pipeline-ui-configmap created
configmap/pipeline-install-config-68h9cgfc7d created
configmap/workflow-controller-configmap created
secret/mlpipeline-minio-artifact created
secret/mysql-secret-fd5gktm75t created
service/cache-server created
service/controller-manager-service created
service/metadata-envoy-service created
service/metadata-grpc-service created
service/minio-service created
service/ml-pipeline-ui created
service/ml-pipeline-visualizationserver created
service/ml-pipeline created
service/mysql created
deployment.apps/cache-deployer-deployment created
deployment.apps/cache-server created
deployment.apps/controller-manager created
deployment.apps/metadata-envoy-deployment created
deployment.apps/metadata-grpc-deployment created
deployment.apps/metadata-writer created
deployment.apps/minio created
deployment.apps/ml-pipeline-persistenceagent created
deployment.apps/ml-pipeline-scheduledworkflow created
deployment.apps/ml-pipeline-ui created
deployment.apps/ml-pipeline-viewer-crd created
deployment.apps/ml-pipeline-visualizationserver created
deployment.apps/ml-pipeline created
deployment.apps/mysql created
deployment.apps/workflow-controller created
application.app.k8s.io/pipeline created
persistentvolumeclaim/minio-pvc created
persistentvolumeclaim/mysql-pv-claim created
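
The deployments need a few minutes to pull images and start. One way to check progress is to filter the pod listing for anything that is not ready yet. The helper below is a hypothetical sketch (the name not_ready_pods is an assumption); it reads the output of kubectl get pods -n kubeflow --no-headers on stdin and prints the names of pods whose READY column is not n/n, skipping pods that have already Completed.

```shell
# Hypothetical helper: print pods that are not fully ready.
# Expects the columns NAME READY STATUS RESTARTS AGE on stdin (no header line).
not_ready_pods() {
  awk '{
    split($2, ready, "/")                     # READY looks like "1/1" or "0/2"
    if ($3 != "Completed" && ready[1] != ready[2]) print $1
  }'
}

# Usage: kubectl get pods -n kubeflow --no-headers | not_ready_pods
```

An empty result suggests every pod is up, so it should be safe to move on to the Ingress.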

# Apply the Ingress, replacing {cluster-Id} in the host line with your cluster ID.
# The quoted 'EOF' keeps the shell from expanding $2 inside the manifest.
kubectl apply -f - << 'EOF'
  apiVersion: "networking.k8s.io/v1beta1"
  kind: "Ingress"
  metadata:
    name: "example-ingress"
    namespace: kubeflow
    annotations:
      nginx.ingress.kubernetes.io/rewrite-target: /$2
  spec:
    ingressClassName: "traefik-lb"
    rules:
    - host: pipelines.{cluster-Id}.k8s.civo.com
      http:
        paths:
        - path: "/"
          backend:
            serviceName: "ml-pipeline-ui"
            servicePort: 80
EOF

ingress.networking.k8s.io/example-ingress created

kubectl get ing -A
NAMESPACE   NAME              CLASS        HOSTS                                                         ADDRESS           PORTS   AGE
kubeflow    example-ingress   traefik-lb   pipelines.2aa555eb-9016-4fd4-9e48-027fa767efb7.k8s.civo.com   185.136.232.203   80      18s
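
The {cluster-Id} placeholder in the manifest has to be replaced by hand. As a sketch of how that substitution could be scripted, the two hypothetical helpers below (both names are assumptions) build the host name from a cluster ID and fill the placeholder in a manifest read from stdin. The host pattern is the one shown in the ingress listing above.

```shell
# Hypothetical helper: build the pipelines host name for a Civo cluster ID.
pipelines_host() {
  printf 'pipelines.%s.k8s.civo.com\n' "$1"
}

# Hypothetical helper: replace the {cluster-Id} placeholder in a manifest on stdin.
# Assumes the ID contains no "/" or "&" characters (true for UUIDs).
fill_cluster_id() {
  sed "s/{cluster-Id}/$1/g"
}

# Usage: fill_cluster_id 2aa555eb-9016-4fd4-9e48-027fa767efb7 < ingress.yaml | kubectl apply -f -
```

Here ingress.yaml is an assumed file name holding the manifest from the step above.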

Now you can access the Kubeflow Pipelines UI at the host shown in the Ingress object.
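
If the Ingress host is not reachable yet, the upstream Kubeflow Pipelines instructions use a port-forward to the ml-pipeline-ui service instead. A sketch of that fallback, with the helper name and the default local port 8080 as assumptions:

```shell
# Hypothetical helper: forward a local port to the Kubeflow Pipelines UI service.
# $1: optional local port (defaults to 8080)
forward_pipelines_ui() {
  kubectl port-forward -n kubeflow svc/ml-pipeline-ui "${1:-8080}:80"
}

# Usage: forward_pipelines_ui    # then open http://localhost:8080
```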

Kubeflow Dashboard

Go to the sample pipeline and click Create Run.

Click Start.

You will be able to see the complete execution flow.

You can also view the input/output for each workflow component.

All the components run as Kubernetes pods:

kubeflow      file-passing-pipelines-6f4bk-2984671871           0/2     Completed   0          2m50s
kubeflow      file-passing-pipelines-6f4bk-3546137079           0/2     Completed   0          2m31s
kubeflow      file-passing-pipelines-6f4bk-3529359460           0/2     Completed   0          2m31s
kubeflow      file-passing-pipelines-6f4bk-927521251            0/2     Completed   0          2m50s
kubeflow      file-passing-pipelines-6f4bk-4034404647           0/2     Completed   0          2m50s
kubeflow      file-passing-pipelines-6f4bk-215922839            0/2     Completed   0          2m10s
kubeflow      file-passing-pipelines-6f4bk-965243784            0/2     Completed   0          2m9s
kubeflow      file-passing-pipelines-6f4bk-3512581841           0/2     Completed   0          2m10s
kubeflow      file-passing-pipelines-6f4bk-3495804222           0/2     Completed   0          2m4s
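
A listing like the one above can be produced by filtering the pods of all namespaces by the run name. The helper below is a hypothetical sketch (the name pipeline_pods is an assumption); the prefix file-passing-pipelines comes from the sample run shown here and will differ for other pipelines.

```shell
# Hypothetical helper: list pods in all namespaces whose line matches a run name prefix.
# $1: text to match, for example the workflow name prefix
pipeline_pods() {
  kubectl get pods -A --no-headers | grep -- "$1"
}

# Usage: pipeline_pods file-passing-pipelines
```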

This is how you can run Kubeflow Pipelines on a Civo managed k3s cluster. All the commands used above are taken from a new project called k3ai, a very nice initiative to simplify running Kubeflow Pipelines, and in future Kubeflow as a whole, on k3s.