FuseML is an MLOps orchestrator powered by a flexible framework designed for consistent operations and a rich collection of integration formulas (recipes) reflecting real world use cases that help you reduce technical debt and avoid vendor lock-in.
Civo Kubernetes cluster setup
Stefan Nica, our lead engineer at SUSE, took some time to build the following guide. Stefan, like myself, is part of a new project aiming to help #mlops practitioners in their daily job: FuseML, an open source AI orchestrator that was recently released.
For the purpose of this experiment, I used a Civo cluster with 3 medium-sized nodes. I don't recommend going any lower than that: the results might be unexpected, and you might get a lot of undesirable transient timeout errors on the k8s API as more and more services are installed in the cluster.
NOTE: Do NOT install Traefik as a default service. You also need to open ports 80 and 443 in the Civo cluster firewall to have access to FuseML and the other services.
The result should be similar to the images below:
First, be sure ports 80 and 443 are open.
Second, choose the right size. This example will train an ML model, so let's go for a medium-sized cluster.
Let's review everything and make sure Traefik is not selected.
Finally, once the cluster is deployed, let's download the kubeconfig so we can connect from our own machine.
FuseML Installation
First we have to check dependencies; we'll need a couple of tools installed:
- kubectl
- helm

If you are missing either of the two above, here are the links to install them:
- kubectl installation guide: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/
- helm installation guide: https://helm.sh/docs/intro/install/
Once finished, run:
#> kubectl version
#> helm version
You should obtain an output similar to the one below:
#> kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-22T12:00:00Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2+k3s1", GitCommit:"1d4adb0301b9a63ceec8cabb11b309e061f43d5f", GitTreeState:"clean", BuildDate:"2021-01-14T23:52:37Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
#> helm version
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: civo-fuseml-kubeconfig
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: civo-fuseml-kubeconfig
version.BuildInfo{Version:"v3.6.0", GitCommit:"7f2df6467771a75f5646b7f12afb408590ed1755", GitTreeState:"clean", GoVersion:"go1.16.3"}
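If you prefer to script these prerequisite checks, here is a minimal sketch; it only verifies that the binaries are on the PATH, not their versions:

```shell
# Check that the required CLI tools are available before proceeding.
MISSING=0
for tool in kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING (see the installation links above)"
    MISSING=1
  fi
done
# MISSING stays 0 only when both tools are installed.
```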
Now let's set up and check access to the Civo cluster, using the previously downloaded kubeconfig:
#> export KUBECONFIG=$PWD/civo-fuseml-kubeconfig
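The helm warnings earlier about the kubeconfig being group- and world-readable can be silenced by tightening the file permissions. A small sketch, guarded in case you saved the file elsewhere:

```shell
# helm warned that the kubeconfig is group/world-readable;
# restrict it so only the owner can read it.
KCFG="$PWD/civo-fuseml-kubeconfig"
if [ -f "$KCFG" ]; then
  chmod 600 "$KCFG"
  STATUS=tightened
else
  STATUS=missing   # adjust KCFG if you downloaded the file to another path
fi
echo "kubeconfig: $STATUS"
```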
Let's test that we can reach the cluster:
#> kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-fuseml-a5eb4a85-node-pool-3dd3 Ready <none> 98s v1.20.2+k3s1
k3s-fuseml-a5eb4a85-node-pool-1da5 Ready <none> 89s v1.20.2+k3s1
k3s-fuseml-a5eb4a85-node-pool-0586 Ready <none> 85s v1.20.2+k3s1
It's time to get the latest FuseML installer. Luckily for you, the FuseML team has developed a utility script for this:
#> curl -sfL https://fuseml.github.io/in/installer.sh | sh -
Welcome to FuseML downloader...
starting download...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 643 100 643 0 0 2041 0 --:--:-- --:--:-- --:--:-- 2041
100 9.8M 100 9.8M 0 0 9186k 0 0:00:01 0:00:01 --:--:-- 31.9M
Moving things at their place...
Done.. you may start using Fuseml with: fuseml-installer -h
Let's check the version to be sure it's the latest:
#> fuseml-installer version
✔️ Fuseml Installer
Version: v0.2
GitCommit: f238153d
Build Date: 2021-09-08T09:32:36Z
Go Version: go1.16.7
Compiler: gc
Platform: linux/amd64
Everything seems to be ready so let's install FuseML:
#> fuseml-installer install
🚢 FuseML installing...
Configuration...
🧭 system_domain:
🧭 extension_repository: https://raw.githubusercontent.com/fuseml/extensions/release-0.2/installer/
🚢 Deploying Istio.....
✔️ Istio deployed
.
✔️ Created system_domain: <cluster public ip>.nip.io
🚢 Deploying Workloads...
✔️ Workloads deployed
🚢 Deploying Gitea...............................................................................................................
✔️ Gitea deployed (http://gitea.<cluster public ip>.nip.io).
🚢 Deploying Registry.................................................................................
✔️ Registry deployed
🚢 Deploying Tekton...............................................................................................................
✔️ Tekton deployed (http://tekton.<cluster public ip>.nip.io).
🚢 Deploying Core......................................................................................
✔️ FuseML core component deployed (http://fuseml-core.<cluster public ip>.nip.io).
🚢 Downloading command line client...
🚢 FuseML command line client saved as /home/snica/fuseml/fuseml.
Copy it to the location within your PATH (e.g. /usr/local/bin).
🚢 To use the FuseML CLI, you must point it to the FuseML server URL, e.g.:
export FUSEML_SERVER_URL=http://fuseml-core.<cluster public ip>.nip.io
✔️ FuseML installed.
System domain: <cluster public ip>.nip.io
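The system domain relies on nip.io "magic" DNS: any hostname of the form name.&lt;ip&gt;.nip.io resolves straight to &lt;ip&gt;, so no DNS configuration is needed. A quick sketch of how the service URLs above are derived (the IP is just the one used later in this guide, for illustration):

```shell
# nip.io resolves <name>.<ip>.nip.io directly to <ip>, so FuseML can
# expose services without any DNS setup.
CLUSTER_IP="212.2.240.210"
for svc in gitea tekton fuseml-core; do
  echo "http://$svc.$CLUSTER_IP.nip.io"
done
```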
Use the printed URLs and try to log in to the various tools. For Gitea the defaults are:
username: dev
password: changeme
Run the FuseML Tutorial with MLFlow and KFServing
We'll follow the official FuseML tutorial documented at https://fuseml.github.io/docs/v0.2/tutorials/.
Install the MLFlow and KFServing extensions:
snica@aspyre:~/fuseml> fuseml-installer extensions --add mlflow,kfserving
🚢 FuseML handling the extensions...
.
🚢 Installing extension 'mlflow'...
....
✔️ mlflow deployed.
🚢 Registering extension 'mlflow'...
🚢 Installing extension 'knative'...
...............
✔️ knative deployed.
🚢 Registering extension 'knative'...
🚢 Installing extension 'cert-manager'...
........
✔️ cert-manager deployed.
🚢 Registering extension 'cert-manager'...
🚢 Installing extension 'kfserving'...
............
✔️ kfserving deployed.
🚢 Registering extension 'kfserving'...
NOTE: Why do we need those extensions? In our example we demonstrate how to run a complete ML project, from experimentation through training the model to serving it so that we can make predictions. MLFlow is needed to track and evaluate the training of the model, and KFServing to expose the model as a REST API service.
Set up and check FuseML CLI access. When we installed FuseML, a copy of the CLI was saved in the current directory; let's move it into the PATH.
#> export FUSEML_SERVER_URL=http://fuseml-core.212.2.240.210.nip.io
#> sudo cp fuseml /usr/local/bin
#> fuseml version
---
client:
version: v0.2
gitCommit: 99a8ee08
buildDate: 2021-09-08T09:34:13Z
goVersion: go1.16.7
compiler: gc
platform: linux/amd64
server:
version: v0.2
gitcommit: 99a8ee08
builddate: 2021-09-08T09:28:11Z
golangversion: go1.16.7
golangcompiler: gc
platform: linux/amd64
Fetch the FuseML examples code from our repositories:
#> git clone --depth 1 -b release-0.2 https://github.com/fuseml/examples.git
Cloning into 'examples'...
remote: Enumerating objects: 28, done.
remote: Counting objects: 100% (28/28), done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 28 (delta 0), reused 22 (delta 0), pack-reused 0
Receiving objects: 100% (28/28), 84.46 KiB | 626.00 KiB/s, done.
Now let's move into the new directory:
#> cd examples
Register the MLFlow project as a codeset:
NOTE: a codeset is a set of code that data scientists will change over time. We register this initial set of code so that later, any change/commit to the code will trigger a new training run. For this, Gitea is very handy, because we can work directly there as in any other Git repo.
#:~/examples> fuseml codeset register --name "mlflow-test" --project "mlflow-project-01" codesets/mlflow/sklearn
2021/09/08 21:06:24 Pushing the code to the git repository...
Codeset http://gitea.<cluster public ip>.nip.io/mlflow-project-01/mlflow-test.git successfully registered
Saving new username into config file as current username.
Setting mlflow-test as current codeset.
Setting mlflow-project-01 as current project.
Let's check that the code has been registered and can be accessed in the Gitea UI:
Everything looks good so let's configure the end-to-end workflow provided as an example:
#:~/fuseml/examples> fuseml workflow create workflows/mlflow-e2e.yaml
Workflow "mlflow-e2e" successfully created
It's better to double-check, and it also shows how simple FuseML is to use, so let's look at the configuration of the newly created workflow.
NOTE: a workflow is a tool-agnostic definition of the pipeline we will run, where we indicate the steps to follow (e.g. train, predict)
#:~/fuseml/examples> fuseml workflow get -n mlflow-e2e
Name: mlflow-e2e
Created: 2021-09-08T19:09:05Z
Description: End-to-end pipeline template that takes in an MLFlow compatible codeset,
runs the MLFlow project to train a model, then creates a KFServing prediction
service that can be used to run predictions against the model."
⚓ Inputs
NAME TYPE DESCRIPTION DEFAULT
∙ mlflow-codeset codeset an MLFlow compatible codeset ---
∙ predictor string type of predictor engine auto
📝 Outputs
NAME TYPE DESCRIPTION
∙ prediction-url string The URL where the exposed prediction service endpoint can b...
🦶 Steps
NAME IMAGE
∙ builder ghcr.io/fuseml/mlflow-builder:v0.2
∙ trainer {{ steps.builder.outputs.image }}
∙ predictor ghcr.io/fuseml/kfserving-predictor:0.2
⛩ Workflow Runs
No workflow runs
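For reference, the fields above come straight from the workflow definition file. A heavily simplified sketch of what workflows/mlflow-e2e.yaml looks like (field names inferred from the output above, not copied from the exact file):

```yaml
name: mlflow-e2e
description: >
  End-to-end pipeline template that trains an MLFlow model and
  exposes it through a KFServing prediction service.
inputs:
  - name: mlflow-codeset
    type: codeset
    description: an MLFlow compatible codeset
  - name: predictor
    type: string
    default: auto
outputs:
  - name: prediction-url
    type: string
steps:
  - name: builder
    image: ghcr.io/fuseml/mlflow-builder:v0.2
  - name: trainer
    image: '{{ steps.builder.outputs.image }}'
  - name: predictor
    image: ghcr.io/fuseml/kfserving-predictor:0.2
```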
It's time to run everything, let's assign the codeset to the workflow, which will trigger a workflow run:
#:~/fuseml/examples> fuseml workflow assign --name mlflow-e2e --codeset-name mlflow-test --codeset-project mlflow-project-01
Workflow "mlflow-e2e" assigned to codeset "mlflow-project-01/mlflow-test"
As #mlops practitioners, we certainly want to monitor the workflow run while it's in progress:
#:~/fuseml/examples> fuseml workflow list-runs --name mlflow-e2e
+--------------------------------------------+------------+----------------+----------+---------+
| NAME | WORKFLOW | STARTED | DURATION | STATUS |
+--------------------------------------------+------------+----------------+----------+---------+
| fuseml-mlflow-project-01-mlflow-test-lhzm8 | mlflow-e2e | 11 seconds ago | --- | Running |
+--------------------------------------------+------------+----------------+----------+---------+
#:~/fuseml/examples> fuseml workflow list-runs --name mlflow-e2e --format yaml
---
- name: fuseml-mlflow-project-01-mlflow-test-lhzm8
workflowref: mlflow-e2e
inputs:
- input:
name: mlflow-codeset
description: an MLFlow compatible codeset
type: codeset
default: null
labels: []
value: http://gitea.212.2.240.210.nip.io/mlflow-project-01/mlflow-test.git:main
- input:
name: predictor
description: type of predictor engine
type: string
default: auto
labels: []
value: auto
outputs:
- output:
name: prediction-url
description: The URL where the exposed prediction service endpoint can be contacted to run predictions.
type: string
value: ""
starttime: 2021-09-08T19:10:52Z
completiontime: 0001-01-01T00:00:00Z
status: Running
url: "http://tekton.212.2.240.210.nip.io/#/namespaces/fuseml-workloads/pipelineruns/fuseml-mlflow-project-01-mlflow-test-lhzm8"
We can even check what is happening in the Tekton UI:
MLFlow is used as an experiment tracking and model store. Model training results can also be accessed using the MLFlow UI:
When the workflow completes successfully, the CLI will show it as Succeeded:
#:~/fuseml/examples> fuseml workflow list-runs --name mlflow-e2e
+--------------------------------------------+------------+----------------+------------+-----------+
| NAME | WORKFLOW | STARTED | DURATION | STATUS |
+--------------------------------------------+------------+----------------+------------+-----------+
| fuseml-mlflow-project-01-mlflow-test-lhzm8 | mlflow-e2e | 13 minutes ago | 11 minutes | Succeeded |
+--------------------------------------------+------------+----------------+------------+-----------+
And in the Tekton UI:
Retrieve the URL for the prediction service started by the workflow:
#:~/fuseml/examples> fuseml application list
+-------------------------------+-----------+----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+------------+
| NAME | TYPE | DESCRIPTION | URL | WORKFLOW |
+-------------------------------+-----------+----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+------------+
| mlflow-project-01-mlflow-test | predictor | Application generated by mlflow-e2e workflow | http://mlflow-project-01-mlflow-test.fuseml-workloads.212.2.240.210.nip.io/v2/models/mlflow-project-01-mlflow-test/infer | mlflow-e2e |
+-------------------------------+-----------+----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+------------+
Finally, it's time to test the results. Let's start by making a prediction against the inference service:
#:~/fuseml/examples> export PREDICTOR_URL=$(fuseml application list --format json | jq -r ".[0].url")
#:~/fuseml/examples> curl -d @prediction/data-sklearn.json $PREDICTOR_URL | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 426 100 235 100 191 677 550 --:--:-- --:--:-- --:--:-- 1227
{
"model_name": "mlflow-project-01-mlflow-test",
"model_version": null,
"id": "9861d5fc-b8e5-4c5c-82c6-e368a614e16d",
"parameters": null,
"outputs": [
{
"name": "predict",
"shape": [
1
],
"datatype": "FP32",
"parameters": null,
"data": [
6.486344809506676
]
}
]
}
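The response follows the KFServing V2 inference protocol; the predicted wine quality is the single value in outputs[0].data. A small sketch of pulling it out with jq (the same tool already used above), using the response above as sample data:

```shell
# Sample V2 inference response, copied from the curl output above.
RESPONSE='{"model_name":"mlflow-project-01-mlflow-test","outputs":[{"name":"predict","datatype":"FP32","shape":[1],"data":[6.486344809506676]}]}'

# Extract the first predicted value from the first output tensor.
PREDICTION=$(echo "$RESPONSE" | jq -r '.outputs[0].data[0]')
echo "predicted quality: $PREDICTION"
```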
Or, even better, why not deploy the optional web application?
#:~/fuseml/examples> kubectl apply -f webapps/winery/service.yaml
service.serving.knative.dev/winery created
#:~/fuseml/examples> kubectl get ksvc -n fuseml-workloads winery
NAME URL LATESTCREATED LATESTREADY READY REASON
winery http://winery.fuseml-workloads.212.2.240.210.nip.io winery-00001 winery-00001 True
We may now access and use the web application to make predictions:
And that's it: end-to-end predictions of wine quality with FuseML. If you want to contribute, or even just help grow the community, the first step is super easy: just add a star to our repo here.