What is Chaos Mesh®?
Chaos Mesh® is an easy-to-use, open-source, cloud-native Chaos Engineering platform that orchestrates chaos experiments in Kubernetes environments. Chaos Mesh focuses on reducing the cost of conducting chaos experiments.
Pre-requisites
First, a Kubernetes cluster is required! If you don’t have one yet, you can take advantage of Civo's super-fast managed k3s service to quickly create one. If you want a walk-through, a video on starting a Kubernetes cluster is available.
Once you have a cluster running, make sure you have the KUBECONFIG set and pointing to this cluster. For more detail, refer to "your cluster kubeconfig".
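Before installing anything, it can help to confirm which kubeconfig file kubectl will pick up. The following is a minimal sketch, not an official check: the helper name `kubeconfig_path` is hypothetical, and it assumes no `--kubeconfig` flag is in play, so only the KUBECONFIG variable and the default `~/.kube/config` location are considered.

```shell
# Hypothetical helper: prints the first kubeconfig file in kubectl's search
# order, i.e. the first entry of $KUBECONFIG or ~/.kube/config as a fallback.
kubeconfig_path() {
  kc_list="${KUBECONFIG:-}"
  # KUBECONFIG may hold a colon-separated list; keep only the first entry.
  kc_first="${kc_list%%:*}"
  printf '%s\n' "${kc_first:-$HOME/.kube/config}"
}

kc_file="$(kubeconfig_path)"
if [ -r "$kc_file" ]; then
  echo "kubeconfig found: $kc_file"
else
  echo "kubeconfig not readable: $kc_file" >&2
fi
```

If the file is found, `kubectl config current-context` and `kubectl get nodes` should then confirm that it points at the cluster you intend to use.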
How to Install Chaos Mesh from Civo Kubernetes Marketplace
Chaos Mesh has already been added to Civo’s Kubernetes Marketplace. You can find it under the “CI/CD” category. To install it directly to your cluster, refer to “Deploying Applications Through the Civo Kubernetes Marketplace”.
Note: If your Kubernetes cluster is ready and you prefer to do so, you can also install Chaos Mesh manually. For more details, refer to “Install Chaos Mesh manually” on the Chaos Mesh website.
Accessing your Chaos Dashboard
Chaos Dashboard is a Chaos Mesh component, and a one-stop web interface for users to orchestrate chaos experiments. You can define the scope of the chaos experiment, specify the type of chaos injection, define scheduling rules, and observe the results of the chaos experiment – all in the same web interface with only a few clicks.
To access Chaos Dashboard in your web browser, there are two options:
- create an ingress for Chaos Dashboard;
- or set up a port-forward with kubectl as follows:
kubectl port-forward -n chaos-testing svc/chaos-dashboard 2333:2333
Now you can access Chaos Dashboard by opening localhost:2333 in your web browser.
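If you script this step, the port-forward may take a moment to become reachable. Here is a minimal sketch of a polling helper. The function name `wait_for_http` is hypothetical, and the sketch assumes that curl is installed and that the dashboard answers a plain GET on `/` with a non-error status, which this article does not confirm.

```shell
# Hypothetical helper: polls a URL once per second until it responds without
# an HTTP error, or gives up after the given number of tries (default 10).
wait_for_http() {
  wf_url="$1"
  wf_tries="${2:-10}"
  wf_i=0
  while [ "$wf_i" -lt "$wf_tries" ]; do
    # -f treats HTTP errors (status >= 400) as failures; --max-time bounds each attempt.
    if curl -fsS -o /dev/null --max-time 2 "$wf_url" 2>/dev/null; then
      return 0
    fi
    wf_i=$((wf_i + 1))
    sleep 1
  done
  return 1
}

# Example usage (run after starting the port-forward in another terminal):
#   wait_for_http http://localhost:2333 30 && echo "Chaos Dashboard is reachable"
```

The same helper could be pointed at any other locally forwarded port.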
Note: Chaos Dashboard supports a security mode, which requires users to log in with a token generated by Kubernetes. Each token is linked to a service account. You can only perform operations within the scope allowed by the role associated with that service account. By default, the security mode is enabled when using helm to install Chaos Mesh. You can refer to this documentation to create the account and the token.
If you would like to create the account later and start using Chaos Dashboard directly, you can use the token of Chaos Mesh. Get the token by executing the following command on your terminal:
kubectl -n chaos-testing describe secret $(kubectl -n chaos-testing get secret | grep chaos-controller-manager | awk '{print $1}')
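The describe output contains more than the token itself. As a rough sketch, assuming the usual `kubectl describe secret` layout in which the token appears on a line starting with `token:`, a small filter can print just the value. The helper name `extract_token` is hypothetical and the sample input is made up for illustration.

```shell
# Hypothetical helper: reads `kubectl describe secret` output on stdin and
# prints only the value of the `token:` field.
extract_token() {
  awk '$1 == "token:" { print $2 }'
}

# Made-up sample showing the shape of the input; real tokens are much longer.
sample='Name:         chaos-controller-manager-token-abcde
Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  13 bytes
token:      eyJhbGciOi.example.token'

printf '%s\n' "$sample" | extract_token
```

In practice you would pipe the describe command shown above into `extract_token` instead of the sample.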
Deploy the application for testing
The next step is to deploy the application for testing. Here we use a demo application called web-show, which lets us directly observe the effect of NetworkChaos. You can also deploy your own application for testing.
To deploy the demo application, download and run its deployment script as follows:
curl -sSL https://mirrors.chaos-mesh.org/latest/web-show/deploy.sh | sh
This should run the deployment script and apply the application to your active cluster.
To access the web-show application, go to http://localhost:8081 from your web browser.
Creating a chaos experiment
Now that everything is ready, it's time to run your first chaos experiment!
Chaos Mesh uses CustomResourceDefinitions (CRD) to define chaos experiments. CRD objects are designed separately based on different experiment scenarios, which greatly simplifies the definition of CRD objects. Currently, CRD objects that have been implemented in Chaos Mesh include PodChaos, NetworkChaos, IOChaos, StressChaos, DNSChaos, TimeChaos, AWSChaos, and KernelChaos. Chaos Mesh will support more fault injection types in the future.
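The list above reflects the time of writing. To see which chaos kinds your installed version actually provides, one option is to filter the cluster's CRDs by the `chaos-mesh.org` API group. The sketch below assumes the input comes from `kubectl get crd -o name`; the helper name `chaos_kinds` is hypothetical and the sample lines are made up.

```shell
# Hypothetical helper: reads CRD names on stdin (for example from
# `kubectl get crd -o name`) and prints the resource part of every CRD
# in the chaos-mesh.org API group, one per line.
chaos_kinds() {
  awk -F/ '{ n = $NF }
           n ~ /\.chaos-mesh\.org$/ { sub(/\.chaos-mesh\.org$/, "", n); print n }'
}

# Made-up sample input for illustration.
printf '%s\n' \
  'customresourcedefinition.apiextensions.k8s.io/networkchaos.chaos-mesh.org' \
  'customresourcedefinition.apiextensions.k8s.io/addons.k3s.cattle.io' \
  'customresourcedefinition.apiextensions.k8s.io/podchaos.chaos-mesh.org' \
  | chaos_kinds
```

Against a real cluster, the equivalent would be `kubectl get crd -o name | chaos_kinds`.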
In this experiment, we are using NetworkChaos as the chaos injection type.
Note: you can also use Chaos Dashboard to create your own chaos experiment. For more details, please refer to “Create a chaos experiment by Chaos Dashboard”.
The NetworkChaos configuration file, written in YAML, is shown below:
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay-example
spec:
  action: delay
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      "app": "web-show"
  delay:
    latency: "10ms"
    correlation: "100"
    jitter: "0ms"
  duration: "30s"
  scheduler:
    cron: "@every 60s"
For detailed descriptions of NetworkChaos actions, check out its documentation. In short, this configuration means:
- target: web-show
- mission: inject a 10ms network delay every 60s
- attack duration: 30s each time
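If you would rather keep a local copy of this configuration than pipe a file straight from GitHub, a sketch like the following writes the YAML shown above to a file. The file path is an arbitrary choice for illustration, and the experiment name is the one from the listing in this article, which may differ from the file in the Chaos Mesh repository.

```shell
# Arbitrary location for the local copy (an assumption, pick any path you like).
manifest="${TMPDIR:-/tmp}/network-delay-example.yaml"

# Write the NetworkChaos configuration from this article to the file.
# The quoted 'EOF' keeps the shell from expanding anything inside the YAML.
cat > "$manifest" <<'EOF'
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay-example
spec:
  action: delay
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      "app": "web-show"
  delay:
    latency: "10ms"
    correlation: "100"
    jitter: "0ms"
  duration: "30s"
  scheduler:
    cron: "@every 60s"
EOF

echo "Wrote $manifest"
echo "Apply it with:  kubectl apply -f $manifest"
echo "Remove it with: kubectl delete -f $manifest"
```

Keeping the file locally also makes it easy to change the latency or schedule values before applying.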
To start the NetworkChaos, take the following steps:
- Apply network-delay.yaml from the Chaos Mesh repository:
curl -sSL https://raw.githubusercontent.com/chaos-mesh/chaos-mesh/master/examples/web-show/network-delay.yaml | kubectl apply -f -
- In your web browser, go to http://localhost:8081 to access the web-show application.
From the line graph, you can tell that there is a 10 ms network delay every 60 seconds.
Congratulations! You just stirred up a little bit of chaos. If you are intrigued and want to try out more chaos experiments with Chaos Mesh, check out examples/web-show.
Deleting a chaos experiment
Once you're finished with the testing, terminate the chaos experiment:
- Delete network-delay.yaml:
curl -sSL https://raw.githubusercontent.com/chaos-mesh/chaos-mesh/master/examples/web-show/network-delay.yaml | kubectl delete -f -
- Access the web-show application: in your web browser, go to http://localhost:8081.
Conclusion
Congratulations on your first successful journey into Chaos Engineering. How does it feel? Chaos Engineering is easy, right? Chaos Mesh is an easy-to-use and cool tool for finding hidden bugs in your system. In a matter of minutes, you can deploy it on your cluster and start creating chaos, so that you can see how your Kubernetes applications hold up when things fail.
If you find a bug or feel something is missing, feel free to file an issue or open a pull request (PR) in the Chaos Mesh repo, or join us on the #project-chaos-mesh channel in the CNCF Slack workspace.
Let us know on Twitter @Civocloud and @chaos_mesh if you try Chaos Mesh® on Civo!