Orchestrating Kubernetes with Terraform + Kustomize
Speaker: Jack Ross
Summary
Jack Ross, Principal Software Development Engineer at Shutterfly, explains the use of Terraform and Kustomize for orchestrating Kubernetes. He distinguishes between Infrastructure as Code and Configuration as Code, outlining their respective benefits. Ross highlights how Terraform interacts with Civo and contrasts the functionality of Helm and Kustomize. The session wraps up with practical code examples illustrating the principles of these tools.
Transcription
My presentation today is orchestrating Kubernetes with Terraform and Kustomize. Feel free to scan the QR code. It has my contact info as well as a reference to the link at the bottom, which has all the slides as well as the code I'm going to be referencing. I'm a Principal Software Development Engineer at Shutterfly, and I work in the DevOps department. Here's the QR code again, but bigger, in case it helps anyone. Oh, wait a second.
Sorry, go ahead.
So, I originally wanted to talk more about Terraform, but there are a ton of talks tomorrow about Infrastructure as Code and/or Terraform, so feel free to go to those. I'm going to briefly touch on it but try to focus more on Kustomize. It seemed like Terraform is pretty well covered, but there are a lot of gaps around Kustomize specifically.
So what is Infrastructure as Code? Essentially, it's describing infrastructure using code. You manage and provision infrastructure using software and automated processes. You store your templates and version them in repositories, sometimes right alongside your application code. And there are a bunch of tools, like the ones below, where you use a programming language like TypeScript, Python, Go, C#, or Java, which makes it feel even more like code because you're actually writing it in code. So there are a bunch of tools for implementing Infrastructure as Code. Some are platform specific: AWS, Azure, and Google all have their own versions. Terraform is probably the most popular. And there's also CDK for Terraform, where you basically write in programming languages, but it compiles to HCL, which is Terraform's HashiCorp Configuration Language.
There's another tool that also lets you write in programming languages, but since the output doesn't have to be machine readable, it's a little more flexible compared to CDK for Terraform. And there's also Crossplane, where you manage the infrastructure using Kubernetes manifests, so you're managing everything with manifests.

There's also Configuration as Code. It's very frequently intertwined with Infrastructure as Code. You're actually configuring your applications and servers with code, rather than working directly with the infrastructure itself. And it can get really confusing because a lot of those technologies also let you provision infrastructure. Ansible, primarily; I know for a fact that you can provision EC2 servers using Ansible. But there are also tools like Puppet, Chef, and SaltStack.

So here's a thought experiment. Let's say that for some reason there's a big disaster and you completely lose access to your current environment. How long would it take you to faithfully reproduce that environment? Would you be able to do it without being able to look at what you had deployed, or would it be a matter of guessing and doing your best? And how long would that take? If you have your Infrastructure as Code in place, that's a relatively quick process; otherwise it might be a bit of a struggle.
And there are a lot of benefits to using Infrastructure as Code. You get automated deployments, and a lot of repeatability and consistency. You don't have to have someone manually run it on their computer, where their machine gets upgraded and the tool breaks, or where someone else is doing a deployment for the first time. That also helps reduce risk, because there's not that unknown factor of, "Well, it works on my machine, why is it not working on yours?" And it helps make things faster, so you can deploy faster.
You also have it in version control, so you're able to revert to previous versions. There's accountability: you can see who changed what, exactly when they changed it, and hopefully why, if they have good commit messages. And you can do things like check the infrastructure state from two weeks ago, which may not be possible otherwise unless you're digging through something like CloudWatch logs to see who changed what, and that's a big mess.
Development teams can also provision their own resources. A lot of the time, you can use Terraform modules, where you're basically creating "blessed" modules. It might be something like a bucket, but with all the security aspects in place: blocking public access, those kinds of things. That also helps development teams not be bottlenecked by DevOps; they can do a lot more on their own and basically just get a final sign-off.
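As a rough illustration (the module source and variables here are hypothetical, not from the talk), consuming a "blessed" bucket module might look like this:

```hcl
# Hypothetical usage of a security-hardened, team-approved module.
# The public-access blocking and encryption settings live inside the
# module, so development teams can't accidentally skip them.
module "team_artifacts" {
  source = "git::https://example.com/modules/blessed-bucket.git"

  name = "team-artifacts"
  team = "checkout"
}
```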
So, with Civo and Terraform, there are a bunch of resources that are manageable with Terraform when you use it with Civo. The ones we're going to talk about today are the Kubernetes cluster and some of the networking stuff, like firewalls. Now, Helm. It's often called a package manager for Kubernetes. It's used to package and share Kubernetes applications. It lets you combine numerous manifests into a single unit, which makes it really powerful for importing and exporting applications. There's a large list of popular applications that you can get a Helm chart for.
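For context, installing one of those popular applications with Helm looks roughly like this (the Bitnami repo and nginx chart are just one common example, not something the talk prescribes):

```console
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-nginx bitnami/nginx --namespace nginx --create-namespace
```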
So now, for Kustomize. It basically allows you to patch Kubernetes manifests. You use a layering approach to preserve the original manifests, and then you're essentially selectively overwriting some of the default settings. You're using plain YAML, so you're not having to use any kind of templating. And it's very powerful for configuring the differences between environments that utilize the same templates.
So there are a number of benefits of using Kustomize over Helm. It's easier to learn. It's simpler; again, no templating. It's part of kubectl, so you don't necessarily have to install anything, though you might want to if you're looking for a different version. And it's also managed by the Kubernetes team directly. It can help keep your code DRY, which is the "Don't Repeat Yourself" principle: if you're using components, it lets you apply the same set of changes to a number of different environments or templates. With Kustomize, you're also gaining increased visibility and transparency compared to Helm, where you just get a single Helm chart. Because you actually have the manifests there, you're not having to dig through what the Helm charts contain. It also scales better due to its inheritance-based model; unless you have a huge number of variants, it's a little easier compared to Helm.
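Since Kustomize ships inside kubectl, you can render or apply an overlay without installing anything extra. A minimal sketch, assuming the overlay layout used later in the talk:

```console
$ kubectl kustomize overlays/dev    # render the manifests to stdout
$ kubectl apply -k overlays/dev     # render and apply in one step
```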
So why not both? Frequently, Kustomize is used for configuring the differences between environments. But another thing you can do is reference Helm charts from within Kustomize; there's a helmCharts field for exactly that. You can also use Helm for importing third-party applications, or even render the manifests out of Helm and then work directly with the Kubernetes manifests. And Helm is also great if you're sharing with third parties. Maybe you have other people who want to use your manifests; it's a great way to easily share them.
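A minimal sketch of that helmCharts field in a kustomization.yaml (the chart and values here are illustrative; note that rendering this requires `kustomize build --enable-helm`):

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
helmCharts:
  - name: nginx
    repo: https://charts.bitnami.com/bitnami
    releaseName: my-nginx
    namespace: nginx
    valuesInline:
      replicaCount: 2
```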
So, with Kustomize, there are some terms I want to go over quickly, so I'm not just saying a bunch of stuff that people might not understand. The first is the concept of a Base Environment, or Shared Template.
It's basically a collection of Kubernetes manifests that define the resources in each environment. In the example I'll show, it's just a very simple Nginx server: an nginx namespace, a deployment, and a service. Then you have overlays, which are how you define the differences between your environments, primarily through a ConfigMap. Some examples might be the name of the environment, or the number of Nginx replicas to deploy in that deployment. And you also have components, which are the set of patches you apply on top of all the base templates; that's what ties the environment overlays to the generic base templates. An example is taking the number of Nginx replicas from an environment overlay and patching the base template so that the correct number of replicas is deployed. It'll probably be easier to understand these when you look at the code (see the layout below), but I figured I'd throw them out.
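To make those three terms concrete, the example code is laid out roughly like this (the directory names are my reconstruction from the talk):

```
.
├── base/                    # shared templates: plain Kubernetes manifests
│   ├── kustomization.yaml
│   ├── namespace.yaml
│   ├── deployment.yaml
│   └── service.yaml
├── components/              # shared patches (replacements) applied by each overlay
│   └── kustomization.yaml
└── overlays/
    ├── dev/                 # per-environment config: ConfigMap, index.html
    │   ├── kustomization.yaml
    │   ├── configmap.yaml
    │   └── index.html
    └── prod/
        ├── kustomization.yaml
        ├── configmap.yaml
        └── index.html
```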
And here's a diagram that shows how everything works together. The end state you're looking for is the Kubernetes manifests, but you're creating those manifests based on your shared modules (I should probably say shared manifests). Then you have the environment overlays, which are where all your configs live, and the components are what actually say what should be replaced and where to grab those values from. In case that diagram helps anyone.
And now for some code. Here we're looking at a Terraform provider file. With Terraform, this is basically where you define what providers you want to use. In this case we're using Civo, so we have the Civo provider right there. I especially want to point this out: in the Civo docs, they recommend that you put your Civo token right there. I highly recommend against it, because it's very easy to accidentally commit that, and then your Civo token's floating around the internet for anyone to use. I think it's relatively easy to generate a new one, but hopefully you catch it before that.
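A sketch of what that provider file likely looks like (the version constraint is illustrative), with the token pulled from a variable instead of being hard-coded:

```hcl
terraform {
  required_providers {
    civo = {
      source  = "civo/civo"
      version = "~> 1.0" # illustrative constraint
    }
  }
}

# Don't hard-code the token here, even though some docs show it inline;
# pass it in as a variable (terraform.tfvars or TF_VAR_civo_token).
provider "civo" {
  token = var.civo_token
}
```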
This is the variables file for Terraform. Right here you can see we're requesting that you provide a Civo token. I'm using a terraform.tfvars file so I don't have to manually type in my Civo token every time I run Terraform. And then you have the definition of environments, which is basically a custom type. You can see here I'm defining what an environment should look like: there's a name, how many nodes we want to deploy, a bit of Terraform magic for selecting what value to use for the node size, and what region we want to deploy in. And to make things easy, there are some default values right here. By default it deploys a Dev and a Production environment, gives them names, and you can see there's a different number of nodes. For Dev we're using Extra Small; for Production we're using Small. You'd probably want to use something bigger if it actually were production, but this is just to show you what it looks like.
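Reconstructing that variables file from the description (the exact object shape and the default values are assumptions), it might look something like:

```hcl
variable "civo_token" {
  description = "Civo API token; set it in terraform.tfvars or via TF_VAR_civo_token"
  type        = string
  sensitive   = true
}

# The "custom type" describing what an environment looks like.
variable "environments" {
  type = map(object({
    node_count = number
    node_size  = string # friendly name, mapped to a real Civo size later
    region     = string
  }))
  default = {
    dev  = { node_count = 1, node_size = "xsmall", region = "NYC1" }
    prod = { node_count = 3, node_size = "small", region = "NYC1" }
  }
}
```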
So now we're looking at the network Terraform configuration. We're using a Civo firewall, and this is interesting because we're doing a for_each loop in Terraform. We're basically looping over a dynamic number of environments. So if you don't provide any environments, it's not going to deploy anything; if you provide 500 environments, it'll deploy 500 firewalls. And the name of each one is based on the name that you provide right here, so there'd be one called Dev and one called Prod if you don't change the defaults.
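The firewall loop probably looks something like this sketch (the create_default_rules setting is my assumption about how the rules were handled):

```hcl
# One firewall per environment. for_each over the map means zero
# environments deploys nothing and 500 environments deploys 500 firewalls.
resource "civo_firewall" "env" {
  for_each = var.environments

  name                 = each.key # "dev", "prod", ...
  region               = each.value.region
  create_default_rules = true
}
```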
Now take a look at the cluster. There's some Terraform magic down here, but anyway: we have the Civo Kubernetes cluster, and again we're looping over the environments that we provide, using the name from that custom type. And here you can see we're using the firewall ID, so each Kubernetes cluster is tied to the firewall that correlates with it. Again, if you provide 500 environments, you get 500 clusters. And then the pool is just setting up the nodes; this is where we actually do the Terraform magic to define the size of the nodes.
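A sketch of the cluster resource, with the "Terraform magic" for node sizes modeled as a simple lookup map (the size strings are real Civo sizes, but the mapping itself is my reconstruction):

```hcl
locals {
  # Map the friendly size names from var.environments to Civo instance sizes.
  node_sizes = {
    xsmall = "g4s.kube.xsmall"
    small  = "g4s.kube.small"
  }
}

resource "civo_kubernetes_cluster" "env" {
  for_each = var.environments

  name        = each.key
  region      = each.value.region
  firewall_id = civo_firewall.env[each.key].id # tie each cluster to its firewall

  pools {
    label      = "${each.key}-pool"
    node_count = each.value.node_count
    size       = local.node_sizes[each.value.node_size]
  }
}
```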
So now I'll switch over to Kustomize. Kustomize is always looking for kustomization.yaml files; if you take out the kustomization file, what's left is just plain Kubernetes manifests. In the kustomization.yaml file, you can see we're referencing the three resources that sit right alongside it. Here's the Nginx deployment resource, pretty much a standard Nginx deployment, nothing that interesting. Here's the namespace, again just a simple Kubernetes manifest to define the namespace. And then we have the Nginx service, again just a straight Kubernetes manifest, nothing too spectacular.
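The base looks roughly like this sketch (abridged; the deliberately nonsense defaults are covered in a minute):

```yaml
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - deployment.yaml
  - service.yaml
```

```yaml
# base/deployment.yaml (abridged)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: nginx
  annotations:
    environment: not-defined # deliberately nonsense; overlays patch this
spec:
  replicas: 0 # deliberately zero; overlays patch this
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable
```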
But now we'll switch over to the overlays, where things get a little different and we differentiate the environments from each other. You can see here, again, it's a kustomization.yaml file, and this is where we reference the base files we want to use.
We're just referencing the base files in that directory, and we're also creating a config map where we store the variables that are unique to that environment. If we jump over to that config map, right here we're giving the name of the environment, in this case 'dev,' and we're saying that we want one replica. Switching back to the overlay kustomization, here we're using a configMapGenerator, which is just another way of creating a config map, and we're referencing the index.html file that's in the same directory. That's so we can serve up a different file for the different environments. If we look at the index.html file, it just says 'This is Dev,' a super exciting HTML file.
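Putting the overlay pieces together, a sketch of the dev overlay (resource names like environment-config are my guesses, not confirmed by the talk):

```yaml
# overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
  - configmap.yaml # environment name and replica count
configMapGenerator:
  - name: index-html
    namespace: nginx
    files:
      - index.html # "This is Dev"
components:
  - ../../components # shared replacements, covered next
```

```yaml
# overlays/dev/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: environment-config
  namespace: nginx
data:
  name: dev
  replicas: "1"
```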
And then we reference the components, which I'll jump into in a second; those are the collection of changes that we want to apply for each environment.
So, here's the kustomization.yaml file for the components, and you can see there are two replacements we're doing. Here's the replacement for the environment. Basically, we're getting the source from the config map, you can see right there, using the value in 'data.name,' and then for the target, we're going into the Nginx deployment and replacing it at 'metadata.annotations.environment.'
So we'll jump over to that template again, and you can see right here that by default I just have 'not defined.' There's a lot of argument about whether it's better to have 'dev' values in there; some people prefer production values. I prefer to use nonsense values, so it's very obvious that you shouldn't go into the base directory and directly apply those; you should be applying from an overlay directory.
And we'll jump back to the replacements, to the replicas one, and you can see we're doing the same kind of thing: looking at the config map, referencing 'data.replicas' this time, and replacing it into 'spec.replicas.' You can see here the default is zero. Again, if you're running it from the base directory, which you shouldn't be doing, I'm using bad values so it's very obvious that you're doing something wrong.
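Both replacements together would look something like this sketch of the component (again, the ConfigMap name is an assumption):

```yaml
# components/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
replacements:
  # Copy the environment name into the deployment's annotation.
  - source:
      kind: ConfigMap
      name: environment-config
      fieldPath: data.name
    targets:
      - select:
          kind: Deployment
          name: nginx
        fieldPaths:
          - metadata.annotations.environment
  # Copy the replica count into spec.replicas.
  - source:
      kind: ConfigMap
      name: environment-config
      fieldPath: data.replicas
    targets:
      - select:
          kind: Deployment
          name: nginx
        fieldPaths:
          - spec.replicas
```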
And this is the production kustomization file. You can see it's pretty much exactly, well, it is exactly, the same as the Dev one; you don't have to make any difference between the two. The differences are all in the config map and the index.html file. I don't think we'll get to actually showing this, but the production index.html just says 'This is production' rather than 'This is Dev.' And in the config map, the environment name is 'prod' and there are three replicas, rather than the one we have in the Dev environment.
You can see here, I'm in the Kustomize overlays 'dev' directory. If you run this from the base one, it'll still build.
But again, you're getting those nonsense values, like the 'not defined' annotation and zero replicas. So let's go back to the dev overlay and run 'kustomize build' again, and this time you can see that it's creating a config map. The environment name is 'dev,' and there's one replica. If we look at the config map that was generated by the configMapGenerator, you can see the index.html file is just 'This is Dev.' Now if we look at the Nginx deployment, you can see the annotation is the dev environment, and there's one replica, rather than the base template's zero. And then we switch over to the production overlay.
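In command form, the builds shown here are roughly:

```console
$ kustomize build base           # builds, but with the nonsense defaults
$ kustomize build overlays/dev   # config map generated, annotation "dev", 1 replica
$ kustomize build overlays/prod  # annotation "prod", 3 replicas
```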
And you can see that things are different: in the config map it's using the production name and three replicas, we're using 'This is production' for the index.html file, and if we go down to the deployment, again the annotation is 'prod' and we also have three replicas.
So, to save a little bit of time, I'll go ahead and keep going. Okay, I'm already in the Civo Dev context, which I have output here to make it easy for me. I deployed this, who knows when, nine days ago. So we check the service; this is for the Dev environment. Then we go to the external IP.
We can see 'This is Dev' and 'This is Dev,' and then to check production, I'll switch over to the production context.
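Reconstructed from context, the demo checks amount to something like this (the context names and the placeholder IP are assumptions):

```console
$ kubectl config use-context civo-dev
$ kubectl get svc -n nginx    # grab the service's external IP
$ curl http://<EXTERNAL-IP>   # "This is Dev"

$ kubectl config use-context civo-prod
$ kubectl get svc -n nginx
$ curl http://<EXTERNAL-IP>   # "This is production"
```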
Let me clear that. Here's the production service, and 'This is production.' So basically, this makes it really easy: all the Nginx stuff is deployed using the same exact templates, and you just use your config map to create an overlay that goes over them. That's everything; there's the QR code, in case anyone still needs it.