cert-manager can do SPIFFE? Solving multi-cloud workload identity using a de facto standard tool
Speaker: Ashley Davis
Summary
Ashley Davis, Senior Software Engineer and Maintainer of cert-manager, discusses the capabilities of cert-manager, an easy way to manage certificates in Kubernetes clusters. Ashley highlights the importance of Trust-manager for managing trust bundles, enabling clients to verify certificate legitimacy. Additionally, he explores the potential of using x509 certificates as a universal identity control plane in distributed systems through the concept of "SPIFFE" (Secure Production Identity Framework For Everyone).
Transcription
So yeah, hello. Welcome to "Cert Manager Can Do SPIFFE?" I've added a subtitle here. This is going to vary a bit from maybe what the description said. The reason for that is that there are two people listed on here. One of them was meant to be doing this, and that one isn't me.
Unfortunately, Tom went on holiday to Cuba last year, which he didn't realize makes you ineligible for a US ESTA. So, he couldn't be here, he couldn't get a visa in time. So, you get me, US approved. I'm Ash. I'm a Senior Software Engineer, I'm also a maintainer on Cert Manager, which is handy because I'll be talking about Cert Manager.
I've added all the ways you can contact me. Maybe if you'd like to contact Tom and say, "Sorry, you couldn't be here," I'm sure he'd appreciate that. But you can reach out to me too on all the usual things. I'm happy to talk about anything. I'm also happy to talk after this. I really like this little venue, and it'd be nice to stand out there for a bit.
I've had to change a fair bit about what Tom had originally. He conceived this idea, he got all of this submitted and all that stuff. But he had a lot of it relevant to him. So he had this joke, "Back when he had hair," you can see he no longer does. I, on the other hand, I'm clinging on to still having hair. I've only got a couple of years left and then I won't have a choice anymore. So, the joke didn't work. Some things had to change here.
What didn't change is that it's about Cert Manager, by and large, right? I wasn't going to ask, "Who here knows Cert Manager?" But since there's only a few people here: who knows Cert Manager? Anyone give me a hands up? Yeah, everyone knows Cert Manager, great! It's always nice to see that.
So, it definitely has quite a lot of users. Like, we see that people treat Cert Manager like a standard part of Kubernetes when they set up a cluster. That's a great feeling. It means that you know it's solving a problem that people actually have.
I was going to do a really quick overview to make sure we're all on the same page. I'll still do that, but I'm going to rocket right through it since literally everyone raised their hand. Well, first of all, Cert Manager is the easiest way to manage certificates in a Kubernetes cluster. This is what we put on every release that we make. It's a change that I made a few releases ago.
The reason that is number one, this is like the thing that we abide by. This is what we're trying to do. This is the problem we're trying to solve. The other part of it is, if you share a release, a GitHub release on social media, the preview image that GitHub generates will include the first line. So it's kind of a hack if you're releasing open-source software to have a nice intro that describes the project.
If you know Cert Manager, then you'll know certificates are used for TLS and SSL, if you want to call it what it was called a while ago, and most things need it, right? That's why Cert Manager is so popularly used. It's because if you're running a public-facing service, or if you're running anything else, you probably need a certificate to do your thing, and Cert Manager helps with that.
It's also worth pointing out that, as of fairly recently, as of the end of last year, we're a CNCF incubating project. We were in the sandbox for a while, we're moving on up. We solve a fairly unique problem in the cloud native landscape. So, Cert Manager kind of stands alone in what it does, and it's really cool to be part of the CNCF, and they provide a lot of support.
Again, I'll rocket through this, but the basics are Cert Manager adds CRDs for certificate-related things. The core selling point, if you can have a selling point for an open-source project, is that it does certificates in the way you would expect if you're used to Kubernetes. If you've written a pod or a deployment, you can write an issuer or cluster issuer, and you can write a certificate. So your issuer says how to get a certificate, and your certificate resource says what you want in it. So this is an example for example.com.
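The slide itself is not part of this transcript. As a minimal sketch of the kind of pair being described, assuming a CA issuer backed by a Secret, it might look like the following. The names `ca-issuer`, `ca-key-pair`, `example-com-tls` and the `sandbox` namespace are placeholders chosen for illustration, not taken from the talk.

```yaml
# Issuer: says HOW to get a certificate (here, signed by a CA key pair
# stored in a Secret; the Secret name is a placeholder).
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: ca-issuer
  namespace: sandbox
spec:
  ca:
    secretName: ca-key-pair
---
# Certificate: says WHAT you want in the certificate.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  namespace: sandbox
spec:
  secretName: example-com-tls   # where the signed cert and key are written
  dnsNames:
    - example.com
  issuerRef:                    # points back at the Issuer above
    name: ca-issuer
    kind: Issuer
```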
Your issuer, you can see, is referred to in an issuerRef. There are plenty of issuers that you can use. You don't need to use the CA issuer that I just showed. You could use HashiCorp Vault, ACME for Let's Encrypt, or Venafi TLS Protect, Venafi being my employer. There are many options for this and you can write your own as well. You're not limited to just the ones that are in core. There's an external issuer mechanism, and you can do your own if you feel so inclined, and we fully support that.
I'd highlight as well, a lesser used thing is that Cert Manager can totally issue CA certificates as well. Most people, we understand, we don't have telemetry or anything so we don't know for sure, but most people use Cert Manager to get Let's Encrypt certificates for services. But you can do private PKI in Cert Manager today. If you're running it, you can already do this, and actually, that's kind of cool and underused.
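One way this is commonly bootstrapped, sketched here under assumptions rather than taken from the talk, is a self-signed issuer that signs a Certificate marked `isCA: true`. The resource names and the `cert-manager` namespace below are illustrative placeholders.

```yaml
# A self-signed ClusterIssuer, used only to bootstrap the root.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
---
# A CA certificate issued by cert-manager itself.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-root-ca
  namespace: cert-manager
spec:
  isCA: true                    # request a CA certificate, not a leaf
  commonName: example-root-ca
  secretName: example-root-ca   # can then back a CA issuer
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
    group: cert-manager.io
```

The resulting Secret can be referenced from a `ca` issuer, which is what makes private PKI inside the cluster possible.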
So, like, if you're using Cert Manager for Let's Encrypt, then you have to deal with the Let's Encrypt rate limit. Maybe that's not like a huge concern for you, but as scale grows, the rate limits start to mean more. And that's by no means a ding on Let's Encrypt. Let's Encrypt is one of the best things on the internet, but it's a problem you have to think about. So, running private PKI, at least for your test environment, is actually probably a pretty interesting thing to consider.
It also lets you do other kinds of certificates, which is an interesting proposition that we'll come back to. But this talk isn't just about Cert Manager itself. I'd like to draw attention to the Cert Manager and Friends diagram that I made here, very quickly, and done on a very low budget. Again, I was drafted in fairly late into the process here.
But it's also about other Cert Manager stuff, right? It's not just Cert Manager that we do. It's a GitHub organization with other sub-projects that do other things that are useful to you if you have certificates in your cluster. Probably at least one of these projects is going to be useful today if you're not running it.
I'm going to talk about a couple of interesting ones, but there's more that I won't mention. There'll be a link later if you wanted to check out any more. The first is Trust Manager. And this one is by far, I think, the most important one. If there's anything that you take away from anything I've said to date, you should probably be giving Trust Manager a go, especially if you're running Cert Manager.
Trust Manager, I genuinely think it could be the next Cert Manager. It's solving a problem that some people don't even know they have. Trust me when I say, I feel really weird quoting myself, but getting a certificate is only half the problem that you have. So, you can use Cert Manager, you get the certificate you need, and that's great. Your server can present it to clients. That all works. That's what many people do today.
But the question you don't often answer is, how does the client know that your certificate was legitimate or valid? And that's the problem of trust. We manage it using trust bundles. It's like a blob of certificate data that describes what you trust. All of your devices have one, this laptop does, my phone, your phones, all of your containers have one too, most likely, if you're not using Trust Manager or some other solution. And usually, the way they have it is, you bake it in at build time. So, when you build your container, you build from Debian or Ubuntu or whatever, and you add a CA certificates package. That means that to update it, your time to update is actually the time it takes to rebuild your container from scratch.
And that's maybe a little bit scary. I'll come back to that, but essentially what I'm saying is, Trust Manager is like Cert Manager, but for trust, as you can probably guess from the name. So, you specify a Bundle resource this time, and you'll get a ConfigMap out that has the bundle, the trust bundle, that you need.
Here's an example, again, it's just YAML. You specify a list of sources, which tell Trust Manager where to get your certificates from, and then a target, which is a ConfigMap where everything will be written to. I'd especially like to highlight the useDefaultCAs thing here, because this is new, very new. Like, it was released on Friday kind of new. This is the first time I'm publicly speaking about it, so I guess this is an announcement.
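The slide is not reproduced in the transcript. A minimal sketch of such a Bundle, assuming the `trust.cert-manager.io/v1alpha1` API, could look like this. The Bundle name, the Secret name and the key names are placeholders, not values from the talk.

```yaml
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: example-bundle          # placeholder; Bundles are cluster-scoped
spec:
  sources:
    # Include the publicly trusted CA bundle shipped with trust-manager.
    - useDefaultCAs: true
    # Add a private CA from a Secret (hypothetical name and key).
    - secret:
        name: example-root-ca
        key: ca.crt
  target:
    # Written into a ConfigMap named after the Bundle, under this key.
    configMap:
      key: trust-bundle.pem
```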
But essentially, this is the ability to include a publicly trusted bundle in your Trust Manager bundle. So that's the kind of thing that you're getting if you're building from Debian or DigitalOcean or anything like that. So, this is the thing that enables use of Trust Manager anywhere.
The key really is that we've spent a lot of time thinking about how to design this such that updates are always safe. So, as of Trust Manager 0.4.0, which released on Friday, any trust bundle we publish ever in the future, with any updates that are required (it's all coming from Debian ultimately), will work with any version of Trust Manager. We've built it that way because it needs to be built that way. You need to be able to update this stuff, and you don't want to get to a situation where you need to update your trust bundle in a hurry and then realize that you have to update Trust Manager as well. You can just use the version you're running, even if it's out of support, even if it's years old, and you'll always be able to update. And by doing that, you don't need to rebuild your entire container estate. If you're already using Trust Manager everywhere, you're mounting the ConfigMap that comes out of Trust Manager, so you don't need to rebuild a thing. You just update the bundle. You might need to redeploy things to pick up the new trust bundle, but your time to disaster recovery is minimal in that case.
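To make the "mount instead of rebuild" idea concrete, here is a hedged sketch of a Pod consuming such a ConfigMap. It assumes a ConfigMap called `example-bundle` with a `trust-bundle.pem` key already exists in the namespace; the image, the names and the mount path are illustrative, and the right path depends on where your base image and language runtime look for CA certificates.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-client
  namespace: sandbox
spec:
  containers:
    - name: app
      image: example.invalid/app:latest   # placeholder image
      volumeMounts:
        - name: trust-bundle
          # Assumption: the app reads CAs from this directory.
          # Mounting here replaces the directory baked into the image.
          mountPath: /etc/ssl/certs/
          readOnly: true
  volumes:
    - name: trust-bundle
      configMap:
        name: example-bundle              # ConfigMap written by trust-manager
        items:
          - key: trust-bundle.pem
            path: ca-certificates.crt
```

With this shape, updating trust means updating the bundle and, at most, restarting the workload, not rebuilding the image.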
I think a lot of organizations don't realize the risk that they're in at the moment with trust bundles baked into containers. There's been upstream work on this and we'll try and integrate with that as well. But I guess it's a warning. Like, oh my God, why are people not talking about this more? But aside from all the doom and gloom, I'm not trying to be too doom and gloom about it. Trust Manager also enables private PKI, and that's cool as well. I've said that Cert Manager has support for it. So does Trust Manager. It's solving problems that you might not even know you have, that's the publicly trusted bundle thing.
But if there's one thing that you derive from anything I say, try Trust Manager. Like, it really is pretty easy to use, and the more people use it, obviously the better it'll be. The same has happened with Cert Manager. Cert Manager is battle-tested now, and it's come from user feedback. Trust Manager can be the same thing, and I genuinely think it is important.
So, stepping away from the world of Cert Manager just a little bit, we'll look at a story, and this is a slightly different tack to what I've been talking about so far. Again, I'd like to highlight the image that comes with it. This is AI-generated because every talk has to have something like that. I believe Tom's prompt for this was a developer dealing with keys. I kind of like it.
So, imagine a scenario, and I'm sure everyone's been here or done this: a fairly simple application that needs to talk to object storage in some way. That could be S3, it could be GCS, it could be Civo, of course.
If you've been here for the past few talks, a lot of people have been talking about this identity problem and this access control problem. The simple solution to this is Kubernetes Secrets. And that works fine if you're running on your laptop or a small deployment. That works fine if you're running in the cloud. If I'm running, say, in AWS and I need to access S3, I've probably got some role that I can leverage to access that. Like, my pod will have an IAM role, and I can get S3 access through that.
And sometimes that's fine. Again, I don't want to be the Doom and Gloom person that's like "Oh my God, you need to worry." In small deployments, that's totally fine. If it works for you, great. But things get more complicated than that, especially when you're talking multi-cluster, and multi-cloud, or hybrid infrastructure, where you've got on-prem and cloud stuff.
And especially if you're looking at things like Zero Trust that's been mentioned recently. If you need to rotate secrets quickly, or have short-lived secrets, or whatever you do that might lead to more secrets being created, it all sort of proliferates and you end up with more of everything, which ultimately leads to more keys. Which leads to that AI generated image of a guy worrying about keys.
So maybe we need something else here. And I'm not here to present the one true way to solve everything. I've certainly not built that. Cert Manager hasn't built that. But it's worth thinking about. I'd like to plant the seed of the possibilities here. If we get away from the idea of any one cloud provider's identity system and sort of have something that maps over the top of it all, we can maybe do some interesting stuff here.
X.509 can be a building block here, that's the certificates we use today, but it doesn't fully solve the problem, because if I've got a certificate that identifies example.com, that's not necessarily a meaningful identity that I can use to make authorization decisions with.
So people are thinking about this, and one of those groups is SPIFFE, which is the other thing that was mentioned in the title. So what is SPIFFE? We should probably dive into that. Again, since there's so few of us, who's heard of SPIFFE? Again, lots of hands. That's great.
SPIFFE describes itself as a "universal identity control plane for your distributed systems." By that, they mean it's a specification for getting and using identities. Specifically, handling the sort of grubby bits of it so rotating them, replacing them, how you trust them. And it uses x509 natively. I'm not going to talk much about JWT because Cert Manager doesn't deal with those, and they're kind of strictly worse than x509 certificates.
Well, the idea of SPIFFE is that it aims to be a universal identity. It aims to do that thing that I was just talking about where we have this thing that's abstracted from any cloud provider. And this is what it looks like. Or at least this is what an ID looks like. It's not the whole specification obviously.
You have some trust domain, which is probably your organization. It could be example.com. It could be Jetstack.io. It could be Civo, of course. And then after that, you can sort of embed any details you like. It's just an identity. Here we're using a Kubernetes namespace and a Service Account. It could be anything.
Specifically, if we embed this into an x509 certificate, then that's an identity we can use. We could talk about mTLS or other systems, but it's a building block we can start with. And actually, the potential here lies in the universality. If this could apply in more places, the power increases exponentially. The more places you can use your SPIFFE ID, once you've got one, the more power you have.
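As a rough illustration of what embedding means, a SPIFFE ID is a URI of the form `spiffe://<trust domain>/<path>`, and it travels in the certificate's URI SAN. The sketch below requests such a certificate by hand with a Cert Manager Certificate resource. The trust domain, namespace, service account and issuer name are all hypothetical, and this manual form is only for illustration.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-workload-identity
  namespace: sandbox
spec:
  secretName: example-workload-identity
  privateKey:
    algorithm: ECDSA
    size: 256
  # The SPIFFE ID goes in as a URI SAN:
  # trust domain "example.com", then namespace and service account.
  uris:
    - spiffe://example.com/ns/sandbox/sa/example-app
  usages:
    - client auth
    - server auth
  issuerRef:
    name: ca-issuer             # hypothetical private CA issuer
    kind: Issuer
```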
It's not there yet. I don't want to say that I'm selling a complete solution. I'm not selling anything obviously. But I don't want to say this is a complete solution that you can run with today because it isn't. Frankly speaking, it isn't. But it is really worth thinking about.
If we got to a future where SPIFFE, or some equivalent standard, enabled some universal identity, and cloud providers supported that, it would be really cool. Because then this problem of secret management becomes, well, we have this identity. In SPIFFE, they call it the "bottom turtle," the turtle at the bottom of the stack of turtles that allows us to do everything else.
If we get to that world, then we can just fetch these secrets transparently. There's an internal Jetstack demo which I'm not going to do. I'm not going to try and do a demo. I'm not going to do that to myself when I'm jet lagged and last minute. But there's an internal Jetstack demo of transparently accessing an S3 bucket from Google Cloud using SPIFFE. But you don't need to do anything else. You don't need to change your application code. You just go out and you can talk to S3 using credentials for an entirely different cloud provider.
And that works because you exchange for a SPIFFE ID and it works going forward. It's cool. It's a cool thing to think about, and it's a cool thing to play with. Which leads nicely into the CSI Driver SPIFFE, which is another Cert Manager sub project. This is the quote that we use for that. I wanted to have a quote for every section.
It transparently delivers SPIFFE identities to Kubernetes pods that mount this particular CSI driver, by which we mean it's keeping things simple. So, you just ask for a CSI driver SPIFFE mount, and that's what you get. You don't need to run anything else. SPIFFE has a reference implementation (that's a tricky word to say when you're jet-lagged, implementation) called SPIRE. SPIRE is great and does everything, it does the whole SPIFFE spec. But it requires a database, it requires running something that you might not be running today. CSI driver SPIFFE doesn't need that. It needs Cert Manager, and it needs to be installed, but you can play with it today and see what happens. Maybe start to build something around this and see where it goes. Again, I'm talking about the possibilities that could come; it's worth exploring this. And again, it's just X.509 certs. As I said, I'm not talking about JWT specifically, but X.509 can do this stuff, and the technology is there today. So it's worth experimenting with.
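A minimal sketch of what asking for that mount might look like, assuming csi-driver-spiffe is installed alongside Cert Manager. The pod, service account, image and mount path are placeholders. The RBAC that lets the service account create CertificateRequests is described in the project's documentation and is left out here.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  namespace: sandbox
spec:
  serviceAccountName: example-app         # identity is derived from this
  containers:
    - name: app
      image: example.invalid/app:latest   # placeholder image
      volumeMounts:
        - name: spiffe
          mountPath: /var/run/secrets/spiffe.io
          readOnly: true
  volumes:
    - name: spiffe
      csi:
        driver: spiffe.csi.cert-manager.io
        readOnly: true
```

The application then reads its certificate and key from the mounted directory, with no sidecar or extra agent in the pod.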
SPIFFE seem to think so, because we're on the spiffe.io website. You can see Cert Manager CSI driver over there. That's a cool shout-out. It's a sign that we're doing something that's worth considering, at least, right? And that's all I'm saying. Consider this: Cert Manager can do CA certificates, Trust Manager can distribute them to the Kubernetes namespaces that you have already, and with CSI driver SPIFFE you can get the identities that you need based on your namespace or your service account. And together, that kind of looks a lot like SPIFFE, without needing the database and the thinking about it. It's quite a simple way to get started.
And the obvious question is, well, is that SPIFFE? And it isn't SPIFFE. Like I said, it's hard to say, but it's definitely not. The SPIFFE spec is much bigger than this, and includes a gRPC API, and you actually probably do need that database. But it's something to think about, and even if you don't use all of this to do SPIFFE, well, it's still useful stuff. Cert Manager, people are still using it today, obviously. Trust Manager is something that people should be using, and CSI driver SPIFFE is at least planting the seed that we can think about, and see what happens with it.
So, in summary, as I said before, Cert Manager is not just Cert Manager. A lot of people are using it, fewer people are using the sub-projects, and they have a lot of value. They incorporate a lot of learning that we had as a project, as a Cert Manager group. We still care about Cert Manager. I'm not announcing that we don't care about it or anything; it's a huge deal. But we care about the other stuff too, and we think there's a lot of potential there. And it's all open source, and Cert Manager, obviously, is part of the CNCF, so it's stuff that you can go out and use. And if anything, go and try Trust Manager, because there really is a lot of potential there, and if you think about how trust stores are updated, it scares me to think about the kind of deployments that are out there with that kind of problem in.
I've got a warning: if you want to scan a QR code, now's the time to get your phone and point it at the screen. If you don't want to do that, because no one's pointing their phone at the screen, I would implore you, if you use Cert Manager at all, if you've heard of it, if you have a GitHub account, please go on to the Cert Manager repo and drop us a star, because it now displays as us having 10K stars. We've actually got 9,500 or something when I checked just before this, and our collective egos as maintainers require that we get to that 10K mark, because it'll be really cool. So please, if you could, go and star it. Yeah, if you wanted to scan the QR code, there you go. If you don't want to scan it, then there's a link below.
Essentially, all of the documentation for any of these projects is on the Cert Manager website. We try and keep everything there. So if you want to see what other sub-projects we have, or if you want to read more about Trust Manager or CSI driver SPIFFE, go there. And also, it takes a lot of work to run the docs website, so go and check it out.
Thank you all for being here. I hope it didn't seem too last minute, with all the last minuteness that went into it. Please, go and try Trust Manager. Please, keep enjoying Cert Manager, and thank you for listening. If anyone has any questions, I'll be more than happy to chat or go chat out there or anything. So, thank you very much.