Kubernetes at the Edge with Portainer
Speaker: Neil Cresswell
Summary
Dive into the world of Kubernetes at the Edge with Neil Cresswell of Portainer. In this talk, learn how Portainer transforms container management by making Kubernetes deployment at the edge effortless and efficient.
This talk shows how to streamline container management and orchestration by simplifying Kubernetes deployment at the edge, making deploying, updating, and maintaining Kubernetes clusters on edge devices more streamlined and accessible. This benefits industries like telecommunications, healthcare, and manufacturing, where low latency and high reliability are critical.
Transcription
We're an abstraction layer for Kubernetes that aims to make Kubernetes and Docker relatively simple. Here we have Adolfo, our SE Manager; he's going to take a risk and do a live demo, and he's feeling really skeptical about it. I always say of my demos: if they backfire, well, it's a gamble. It's a gamble. Yeah, so I'm going to talk for a little bit. I'll try to keep it short to leave as much time as possible for Adolfo's demo, but feel free to interject and ask questions. So anyone who creates applications knows that containers, and by extension Kubernetes, are the way you ship applications. It's the way now. It's the given. It's the only logical way.
Is that better? Is that better? I always think that I'm a loud person and everyone can hear me naturally. But the edge is a different beast. And if you think about Kubernetes being quite complicated, you'd think Edge and Kubernetes together would elevate complexity quite dramatically. But there are a number of really good reasons why Kubernetes at the edge just makes sense, and I'm going to go through them. So think back to how you used to ship applications: it was always with installers. You had MSI installers or other types of installers, and they were a nightmare. You were always fighting library mismatches, version conflicts, all that stuff. It was an absolute nightmare. And one of the major benefits of Kubernetes is that we finally have a recipe, a means to declare how we want our application to run, and that's the manifest. The manifest really is a game-changer. In my career, it's the first time I've ever seen a way to declare how your application runs and have it run that way everywhere.
The Kubernetes manifest at times is quite overwhelming. It truly is. But once you get your head around how it works, it is just a natural way to ship and publish your application in a way that is infinitely reproducible. No matter where you run the manifest, it'll pretty much run the same way every single time. Take Portainer, our own product: we ship it as a Kubernetes manifest. Now, I don't know how you all have your Kubernetes clusters configured, and it doesn't actually matter. You take our manifest, you apply it, and Portainer will be running for you in seconds, in a very predictable way. It'll deploy the same way every time, and we achieve that by using things like default storage classes, where we simply say: inherit what's default in the cluster and run with persistence, with a load balancer, with Ingress. We inherit that, and that's one of the big benefits.
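To make that concrete, here is a minimal sketch of the pattern described above (illustrative names, not Portainer's actual manifest): the PersistentVolumeClaim omits storageClassName so it inherits the cluster's default StorageClass, and the Service asks for whatever LoadBalancer implementation the cluster provides.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi              # provisioned by the cluster's default StorageClass
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 1
  selector:
    matchLabels: {app: app}
  template:
    metadata:
      labels: {app: app}
    spec:
      containers:
        - name: app
          image: registry.example.com/app:1.0   # hypothetical image
          ports:
            - containerPort: 9000
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data
---
apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  type: LoadBalancer             # inherits whatever load balancer the cluster provides
  selector: {app: app}
  ports:
    - port: 80
      targetPort: 9000
```

Because nothing here names a provider-specific storage class or load balancer, the same file applies cleanly on any conformant cluster.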
So you've embraced the awesomeness of the Kubernetes manifest, now we want to build on it. So with the same manifest, you can take your application and you can move it from one cloud to another or one cluster to another. You can scale that. You can actually scale it horizontally or vertically. You can actually move things around. You can say, I want to take this application and I want to move it from cluster one to cluster two. Or I want to run it in cluster one and cluster two. Now I want to add cluster three. These can be different geos around the world, and it really does open up some amazing opportunities. If you think about how hard it was in a virtual machine world to have full active-active data centers, dynamic data centers, it was an absolute nightmare, and Kubernetes has really unlocked that. So now, you've got the opportunity to spin up clusters anywhere in the world. You can dynamically spin up your applications. So you can say, right, I want an application running in the US during US business hours, but I want to shut it down after hours, and I want to bring it up in Asia overnight. You really have this opportunity to dynamically reconfigure. You also have the opportunity to say, I want to use lower-cost cloud services for bulk data processing and more expensive cloud providers for core systems of record. So you really have this opportunity to bounce things around.
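As a hedged illustration of that portability and scaling story (the context names and thresholds below are assumptions, not from the talk): the same manifest can be applied to any cluster just by switching kubeconfig contexts, and horizontal scaling can be declared in the manifest itself.

```yaml
# The same manifest file works against any conformant cluster:
#   kubectl --context cluster-one apply -f app.yaml
#   kubectl --context cluster-two apply -f app.yaml
# Horizontal scaling declared alongside it:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app                    # the Deployment from the earlier sketch
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% average CPU
```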
So "the edge" is an often overused term, and for the purposes of this talk, we're going to define edge as the network edge. When we say network edge, what we're talking about is putting applications closer to users, to reduce network latency and reduce bandwidth. The whole point is: how do I get an application close to my user so they get a snappy, responsive application experience, and I pay less for bandwidth between my user and the backend system? Edge is all about getting applications out there that help with things like ingesting, sorting, and filtering. Think about logs: why send thousands and thousands of logs from an edge location back to the data center when 99% of those logs are useless? You filter them at the edge. You look for key metrics, and you send back only the signals that matter. Everyone is familiar with a CDN. A CDN is just a network edge technology. It's about putting your application right out there, so users get a really snappy response. Think about Spotify and YouTube; they all work the same way.
But equally, latency is a really important metric. You know, I'm from New Zealand, and when I'm over here, my internet banking application is completely unusable, because it is trying to make thousands of API requests to a back-end server in New Zealand. That latency just breaks the application. By having API endpoints closer to your users, you get a much faster application experience.
And one of the often misunderstood opportunities with edge is the separation of data and control. You can say: I want an application running at the edge that ingests data from a user and sends it one way, while control comes from the other. For example, you might replicate database records up to an upstream system of record, but have the control and communications for this come from your application vendor. So control commands flow one way, and data flows the other.
So, Kubernetes was born from day one predominantly for stateless services and the Data on Kubernetes Working Group have done a really good job helping to transition this to more stateful services. But predominantly, Kubernetes is around stateless and edge applications are also predominantly stateless. They're designed for ingesting, churning, buffering, and caching. They're not designed really to hold data. So, Kubernetes, a stateless orchestrator, and edge stateless workloads really are a perfect match. So, it just makes sense.
However, there are challenges. Kubernetes is a standardized API; there's a whole reason you get certified Kubernetes. If you're using a certified Kubernetes distro, you know it is 100% compliant with the Kubernetes API. The problem is that a lot of Kubernetes providers want to "add value", in quotation marks, on top of the native Kubernetes API, and a lot of that added value locks you into their system. So you need to be really careful with things like authentication, layer 7 load balancing, and registry authentication. Once you start using these from the provider, they may lock you into that particular vendor. Now, the ones in the middle here, the standard distributions, Civo, Linode, DigitalOcean, basically just give you raw Kubernetes. They don't attempt to touch it; it's just the raw environment, and that is the most compliant. On the right-hand side, Azure and Google are also fairly raw environments; they add some value-add, but are still quite compliant. On the left-hand side, Amazon and OpenShift really are quite radically different, and you have to be quite careful. Once you start using those systems, you really are locked into those providers, and it makes it very difficult to have a fully portable workload.
So, multi-cluster Kubernetes management is quite a bit different from single cluster. If you think about a single cluster, you can go and manually create users, you can create your RBAC roles and role bindings, you can go and deploy applications there manually, or use dashboards, whatever you want, it's quite simple. When you get two, three, four, ten clusters, things start to get a little harder. When you're talking edge though, you really are talking scale, and you really have to change the way you think about how to manage this. And you have to start thinking, how do I do this all centrally? How do I have users authenticate somewhere centrally, and those users automatically get propagated to back-end clusters? How do I define RBAC roles somewhere centrally and have those propagate? You can't go installing the OAuth extension in all your clusters and managing the RBAC roles. You just can't do that. So really, the three key points there - centralized access and access control, monitoring dashboards and centralized deployments, you really can't ignore those. How do you do this at scale?
When you have multiple clusters, how do you provide auth access to your users? How do you log them in, and wherever they are logged in, how do you define their RBAC roles? You don't want to do that one by one, cluster by cluster. There are products like Okta, products like Teleport, and Portainer as well that help you do this. But you really have to think about how you get your users authenticated. You can't go handing out kubeconfig files. You really do have to say: I want a single API endpoint that every developer or consumer can connect to, which proxies to the backend. This is how you manage things at scale.
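To illustrate what gets propagated, here is a minimal sketch of the kind of RBAC definition you would author once, centrally, and have your tooling push to every cluster (the role, namespace, and group names are illustrative; how propagation happens depends on the tool, for example Portainer):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: app-developer
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "pods/log", "deployments", "services"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-developer-binding
  namespace: app-team            # scopes the cluster-wide role to one namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: app-developer
subjects:
  - kind: Group
    name: developers             # group asserted by your central OAuth/OIDC provider
    apiGroup: rbac.authorization.k8s.io
```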
Same thing with dashboards. Everyone automatically thinks: okay, I've got Kubernetes, I'm going to install Prometheus and Grafana, and I've got a monitoring dashboard. Yeah, that's cool, but you don't want 47 different dashboards you have to open. You want a macro, global view of all your clusters. You can do that with Prometheus and Grafana, but you have to architect it that way. You can't just go and install it in each cluster. You actually have to install the Prometheus edge agent in each cluster and send the streams back to a central Prometheus instance. You have to configure things correctly, so you have to think differently.
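A minimal sketch of that pattern, assuming each cluster runs Prometheus in agent mode (started with --enable-feature=agent) and forwards samples to one central Prometheus via remote_write; the URL, labels, and credentials are illustrative:

```yaml
# prometheus.yml on each edge cluster
global:
  scrape_interval: 30s
  external_labels:
    cluster: edge-frankfurt      # identifies this cluster in the central view
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod                # discover pods to scrape locally
remote_write:
  - url: https://prometheus-central.example.com/api/v1/write
    basic_auth:
      username: edge-agent
      password_file: /etc/prometheus/remote-write-password
```

The central instance then holds all the series, labeled per cluster, and a single Grafana on top of it gives you the global view.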
And bulk deployments are actually quite complicated. How do you take an application and say, I want to deploy this application? I want to do so 50 times, 100 times, a thousand times in the same way, and I want to update it and keep it updated and maintained? You'd automatically default to GitOps and that would be right, but that's only really half the answer. How do you get the initial install configured? How do you ensure that all 50, all 100, all a thousand clusters are running the latest version? How do you ensure that? Chick-fil-A, I've put a little link down there, Chick-fil-A rolled out a couple of thousand clusters and they started using GitOps and found out very early on that you can't just deploy GitOps and pray that it's going to update. You actually have to have some centralized dashboard to see the current status of all of these deployments. Are they running? Have they been updated? What version are they running? So, I strongly recommend reading it. There's a little extract here, but basically, they said their GitOps model started really simple, just traditional GitOps, but they learned really quickly that you have to have a center of control. We found the same thing and we've added all this capability to Portainer as well.
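For reference, this is what a declarative GitOps definition looks like. The sketch below uses Argo CD's Application resource as one common example; it is an assumption for illustration, not the tool Chick-fil-A or Portainer uses (Portainer's own GitOps mechanism appears in the demo later), and the repo URL and paths are made up:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: edge-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/edge-app.git   # illustrative repo
    targetRevision: main
    path: deploy
  destination:
    server: https://kubernetes.default.svc
    namespace: edge-app
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert drift back to the Git state
```

Note that this declares the desired state for one cluster; the point of the Chick-fil-A lesson is that across a thousand clusters you still need a central view of whether each one actually converged.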
So, how do you ensure success? You've got to get tooling. You can't do this without tooling. You can't manage thousands of clusters, or hundreds, or to be honest even dozens of clusters without tooling. You really have to get centralized user access: identity management, access management, PIM/PAM. You've got to get this. You've got to get the dashboards. One of the hardest things with Kubernetes is learning what you don't know. How do you know what's possible in the cluster? How do you know what commands are there? How do you know what's running? Dashboards really do help you see things instantly: you can see, okay, there's a problem there, and you can go and triage it. Even with really good alerting set up, nothing takes away the benefit of glancing at a dashboard and seeing there's a problem there, something's not running right. And finally, you've got to get centralized deployments.
So for Portainer: we have our platform, and I don't want to turn this into a vendor pitch, but our platform provides a lot of these capabilities as a high-level abstraction. We try to make things really easy and to be the one tool you need to manage Kubernetes, whether it's one cluster, 10 clusters, a thousand clusters, or multiple thousands of clusters. So: a full Kubernetes API proxy, multi-cluster management, centralized identity management, dashboards, monitoring, everything you need to go live with Kubernetes in production at scale. So Adolfo, your turn, plenty of time. Thank you.
Let's see about that. Thank you. Well, the demo, as Neil mentioned, is really based on having Kubernetes at the edge. The concept of edge is not necessarily devices; it's just the concept of being distributed around a specific geography. So, we set up Portainer in a way where, in this demonstration, we're using the capabilities of the Civo Cloud. We set up three Kubernetes clusters in three sites, almost four, but I suppose that's a spoiler. The idea is to show, in the architecture, how you can quickly grow your Kubernetes edge estate using the concept of a centralized console and repeatability. I'm not going to go through this too much; you can download it later and check it out. But I think the visuals speak better than words.
The idea is an architecture for a live streaming platform based on RTMP. We're going to have a cell phone here acting as a little endpoint that's going to stream you guys. It starts with an ingest server that's going to distribute the stream equally to all three sites. It could have been four, but I didn't have time for the fourth one. Finally, we have an app stack that's going to show the three streams on the three Kubernetes clusters, each one running on a different site: New York, London, and Frankfurt. So, I'm going to start with SSO. I'm going to keep going back and forth. I'm going to log out of Portainer here.
The first very important component of having access to a centralized system is being able to control who accesses these systems. Here in this Portainer instance, what I have is a mix of data center and edge nodes. So let me group these by the groups I've created.
We have the main Portainer server, which is what you're seeing now, running on Civo New York, if I'm not mistaken. I have a bare metal machine running in Helsinki of all places, and an AWS data center running in Ohio. And I have the three edge nodes running on the Civo Cloud, one in each of the sites. So, what I'm going to do, back to the presentation, is start, as you see, with SSO and bringing the ingest server up. This is where we're going to start sending the stream and the information to the platform. The ingest server is going to be running in Helsinki, and the method I adopted in this case was GitOps. Why not? It's one single server, I have control, and I'm thinking of only one instance; if I needed to grow, there's another method that I'll show soon, but here GitOps gives me the control I need. Basically, I'm going to connect Portainer to a Git repository for this environment. I give it a name, ingest-server, and as soon as I connect, Portainer tells me: you need authentication, right? So, again, control: you have to make sure you have all the tools to control your deployment. We have a little feature here where we can cache credentials for this environment, and here I can just pull the file, yes, there you go. I'll activate a webhook here, and, actually, I'm not going to force redeployment, sorry. So what I'm basically doing here is telling Portainer: talk to this repository, pull the code, which I'll show later, and deploy it onto the server, or better said, onto this cluster. That's the first stage of the process, which should show up here very soon. Let's see; there you go. So, this is my ingest server that's going to receive all the connections and spread them to the other edge clusters. Right, then what's next?
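The manifest itself isn't shown on screen, but as a hypothetical sketch of what the repository might contain, an RTMP ingest server could be deployed like this (the image choice and names are assumptions; 1935 is the standard RTMP port):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rtmp-ingest
spec:
  replicas: 1
  selector:
    matchLabels: {app: rtmp-ingest}
  template:
    metadata:
      labels: {app: rtmp-ingest}
    spec:
      containers:
        - name: nginx-rtmp
          image: tiangolo/nginx-rtmp:latest   # hypothetical choice of RTMP server
          ports:
            - containerPort: 1935             # standard RTMP port
---
apiVersion: v1
kind: Service
metadata:
  name: rtmp-ingest
spec:
  type: LoadBalancer
  selector: {app: rtmp-ingest}
  ports:
    - port: 1935
      targetPort: 1935
```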
Let's bring up one edge environment; let's bring up Frankfurt. Back to Portainer. Now, this is slightly different, and this is what we think of as better control over the way you repeatedly deploy your manifest onto your clusters. You want repeatability, you want control, you want to make sure it's the same everywhere, because you don't want surprises, right? So, it's slightly different from plain GitOps. What we do is create in Portainer what we call an edge group, which is going to concentrate all my edge Kubernetes clusters. The logic is that via tags, and I have a predefined tag called RTMP, I group every cluster that has this tag into this edge group. So, I've created the group, and now I'm going to connect a stack to the group. What's going to happen is that every time I tag an environment with this RTMP tag, Portainer understands that it has to deploy this manifest onto the devices connected to Portainer under that group. So, let me add the stack: RTMP Edge. I'm connecting the stack, which is the manifest I'm going to load into Portainer, to this group. It'll get easier and clearer very soon. And again, I'm going to use GitOps once more. I go here, but now with another manifest; this is it. You tell Portainer where to get that manifest; again, I need authentication, which is cached, and in this case it's RTMP, there you go. And here, I'm telling Portainer to use credentials, because I'm using a private registry for my containers. I'm deploying the stack. So, basically, what I have is a flow where I have a group and a stack, and they're linked, but there's nothing happening yet. I need to activate one of the edge Kubernetes clusters, and the way I'm going to do that is just go here, start with Frankfurt, and tag it with RTMP. What Portainer is going to do now is realize that I have a device in this group and send that stack to that device. It went quite quickly because it's a small little container. So, every environment that I tag with RTMP is going to fall into this group, get the stack, and get deployed. Yep, got it. Right, let's see; now here comes the magic.
Neil gets very nervous. The network is quite slow, a little bit laggy. Yes, right. So, back to the flow. The next thing we're going to do is really show this stream happening by deploying the app stack.
So, I'm going to go back to the Portainer home, and I have the app stack. Why do I have it in a bigger data center? As Neil mentioned, there's the concept of having locations where you want to concentrate the big bulk of data, because all these endpoints are sending their RTMP streams to this bigger data center, where I can process them, transcode them, etc. So, here, what I'm going to do is use another method of repeatability, which is deploying the manifest using a custom template I have here.
I have that already in the Portainer database, set to be deployed wherever I want, to however many clusters are being managed by Portainer. Let's see if all goes well; very soon it should show up here, okay, and there we go. Right. It is laggy, so it might take some minutes until it refreshes. Oh, there you go, that was quite fast. Sorry, the autofocus is not very good. But as things grow, you want to grow, right? You need repeatability: you need to deploy this in London, you need to deploy this in New York. For the sake of time, I'm just going to rush it a bit and deploy this straight onto the other edge clusters. How? Very easily: I'll just go to Portainer, my environments, and say, London, you're up. Update the environment. New York, you're up. Now, Portainer is going to ensure these get the stack; it's acknowledging on the third cluster, and finally, it's up. So, if I refresh this, all three sites should be up very soon. I might need to redeploy my ingest server. How much time, okay, two minutes? Maybe I'll have a minute to redeploy this; let's see.
Repository, Kubernetes authentication... a little glitch, but it will be back. As you know, if a demo doesn't go wrong at some point, it's not a demo, right? All right, I just want to make sure all three sites are up. Portainer has been load-tested up to about 15,000 clusters, and we're scaling beyond that as well. We have customers who've got upwards of 50,000 environments, and we have a customer who needs us to expand this to 125,000 clusters by the end of this year. So, Portainer is going through a massive scaling exercise to support that. From one central location, you can manage hundreds of thousands of devices. At the moment it's about 15,000 clusters, but scaling pretty quickly. We've just signed up for a 50,000-cluster test environment to start doing the scaling testing; imagine how much that costs in the cloud, I might add. Okay, come on, come on, come on. And you see here, the other thing is that it makes it really simple to deploy applications. You kind of choose your own destiny: if you want to deploy with no code, you can; if you want to deploy with GitOps, you can. But everything is happening within the Portainer system, so you don't need to break out and use multiple other tools.
Let's see if my ingest server, if everything is still going. Yeah, right. Well, sorry, a little glitch because I disconnected the servers, so I'll have to restart the stream. Ten more seconds, and we are done.
You know, in the meantime, you can tell some jokes. No, I'm not telling jokes. Fine, there you go: four connections, and we should see all three screens coming up. So now we have New York, Frankfurt, and London.
Thank you.
Now, the final architecture is, again, here, just to conclude; yeah, there it is. So, we have a booth; if you want to see more, come and have a chat with us, and we can show you more about Portainer, beyond just the edge stuff as well. Oh, yeah, interesting, right: I did some tests yesterday, and suddenly Frankfurt was faster than New York. And I realized it's not really about distance; sometimes it's about capacity. At certain times of the day, New York is busier than London, so the lag in London would be smaller than in Frankfurt. It really depends on the time of day and the traffic being sent to that endpoint. You would think, oh, New York is always going to be faster, but not necessarily, not in this case. No, I mean, this was a very simple demonstration; honestly, there are elements I left out just to make it faster to deploy. But, yeah, definitely, you would have to have some sort of load balancing or geo balancing to make sure you're sending the stream to the right server. Yeah, one of our ISV partners is a company called LiveSwitch; they actually do this as their full-time software stack. We were planning to use it today in the demonstration, but we couldn't get it done in the timeframe of the demo, so we had to do a really cut-down version.