Lessons learned from 3 years inside CNCF
Speaker: Cheryl Hung
Summary
Cheryl Hung, Senior Director at ARM and former CNCF Director of Ecosystem, shares her journey in software engineering, from her education in computer science to working at Google and learning about observability, reliability, and scale. She underscores the importance of community building and networking in the tech sector and provides insight into current trends in infrastructure, emphasizing the growing pressure for sustainable solutions in an era of vast data and IoT devices. Cheryl also describes her role in the ARM software ecosystem and ARM's adoption in the cloud due to its cost-effectiveness and power efficiency, touching on the complexities of migrating applications to ARM and the advantages of ARM Kubernetes nodes. Wrapping up, she encourages the audience to network, learn, and give back to the community, while sharing her views on Kubernetes versus serverless and ways to contribute to CNCF without a technology specialization.
Transcription
This conference has been really, really fun, and what an amazing location to be in. This venue and the food here have, I feel, been really, really, really good. So, I really want to say a big thank you to Mark and to all of the staff and the team for making this happen. So, my name's Cheryl, and I want to talk a little bit about some of the things that I've learned through the last couple of years working in CNCF.
CNCF is the open-source foundation that's behind Kubernetes and lots and lots and lots of the other open-source projects that we all use and love.
I'll talk a little bit about some of the most memorable things from my time there and talk about some of the stuff that I'm working on now.
So, a little bit about me: I'm a Senior Director at ARM. That's my kid, that's my daughter on the right. I'm from London, UK, and I like board games, I like long walks, I like new technologies, and yeah. So, if you go to LinkedIn and look at my CV, this is what you would see. So, I want to give you the quick view, and then I want to give you the actual view of this career path.
I started out at Google as a Software Engineer. I was building stuff on Google Maps, adding features for Google Maps. I left in 2016, just as containers and Kubernetes were coming up, moved into Developer Advocacy, started a meetup called Cloud Native London, which is still going, we have about 7,000 members, and joined CNCF in 2018.
So, that's the quick view, and now I want to give you the actual view of how this happened because I thought it was really interesting yesterday listening to Woz when he said, "I'm this tinkerer, I'm an electrical engineer, I do stuff with signals and chips and processes." So, I joined Google, or I decided to join Google when I was a teenager because I heard they had free food and massages and really cool offices. I was like, I don't know what computer science or software engineering is, but I'm good at math, and this sounds like a nice place to work. So, okay, that's it, that's my career path. I'm going to go be a software engineer at Google, and that's it, I'm done for like the next 40 years until I retire.
I did then go and study, and you know, get a computer science degree and go and join Google. But after five years of being at Google, having my projects launch and then unlaunch and so on, I got a little bit disillusioned, let's say. Nothing that I did at Google exists today. Nothing. It's very sad. So, as coincidence happened, I left in 2015/2016, and I started looking around for new technologies out there. I went to a lot of meetups, I went to a lot of conferences, I went and looked at blockchain, I went and looked at different languages, basically just went and investigated all kinds of different technologies. And I realized quickly that containers, Docker containers, and Kubernetes were just what I'd already done at Google.
So, I was like, "Cool, I already know this. I can go out there and talk about it and do stuff." That's kind of how I fell into infrastructure. I was never the sort of person who said, "I'm going to go out and be building big distributed system stuff." It was just like, "Yeah, I've done this at Google, I can go and do this elsewhere." Same with Developer Advocacy. 2016 was really early for Developer Advocacy. I feel like now there's a bit of understanding about Developer Relations and going out and talking to developers. At that time, it was because I was like, "I want to be the sort of person who flies around the world and does keynotes on big stages," and that was kind of it. It was very much, "That's what I wanted to do," and that was it.
Similarly, starting the meetup, Cloud Native London, I started it because I was looking around for places to speak, and there was a Cloud Native Paris and a Cloud Native Berlin, and there wasn't one in London. So, I was just like, "Sure, let's start it and see who shows up." I think I picked a date that was about two months away and said, "I'll figure it out. I have nothing, like no speakers, no venue, no sponsors, nothing." And then at the very first one, over 100 people showed up. So, I was like, "Oh, cool, there's some demand here," and that's kind of how I moved into the open-source community and started to meet people and started to go around.
And then joining CNCF was also kind of a matter of timing and luck. I had a job offer from AWS. Anyone from AWS here? Good, I didn't want to work for AWS, but I had the offer, and I just Slacked Chris Aniszczyk, the CTO of CNCF, and said, "I've got an offer, but is there anything interesting at CNCF that you think I would be a good fit for?" And he said, "We're going to hire for a Director of Ecosystem role in two weeks, and you will be great. You should just apply for that." So that's kind of how it happened. It was just a lot of good luck and good timing. I got into Developer Advocacy really early, I got into containers and Kubernetes early, I got into open source early, and CNCF in 2018 was still pretty early. That was when there was still Docker Swarm and Kubernetes, and it wasn't clear who would be the eventual winner.
So, I wanted to highlight some of the moments that actually stayed with me from my time at CNCF. The first one was called "After CNCF, you could be..." This is when I interviewed with CNCF, so I was waiting on the phone for a call from Dan Kohn, who was the Executive Director. I was getting kind of nervous, I had butterflies in my stomach, and my phone rang. I picked it up, and I said, "Hi, I'm Cheryl," and he said, "Hi, Cheryl, I'm Dan, Executive Director of CNCF, and after you're done with CNCF, you could be... you could run your own startup, or you could go get a job at one of our member companies, or you could be a VC."
And I did this double-take. I was like, "Who's this guy? I thought I was supposed to be interviewing to get this job, and the first thing he says to me, before I've said anything, is 'After you're done with this job, you can go do something else.'" I really had a double-take moment when I was like, "Did I already get this job and I didn't notice?" But I realized over time that's how Dan thought about people. He really thought it's not about, "Can you do it today, yes or no?" It was always about, "How can I set you up in a couple of years' time so that you can have those opportunities to do what you want to do?" So, I've tried to remember this and I've tried to apply this when I talk to other people and I help other people, especially people who are early in their career. It's always very satisfying to say, "After you're done with this, you can do so many more things. I can help set you up for more things." So, I think this has really colored how I look at people and interact with people.
The second thing he told me was, after I joined, this was after I'd actually started at CNCF, "Failure is an option." This was when I was trying to figure out what the heck I was doing there. You know, when you start a new job and you're just like, you get thrown into the middle of projects, you have all these new ideas for things you want to do. So, I brought him this long list and I told him, "Can you help me prioritize my time and what I need to do?" And he listened to me very intently and then he said, "Cheryl, failure is an option."
What he meant by that was, "There's no perfect answer to any of this. It's much better to try a bunch of things and for them to fail than it is to not make any start at all because you're so overwhelmed with what's going on." And I've always thought that was an extremely kind thing to say to someone in their first month of a new job, "Failure is an option." I think we all get a little bit too worried about trying to be perfect in this, and it was just such a relief for him to say, "It doesn't matter. You can try a lot of things, a lot of things won't work out, that's absolutely fine."
And the third one I've just listed broadly as people and timing. And I wanted to highlight this in open source in particular because so much of open source happens when you get people in a room like this, you get conversations happening, and you find someone else has the same problem as you do, and just things kind of work out, and you start solving the same problems. To give you an example here, a couple of years ago, a consulting friend of mine said, "I've been working on this maturity model, so you can look at a company and say, 'Here's your technology, your processes, and your people, and how Cloud Native are you on a scale of one to five from all these things?' And I want the CNCF to publish this."
And I said, "Oh, okay, let's talk about it a bit later," and it kind of fizzled out. About a year later, another friend of mine, at a different consultancy, said, "Cheryl, I've got this idea for a maturity model where we have these different levels about Cloud Native. Can we talk about it?" So, I was like, "Okay, that sounds really familiar." And as it turned out, I did a bit of searching and another company had published their own Cloud Native maturity model. So, I was like, "Okay, if there are three people who are interested in saying this at the same time, it probably means there are other people who are interested in it as well." So, I got them together and formed this group, which is now called the Cartografos Group. Cartografos is Greek for mapping; the idea is they want to map out the journey of Cloud Native. But it was purely because of the timing that these things happened. There was a whole year between the first time I talked to someone and when this actually all kicked off. So a lot of this is just trying to find the right opportunities and the right timing, and being open to new things.
Speaking of timing, I want to talk a little bit about some of the trends in infrastructure generally, leading on to some of the stuff that I'm looking at at ARM.
So, we have more data, more connections, more IoT devices, more everything than ever before. At the same time, we have all these pressures to be more sustainable. We need Net Zero by 2050. Moore's Law is kind of coming to an end; it's not so easy to get performance boosts as it used to be. Markets are in a bad place at the moment, and there's a lot of politics, both local and global, that are impacting our work as well. So on one hand, we've got increasing demands, and on the other hand, we have to do it in a sustainable and sensible way.
A little bit about ARM: here are some things that are the same age as ARM. In 1985, they discovered the wreck of the Titanic, George Michael's Careless Whisper was the best-selling single of that year, Nintendo released Super Mario Bros., Microsoft launched Windows 1.0, and Back to the Future was the biggest movie of that year. ARM architecture has evolved from ARMv1, which was used in the BBC Micro, to iPhones on ARMv7, Raspberry Pi, and now we have MacBooks with ARM chips, like the M2 MacBooks. A few years ago, AWS launched Graviton so that you could actually use ARM in the cloud as well.
The reason for this is simple: ARM has always focused on power efficiency and low cost, which is very desirable for mobile phones but also very desirable for data centers. Of course, it's not just AWS; now all the big cloud providers, including the Chinese Cloud providers, have also launched their own ARM-based services.
On top of that, you have all the other stuff that needs to work with the ARM ecosystem. So, ARM focuses on the IP and the silicon at the bottom level, licenses those out to partners who build and manufacture chips, and then at the top level, you have the software ecosystem, which includes operating systems, languages, and everything else that needs to run on ARM.
That's where I play: the software ecosystem, up at the top. My goal is really to make it super easy and simple to adopt ARM in the cloud. This is a non-exhaustive list of some of the stuff that I work with, my team, and some of the open-source foundations that we work with as well.
This is not a sales pitch, but 48 out of the top 50 EC2 customers use Graviton, including a bunch of famous names. If you wanted to use ARM with Kubernetes, this is just a very broad list, and it's very early, so I would love feedback from you. If you're looking at ARM and you're thinking about it, come chat with me later.
First thought is, it's not just Kubernetes. You don't just have Kubernetes in isolation and nothing else. You need, I don't know, Istio and/or Flux with GitOps, and Prometheus for logging. You actually have many, many applications. So if you wanted to use ARM, this is really a migration game. You have to look at all of your applications and figure out which of those have ARM images, what versions they work with, and what features they have that work with what you're currently running. So Kubernetes is not just Kubernetes; you've got many, many other things to look at.
Secondly, you would probably start with x86, and you cannot move all of those to ARM in one go. That means you might be mixing some x86 with ARM nodes at the same time. You might have your control plane in one and the worker nodes in the other. This can cause some follow-on changes, such as having to change how you create clusters because now you have different kinds of nodes. If you have daemon sets, that can be a concern because they run on every single node, and if there are different nodes, that can be a problem as well. Or maybe you need multi-arch support, which again leads to a little bit more complexity, and how do you test it?
Your workloads now have different performance characteristics, so does that mean that you need to put different kinds of limits on the different architectures? And then, how do you actually control it? If you have a mixed pool of x86 and ARM nodes, then you need to possibly use node affinity or taints or tolerations to control and say, "These things can work on these nodes, but they can't work on these nodes," and so on.
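As a rough illustration of the controls mentioned above, the sketch below builds a Pod manifest as a plain Python dictionary. It pins the pod to one architecture with node affinity on the well-known `kubernetes.io/arch` node label, optionally adds a toleration, and picks a CPU limit per architecture. The taint key `arch`, the limit values, and the image name are assumptions made for the example, not recommendations.

```python
import json

# Well-known node label that reports the node's CPU architecture.
ARCH_LABEL = "kubernetes.io/arch"

# Hypothetical per-architecture CPU limits; measure your own workloads.
CPU_LIMITS = {"amd64": "500m", "arm64": "750m"}


def pod_manifest(name: str, image: str, arch: str, tolerate_taint: bool = False) -> dict:
    """Build a Pod manifest that may only be scheduled on nodes of `arch`."""
    spec = {
        "containers": [
            {
                "name": name,
                "image": image,
                "resources": {"limits": {"cpu": CPU_LIMITS[arch]}},
            }
        ],
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {"key": ARCH_LABEL, "operator": "In", "values": [arch]}
                            ]
                        }
                    ]
                }
            }
        },
    }
    if tolerate_taint:
        # Assumes the cluster operator tainted the nodes with arch=<arch>:NoSchedule.
        spec["tolerations"] = [
            {"key": "arch", "operator": "Equal", "value": arch, "effect": "NoSchedule"}
        ]
    return {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": name}, "spec": spec}


if __name__ == "__main__":
    manifest = pod_manifest("web", "registry.example.com/web:1.0", "arm64", tolerate_taint=True)
    print(json.dumps(manifest, indent=2))
```

Affinity says where a pod wants to run, while a taint on the ARM nodes keeps everything else off them unless it carries the matching toleration, so the two are often used together in a mixed pool.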
These are very quick highlights, and again, if you are looking at ARM or you're using ARM and you want to come and chat with me about it, please do. I will be here all day, and I'd love to hear your experience with it, particularly because I have a talk at KubeCon about multi-arch, so I have about 70 days to figure out how it works and to hear some experiences from you.
Right, ARM on ARM. You wouldn't be a good tech company if you weren't dogfooding your own products. ARM has a very high-intensity process called EDA (Electronic Design Automation), which is part of the design phase when you're making a new chip. This used to run on x86, and they moved it to Graviton in 2019. The nice thing about it is that it was 60% faster, saved a bunch of power, and lowered the costs. So ARM looked at this and thought, "Okay, this works for us, so hopefully this works for you as well if you want to try it."
The last thing I want to emphasize is the importance of people and timing. There are a lot of amazing people in this room. Even just being here yesterday over lunchtime, I chatted with some amazing people who are working on really niche stuff and really interesting problems that I hadn't come across before.
So I really want to encourage you to go and talk to as many people as you can today. Talk to the other people in this room, make friends. There are so many opportunities out there in the world that you're not going to be able to find by yourself, so the more people you talk to, the more you can find those opportunities. I think the most satisfaction that I had in my time at CNCF was encouraging one of the Technical Oversight Committee members to run for it, and they'd never even thought about something like that before. But I told them, "You're a great community member, you're incredibly knowledgeable, you should do this," and they ran for it and got the spot. Encouraging people to come and speak at the meetup that I run, especially young people, has been really satisfying for me. So when you have these opportunities as well, pay them forward. Offer other people the opportunity to get involved with things that you're doing as well. And that's it for me. Thank you so much. These slides are on my blog at oicheryl.com. Thank you.
Thank you, Cheryl. Would it be okay if we got some questions from the audience? We have a few minutes.
Yeah, sure.
Does anybody have a question that they'd like to ask? Somebody has to have a question.
So, being involved in CNCF, you probably would have seen a lot of trends that are going in the infrastructure space. So what do you think would be the majority in the future? Would it be serverless, or would it be Kubernetes-based ecosystems and partner companies?
I would say serverless hasn't taken off to the extent that perhaps it was hoped a couple of years ago. I certainly see Kubernetes as the default. I think serverless will continue to be useful for gluing bits of applications together, but for me, Kubernetes has that right level of abstraction. It gives you the right amount of control, and it's easier to do the migration. So definitely Kubernetes for me.
How can we contribute to CNCF? Do we have to be specialized in Kubernetes or Terraform to contribute to CNCF, or as a normal Cloud engineer or a DevOps engineer, can we still contribute to CNCF?
You definitely don't have to be specialized in any particular technology. I think there's space for everyone in the community. There's a lot more than just code that is valuable. The experiences that you bring are really valuable, the things that you know that you can teach other people. So you definitely don't have to be a hardcore Kubernetes contributor to be part of the community. I would recommend going and talking to other people, finding out what they're doing, what problems they're solving, and getting into the community that way. And do some public speaking as well.
You mentioned using ARM Kubernetes nodes. I don't really work at the chip level, but what would the benefits of that be?
Lower cost, basically. The quoted numbers are 20 to 40% lower. That's what Airbnb found. It's not actually 20 to 40% necessarily because you have to run multiple pipelines, so you sort of spend a little bit more on your pipelines. But effectively, it comes down to cost and power consumption.
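To make that caveat concrete, here is a small back-of-the-envelope sketch. The monthly figures are hypothetical; only the 20 to 40% range comes from the answer above. It subtracts the extra pipeline spend from the gross saving and expresses the result as a fraction of the original x86 bill.

```python
def net_saving_fraction(x86_compute_cost: float, arm_discount: float,
                        extra_pipeline_cost: float) -> float:
    """Fraction of the original x86 compute bill saved after extra pipeline costs."""
    if x86_compute_cost <= 0:
        raise ValueError("x86_compute_cost must be positive")
    gross_saving = x86_compute_cost * arm_discount
    return (gross_saving - extra_pipeline_cost) / x86_compute_cost


if __name__ == "__main__":
    # Hypothetical: 10,000 per month on x86 compute, 500 per month for the
    # additional ARM build and test pipelines.
    low = net_saving_fraction(10_000, 0.20, 500)
    high = net_saving_fraction(10_000, 0.40, 500)
    print(f"net saving: {low:.0%} to {high:.0%}")
```

With these made-up numbers the quoted 20 to 40% shrinks to roughly 15 to 35%, and the gap narrows as compute spend grows relative to pipeline spend.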
Thank you.
You're welcome.
Cheryl, thank you so much.
Thank you for having me.