In this podcast, Daniel Bryant sat down with Bryan Liles, senior staff engineer at VMware. Topics covered included: the challenges with deploying applications into Kubernetes, using the open source Octant tool to increase a user’s understanding of Kubernetes clusters, and how "serverless" technologies may influence the future approaches to building software.
Key Takeaways
- Octant is a highly extensible platform for developers to better understand the complexity of Kubernetes clusters. Octant runs locally, using the local Kubernetes credentials. It currently displays information about a Kubernetes cluster and related applications as a web page. Soon this tool and resulting display will be provided as a standalone application.
- The goal of Octant is to enable users to discover what they need to discover. The tool aims to provide context relevant to where a user is and what they are trying to achieve. The Octant plugin system allows integration with other tooling, such as logging and metrics frameworks. This aims to facilitate quick problem detection and resolution.
- Cloud native platforms like Kubernetes are complicated, as there are lots of moving parts. The most important challenge to be tackled to increase the adoption of platforms like Kubernetes is “how do we move code from our IDEs to wherever it needs to run with the least amount of friction?”. Testing needs to be implicit, as does security verification, and the acts of deployment. Kubernetes needs its “Ruby on Rails” moment.
- Creating “serverless” systems is an interesting approach, but we may currently be using this technology in a non-optimal way. For example, creating web applications using this technology enables scalability, but can lead to the creation of difficult to understand systems that also require a lot of boilerplate configuration. Arguably, a more interesting use case is implementing large-scale batch processing using simple event-driven models.
- The Cloud Native Computing Foundation (CNCF) has created a series of communities of practice called Special Interest Groups (SIGs), such as SIG App Delivery. This allows folks with similar interests to work together as a community, focusing on solving a specific set of well-scoped problems. There are many ways to get involved, from discussions, to coding and creating documentation.
Show Notes
What has been happening to you in the last few years?
- 01:25 I have been doing cloud for a long time: creating servers and serving software before there was a cloud, working at the first SaaS company, and helping to build a cloud at DigitalOcean.
- 01:45 That brings us to 2016, when I went to work at a large bank and realized a couple of things.
- 01:50 Firstly, it's interesting to see what a large team of people can do.
- 01:55 Secondly, banks are interesting, but not interesting for me.
- 02:00 I had run across Heptio in the past - two years ago, I was hired.
- 02:15 I worked there for a year, working on a research project about creating better configuration abstractions for Kubernetes, called ksonnet.
- 02:30 We retired that, and while we were retiring it, VMware wanted to join us; that deal closed in December 2018, and I've been at VMware ever since.
- 02:45 Now at VMware, I'm looking at developers using Kubernetes, and we're researching methods to make Kubernetes easier to use for developers.
What do you think is the biggest hurdle for developers migrating from VMs to Kubernetes?
- 03:15 The problem with Kubernetes is the same as everything else; it's new, and it's complicated.
- 03:30 Kubernetes will work in a cloud, on a Raspberry Pi, in data centers and many different types of networking and storage systems.
- 03:35 There are a lot of pieces there to make it work.
- 03:40 A good analogy would be to when you were using a relational database for the first time, querying with SELECT.
- 03:50 Afterwards, you have to learn about JOINs, and then DISTINCT, GROUPing, then sub-SELECT, indexes - it gets complicated.
- 04:00 Kubernetes is no different to that; a complicated system that can do many things, and because it's different to what we have been doing before, people shy away from it.
Is Octant aimed at helping people on their Kubernetes journey?
- 04:25 Octant has an interesting story: over a year ago, we were thinking about what is hard with Kubernetes.
- 04:30 One question that kept coming up was: is something broken?
- 04:40 How do you troubleshoot an application? You use kubectl, which is a Swiss-army knife command-line tool.
- 04:50 You can use kubectl get, or kubectl describe, but what are you describing?
- 05:05 We know what the objects are in Kubernetes, and we know how they are trying to behave, so we can detect misbehaviors.
- 05:10 We can build up a graph of objects in Kubernetes, whether implicit or explicit.
- 05:20 With that, we can tell you what's going on, and what's wrong.
- 05:25 After we built that tool to prove that it can work, then we realized we needed all the context.
- 05:30 So, Octant is a tool that has a dashboard, to see what's going wrong in an application, along with plug-ins for CI/CD, security etc.
- 05:55 A lot of tools before Octant were installed in the cluster, and to access them you either had to give the tool super-user access or upload your own credentials.
- 06:10 Both of those are hard: if there's an issue, and you have super-user access, then a problem could be newsworthy.
- 06:15 Guess what - someone ran into an issue with a Kubernetes dashboard, and got into a newspaper.
- 06:20 If you upload your credentials, that works, but it's not friendly to the user.
- 06:25 Right now, Octant runs locally on the machine and you can use your local credentials.
- 06:35 It presents a web page at the moment, but we're moving it to a standalone application that runs on Windows, macOS, or Linux, which you can use to explore your cluster.
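The idea of building a graph of Kubernetes objects from explicit and implicit links, then flagging misbehavior, can be illustrated with a minimal sketch. This is not Octant's implementation; the function names and the sample data are hypothetical. It assumes objects shaped like the JSON that `kubectl get -o json` returns, treats `ownerReferences` as explicit links, and treats a Service selector matching Pod labels as an implicit link.

```python
from collections import defaultdict

# Hypothetical sample data, shaped like a trimmed-down `kubectl get -o json` result.
SAMPLE = [
    {"kind": "Deployment",
     "metadata": {"name": "web", "uid": "d1", "namespace": "default"}},
    {"kind": "ReplicaSet",
     "metadata": {"name": "web-abc", "uid": "r1", "namespace": "default",
                  "ownerReferences": [{"uid": "d1"}]}},
    {"kind": "Pod",
     "metadata": {"name": "web-abc-1", "uid": "p1", "namespace": "default",
                  "labels": {"app": "web"},
                  "ownerReferences": [{"uid": "r1"}]},
     "status": {"containerStatuses": [
         {"name": "web", "state": {"waiting": {"reason": "ImagePullBackOff"}}}]}},
    {"kind": "Service",
     "metadata": {"name": "web", "uid": "s1", "namespace": "default"},
     "spec": {"selector": {"app": "web"}}},
]


def build_graph(objects):
    """Return an undirected adjacency map: uid -> set of related uids."""
    known = {obj["metadata"]["uid"] for obj in objects}
    edges = defaultdict(set)

    def link(a, b):
        edges[a].add(b)
        edges[b].add(a)

    for obj in objects:
        meta = obj["metadata"]
        # Explicit links: the object names its owner directly.
        for ref in meta.get("ownerReferences", []):
            if ref["uid"] in known:
                link(meta["uid"], ref["uid"])
        # Implicit links: a Service selects Pods in its namespace by label.
        if obj["kind"] == "Service":
            selector = obj.get("spec", {}).get("selector") or {}
            for other in objects:
                other_meta = other["metadata"]
                if other["kind"] != "Pod" or not selector:
                    continue
                if other_meta.get("namespace") != meta.get("namespace"):
                    continue
                labels = other_meta.get("labels", {})
                if all(labels.get(k) == v for k, v in selector.items()):
                    link(meta["uid"], other_meta["uid"])
    return edges


def find_problems(objects):
    """Return (pod name, reason) for every container stuck in a waiting state."""
    problems = []
    for obj in objects:
        if obj["kind"] != "Pod":
            continue
        for status in obj.get("status", {}).get("containerStatuses", []):
            waiting = status.get("state", {}).get("waiting")
            if waiting:
                problems.append((obj["metadata"]["name"],
                                 waiting.get("reason", "Unknown")))
    return problems


if __name__ == "__main__":
    graph = build_graph(SAMPLE)
    print("related to pod p1:", sorted(graph["p1"]))
    print("problems:", find_problems(SAMPLE))
```

With the graph in hand, a tool can start from any object, walk to its neighbors, and show the failing Pod alongside the Deployment and Service it belongs to, which is the kind of context `kubectl describe` on a single object does not give.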
Do you think it might be useful for understanding what you have running in your cluster?
- 07:10 It's a debugging tool as well as an exploration tool - that's why we built the dashboard.
- 07:15 You can click on one object, and if that object is related to another one, you can drill into that one.
- 07:25 What it's built for, is to allow users to discover what they need to discover.
- 07:30 We're helping the whole time by giving you the information that you need in the right context.
- 07:35 If you're looking at pods, we give you pod-specific information.
- 07:45 Kubernetes is an API of APIs.
- 07:50 With Kubernetes, you can create custom APIs through software.
- 08:00 This allows you to extend Kubernetes' capabilities afterwards.
- 08:05 The hard part is the 'unknown unknowns' problem - but because there are standards around creating these, Octant can read the configuration and make guesses as to what they look like.
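One of the standards that makes such guessing possible is the `additionalPrinterColumns` field of a CustomResourceDefinition, which names the columns worth showing and gives a JSONPath for each. The sketch below is an illustration of that idea and not Octant's code: `resolve_path`, `columns_for`, and `summarize` are hypothetical helpers, the sample CRD is made up, and the path resolver only handles simple dotted paths such as `.spec.replicas`.

```python
# Hypothetical CRD, trimmed to the fields this sketch reads.
SAMPLE_CRD = {
    "spec": {
        "versions": [{
            "name": "v1",
            "additionalPrinterColumns": [
                {"name": "Spec", "type": "string", "jsonPath": ".spec.cronSpec"},
                {"name": "Replicas", "type": "integer", "jsonPath": ".spec.replicas"},
            ],
        }],
    },
}

# Hypothetical custom object of the type defined above.
SAMPLE_OBJECT = {
    "apiVersion": "stable.example.com/v1",
    "kind": "CronTab",
    "metadata": {"name": "my-cron"},
    "spec": {"cronSpec": "* * * * */5", "replicas": 3},
}


def resolve_path(obj, json_path):
    """Resolve a simple dotted path like '.spec.replicas'; None if absent.

    Assumption: filters, wildcards, and array indexing are out of scope.
    """
    current = obj
    for part in json_path.strip(".").split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def columns_for(crd, version):
    """Return the printer columns the CRD declares for the given version."""
    for candidate in crd["spec"]["versions"]:
        if candidate["name"] == version:
            return candidate.get("additionalPrinterColumns", [])
    return []


def summarize(crd, custom_object):
    """Build a display row for a custom object without knowing its type."""
    version = custom_object["apiVersion"].split("/")[-1]
    row = {"Name": custom_object["metadata"]["name"]}
    for column in columns_for(crd, version):
        row[column["name"]] = resolve_path(custom_object, column["jsonPath"])
    return row


if __name__ == "__main__":
    print(summarize(SAMPLE_CRD, SAMPLE_OBJECT))
```

The viewer never needs to know what a `CronTab` is; the definition itself says which fields matter.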
Do you see a harmony between tools like Octant, Honeycomb (logging), and LightStep (distributed tracing) etc.?
- 08:35 There is a harmony - we created the Octant plugin system to allow people to integrate with other systems.
- 08:45 VMware has paid to have this application built, but we're looking at building it as not a specific VMware tool.
- 08:55 We don't want to make decisions about what you're using to interact with your cluster.
- 09:05 VMware has a metrics product called Wavefront, but other people might want to use Prometheus or DataDog or something else.
- 09:10 On any screen in Octant, a plugin can insert new content into the page, such as graphs of your data.
I see Charity Majors saying half the battle, when debugging complex distributed systems, is finding where the problem is?
- 09:45 I agree with Charity - one of the hardest things is finding out how to quickly solve problems when they occur.
- 09:55 That's what we optimise for, how to solve it quickly.
- 10:00 I recently presented a two-hour workshop for how to troubleshoot Kubernetes.
- 10:05 I went through four or five scenarios, and the interesting thing was that I was able to turn on Octant, and it instantly told me where the problem was.
- 10:15 That's what we're trying to do: if we solve these current problems, then we can solve harder problems in the future.
- 10:25 You shouldn't have to spend ages trying to solve Kubernetes problems; you should be able to solve your own business's problems.
Before we move on, how do people get involved with Octant, how do they join?
- 10:45 Go to GitHub, and you can look at the readme or file issues, etc.
- 10:55 You can also tweet to @ProjectOctant if you want to talk with developers.
What else do you think is missing in the Kubernetes ecosystem?
- 11:25 Whenever we start talking about Kubernetes in respect to what we're trying to deliver to our customers, unless you're a Kubernetes vendor, we went wrong somewhere.
- 11:30 It's the same with Linux: I've used Linux since the '90s, and in 2019, we run a lot of Linux, but we don't talk about it any more.
- 11:50 How do we move code from our IDEs and editors to wherever they need to run with the least amount of friction?
- 12:05 How do we make CI/CD, security scanning, building containers implicit?
- 12:15 Can we remove a lot of this complexity by understanding what the user is trying to do, and doing it for them?
- 12:20 I understand that people hate YAML, and they want to put templating languages on top of YAML ... what we need to understand is that the YAML is important.
- 12:35 In 1970s software, you weren't writing in a high level language, but writing machine code or assembler.
- 12:50 We should be thinking of YAML as being that assembler.
- 13:00 We have figured out how to have higher level languages, with interpreters and compilers, and even LLVM's IR.
- 13:05 That's where we need to focus on in Kubernetes: we should be able to think in concepts that we want, and have the tools generate the YAML.
- 13:20 If you think that YAML is the endpoint, we've already failed - but to get there, we need to discover what works and what the future solution will look like.
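The "YAML as assembler" analogy can be made concrete with a small sketch: the developer describes an application in a few high-level terms, and a tool emits the low-level manifest. Everything here is an assumption for illustration (the `WebApp` type, its fields, and the image name are hypothetical, and this is not ksonnet or any existing tool). The output is JSON, which `kubectl` accepts in place of YAML, so the sketch needs no third-party YAML library.

```python
import json
from dataclasses import dataclass


@dataclass
class WebApp:
    """The high-level concept: what the developer actually cares about."""
    name: str
    image: str
    port: int = 8080
    replicas: int = 2


def to_deployment(app):
    """Expand the concept into a Deployment manifest (the 'assembler')."""
    labels = {"app": app.name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app.name, "labels": labels},
        "spec": {
            "replicas": app.replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": app.name,
                    "image": app.image,
                    "ports": [{"containerPort": app.port}],
                }]},
            },
        },
    }


def to_service(app):
    """Expand the same concept into a Service that targets the Deployment's Pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": app.name},
        "spec": {
            "selector": {"app": app.name},
            "ports": [{"port": 80, "targetPort": app.port}],
        },
    }


def render(app):
    """Bundle the generated manifests into a single List document."""
    bundle = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [to_deployment(app), to_service(app)],
    }
    return json.dumps(bundle, indent=2)


if __name__ == "__main__":
    # Hypothetical application and image name.
    print(render(WebApp(name="shop", image="registry.example.com/shop:1.0")))
```

Four fields of input expand into dozens of lines of manifest, and details that are easy to get wrong by hand, such as keeping the selector and the Pod labels in sync, are handled by the generator.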
What do you think about serverless?
- 13:55 With serverless, they go from really simple to really complex.
- 14:00 The first thing you need to understand with serverless is it doesn't remove any of the complexity.
- 14:15 If you're on AWS Lambda, or IBM's OpenWhisk, or Pivotal PFS, or Knative - understanding how these systems are put together is complex.
- 14:30 They make a lot of assumptions and have to put in lots of boilerplate, and they have to do lots of co-ordination to make these things work right.
- 14:40 I think serverless is an interesting solution, but most people are thinking of it wrong.
- 14:45 Serverless for serving your web applications with an API gateway from AWS is neat, and we can do that.
- 15:00 I'm not saying don't do it, but think about these better solutions.
- 15:10 Imagine batch, but at a scale which is infinite from your point of view.
- 15:15 Capital One is a bank with a huge amount of data in Amazon S3, and because it's a bank, that data has to be stored encrypted.
- 15:30 Humans are fallible, and sometimes when we put data in S3, it's not encrypted.
- 15:40 Capital One built Cloud Custodian, which looks at S3 and configurations in general and tries to clean up messes.
- 15:50 This is a great thing for serverless because we don't need to allocate compute infrastructure to use this at any scale.
- 16:00 Serverless is also great if you have streaming data, where one piece of data splits up into other pieces which are serviced in the future.
- 16:30 When we think about applications and serverless, think about what else you're giving up when you do this.
- 16:35 You're giving up all these other things that allow you to build interesting applications because you need to fit in this small stateless space.
- 16:50 Maybe serverless on the serving side will grow into something, but I don't understand right now if we've solved the problem at a level that we can say it's a great idea.
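The event-driven batch pattern described above can be sketched as a small function that reacts to "object created" notifications and reports any object stored without server-side encryption. This is an illustration of the pattern only and does not show how Cloud Custodian works. The event layout follows the S3 notification format (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys); `unencrypted_objects` and the injected `head_object` callable are hypothetical names, and in a real function that callable might wrap the S3 client's `head_object` call, whose response carries a `ServerSideEncryption` entry when the object is encrypted.

```python
from urllib.parse import unquote_plus

# Hypothetical event, trimmed to the fields this sketch reads.
SAMPLE_EVENT = {
    "Records": [
        {"s3": {"bucket": {"name": "reports"},
                "object": {"key": "2019/plain+file.csv"}}},
        {"s3": {"bucket": {"name": "reports"},
                "object": {"key": "2019/safe.csv"}}},
    ],
}


def unencrypted_objects(event, head_object):
    """Return (bucket, key) for each newly created object lacking encryption.

    `head_object(bucket, key)` is an injected lookup (an assumption of this
    sketch) that returns the object's metadata as a dict.
    """
    findings = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        metadata = head_object(bucket, key)
        if "ServerSideEncryption" not in metadata:
            findings.append((bucket, key))
    return findings


def fake_head_object(bucket, key):
    """Stand-in for a real metadata lookup so the sketch runs offline."""
    if key.endswith("safe.csv"):
        return {"ServerSideEncryption": "AES256"}
    return {}


if __name__ == "__main__":
    for bucket, key in unencrypted_objects(SAMPLE_EVENT, fake_head_object):
        print(f"unencrypted: s3://{bucket}/{key}")
```

Because the function holds no state and handles one event at a time, the platform can run as many copies as there are incoming events, which is why this kind of workload fits the serverless model well.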
Developing serverless applications is hard; you've got a lot of things in flight.
- 17:30 We'll just build more orchestration systems, so now you have a serverless orchestration system - is that any better than where we are right now?
- 17:40 I'm not telling people to stop thinking about these things; they're hard problems.
- 17:45 Serverless is not a panacea - my friend Kelsey Hightower is talking about it a lot.
- 17:50 What we need to think about is that these things are fun, but we need to understand that you need to run your business, and listening to us talk about these things is not solving your problems.
- 18:10 We need to talk in a better language for people consuming our software and talk in the context of their solutions instead of our technology.
How do you see the CNCF landscape evolving?
- 18:50 I don't have a happy answer - the CNCF landscape is like an eye chart, as there's so much on there.
- 19:00 I have software that I wrote which is on there, which is pretty cool.
- 19:05 That chart itself doesn't really tell you what you would need to do, wherever you are on your cloud journey.
- 19:15 The CNCF is thinking about this and trying to re-organize it.
- 19:20 One of the things that they have done is to create special interest groups around specific topics.
- 19:25 I chair one called SIG App Delivery, where we're thinking about what the definition of a cloud native application is.
- 19:35 There are multiple components, and we have to figure out all of the pieces of the cloud native application delivery.
- 19:45 How are we packaging applications together, how are we building containers, how are we orchestrating these things into production, how we are ensuring they still work ...
- 20:05 The CNCF is a necessary thing, but we're trying to figure it out as a community and ecosystem, rather than a specific company going at it alone.
- 20:30 We should be competing on our solutions, not on our tech to get to the solutions.
- 20:40 CNCF is great because it gets vendors, users and developers together - hopefully, if we get them together, we can build better solutions.
What do you think is the best route into helping out with CNCF?
- 21:10 I don't like the 'go read the docs, then create a pull request' approach - that's an easy way to tell you to kick rocks.
- 21:15 What I would rather do, is to tell them to get involved by understanding the destination rather than the journey.
- 21:35 We have security and storage in CNCF, but we also have Kubernetes Product Management SIG. Join a SIG relevant to you, listen to the output, read the mailing lists.
- 21:50 What they'll do, if they're tracking features, is create a Kubernetes Enhancement Proposal (KEP).
- 22:00 A KEP writes out a problem, and how someone wants to solve it - see if you can help out with that.
- 22:10 There are ways other than code that you can help out: tasks that need to happen around Kubernetes itself, such as shadowing the release people.
- 22:30 There's a SIG-PM where they're trying to build product/project managers, but there's a lot of people process in Kubernetes.
- 22:45 There are people who make sure information gets disseminated, for example.
- 23:00 If you're a developer, and you want to start developing in Kubernetes, go and find a SIG and join in.
- 23:05 It takes software a long time to get into production in Kubernetes; up to a year, for example, so don't get too sad if it takes time.
Can we learn anything from the Ruby on Rails era in relation to Kubernetes?
- 23:55 I said a while ago that "Kubernetes needs a Ruby on Rails moment".
- 24:05 When David Heinemeier Hansson came up with a video on creating a blog in 15 minutes, people started looking at web development in a different way.
- 24:15 The whole idea was convention over configuration: the defaults get you closer to success without any configuration.
- 24:35 Kubernetes needs this moment as well, but what is that moment?
- 24:45 When we're building tools, we should think about convention over configuration - it should be easy to stand up Kubernetes on your desktop.
- 24:55 It should be easy to get your application inside a Kubernetes namespace - it shouldn't matter if it's Java or not.
- 25:15 My keynote at KubeCon San Diego will be looking at Kubernetes in 20 minutes.
- 25:30 Even if we can't define it now, we need to understand that is what we are looking for, rather than another new tool that introduces brand new ideas.
- 25:45 It's nice that we're understanding what doesn't work, but it's not great for everyone when we're not sure what tools to write.
I know you are a keen reader. Can you recommend any books?
- 26:10 I haven't read a good tech book lately; some decent ones, but not ones that I'd recommend - just find some that interest you.
- 26:25 I'd recommend "Team of Teams" by General Stanley McChrystal, which is about how teams of teams work together to solve a bigger mission.
- 26:45 The story is told in the context of the US in Iraq, and how they were being beaten for a while despite military might.
- 27:00 It goes through how to build teams together and what they did.
- 27:05 They have an example in chapter 11 about team leadership.
- 27:15 A lot of team leaders think about teams in terms of chess.
- 27:30 You're moving pieces in a specific way to capture the king.
- 27:40 When you have people involved, they aren't chess pieces, so you need a new analogy.
- 27:50 If you have a garden, you can pick the best seeds, and you can till the soil and use the right amount of fertilizer.
- 28:00 You don't grow plants, they grow on their own in the best situations.
- 28:10 As a team leader, you want to create the best situation for your team members to grow.
- 28:20 When you approach leadership like that, you'll get the best results instead of thinking about it as a technical lead.
- 28:30 Another book is "Our Mathematical Universe" [by Max Tegmark], which is a story about how our universe is built on mathematics.
- 28:50 It's nice hearing how people are exploring our universe with mathematics.