Cloud OnAir: 3 reasons why you should run your enterprise workloads on GKE

WESTON HUTCHINS: Welcome to “Cloud OnAir,” live webinars from Google Cloud. We are hosting webinars every Tuesday. My name is Weston Hutchins. I’m a product manager on GKE.

INES ENVID: And my name is Ines Envid. I’m a product manager in GKE as well, focusing on networking.

WESTON HUTCHINS: And today, we’re talking about three reasons why you should run your enterprise workloads on Google Kubernetes Engine. You can ask questions at any time on the platform, and we have Googlers on standby to answer them. With that, let’s get started.

It’s no secret that containers are generating more and more buzz every day. And it’s pretty much a requirement for enterprises today to have containers as part of their long-term IT roadmap and strategy. Gartner estimates that by 2020, 50% of global organizations will be running containerized workloads in production across a wide variety of industries, including finance, media, manufacturing, and more. So as part of any strong container management platform, you need to make sure that Kubernetes is in the conversation for orchestrating those workloads as you start to move from virtual machines over to a container-based infrastructure.

Now, for those that aren’t familiar with Kubernetes, this isn’t going to be a deep dive or anything like that. But I did want to at least talk a little bit about some of the benefits that Kubernetes gives you out of the box. Kubernetes is an open, portable, container-centric management platform. And a lot of what Kubernetes does is build a platform to help you extend and manage your containerized applications on the underlying infrastructure that you have.

For me, Kubernetes is really two key components. The first one is that it’s an infrastructure abstraction layer. When you go and create a Kubernetes cluster, Kubernetes hides a lot of the underlying infrastructure, things like networking, storage, and compute. You don’t really have to think about the nuances of that from platform to platform. Kubernetes
gives you a standardized infrastructure layer to program your applications against.

The second piece is that it’s a set of declarative APIs for managing application-centric workloads. As you start to roll out applications, there are a number of things that you need to handle, including making your application available, health checking, and making sure that you’re load balancing between different services in your cluster. One of the nice things about Kubernetes is that it has built-in primitives to handle a lot of those application management features for you automatically.

Kubernetes can be a little complex, especially when you’re getting started. But if you think about what it normally takes to run a production-grade application, there are a ton of things that Kubernetes gives you out of the box just to make your life a lot simpler. To cherry-pick a few of these, look at scheduling: across the fleet of instances that I have set up around the world, how do I decide where to take containers and pods and schedule those across the different nodes? How do I ensure that my applications are always up-to-date and healthy, even though I know that failures can occur?
Kubernetes allows you to go and restart those pods and restart those applications when things don’t go quite right. It handles things like load balancing and scaling. Kubernetes has built-in primitives for making sure that your cluster is always able to handle the load that hits it at peak hours, and then scale back down during off-peak hours to save you money when you’re running your cluster at scale. It has built-in primitives for storage: volumes that work across a number of different cloud providers and give you a nice, portable abstraction layer to say, I want disk storage, and have that work well across GCP, AWS, and even on-prem systems. And then it includes a lot of built-in tools and services on top of everything else, things like logging and monitoring. There are a ton of extensions and pluggable frameworks for making your cluster observable and making your apps observable within those clusters. So while Kubernetes can feel a bit daunting, I think as people start to get ramped up on the platform, they’ll see that a lot of these features mean that you don’t have to think about these things yourself when you’re rolling out your containers.

What we’ve actually seen is quite a substantial increase in the usage of Kubernetes. The rapid adoption has been dramatic. I went to KubeCon here in Seattle just a couple of years ago, and there’s been a 10x increase between that and the latest KubeCon, which was just hosted in Copenhagen. And recent surveys suggest that over 54% of Fortune 100 companies are using Kubernetes in production today. So there’s really no better time to get started experimenting with containers and using Kubernetes as your orchestration platform of choice.

But by far, the easiest way to get started with Kubernetes is to use Google’s fully managed Kubernetes solution,
Kubernetes Engine. One of the biggest benefits of using Kubernetes Engine is that you can start your cluster with a single click. And once you have that cluster provisioned, you can view all of the workloads in the Google Cloud console from this beautiful UI screen. You can see things like what applications are running across my cluster, what services are there, and how do I set up service discovery and load balancing across the different services within the cluster, and even start to tweak things like the configuration of secrets and config maps that are in your clusters, all from the Google Cloud Platform console.

But by far the best part about GKE, and the one that we hear repeatedly from customers, is that Google manages the cluster for you so you don’t have to manage it yourself. There are a lot of day-two operations that are just challenging, things like making sure that etcd is scaled, making sure that etcd is backed up, and having high-availability components as part of your control plane that can resist outages. There’s a lot of domain knowledge that goes into managing a Kubernetes cluster, and Google Kubernetes Engine handles a lot of that hard work for you.

In this webinar, we’re going to primarily be focusing on a few of the attributes and considerations that you need to think about when you’re starting to move from a VM world over to something built on containers and Kubernetes. There are a lot of fundamental questions that you need to start thinking through, such as, how does Kubernetes fit into the operational model that I already have? I have people that are used to doing approval flows and provisioning machines. Can Kubernetes fit into that system very well?
There’s a bunch of other things that you need to start thinking about, like scalability, availability, and security. For this webinar in particular, we’re just going to be focusing on three reasons why you should start thinking about Kubernetes, and especially about using Google Kubernetes Engine. The three key areas we’re going to be looking at are reliability, availability, and how Kubernetes and GKE fit into your company’s operational model.

On reliability, Google Kubernetes Engine is already trusted by many enterprises today. And the reason is that we’ve been doing containers at scale for more than a decade now. Internally, containers have been run on Borg, our internal container orchestration system, all the way since 2006. Then we started to see containers really pick up mind share in the developer community. Docker was announced in 2013 and started to get a lot of traction. This is when we started thinking about how we could help accelerate the container-based application environment by offering primitives that helped you run applications built on containers. And this is really where we took the lessons we’d learned from our internal system, Borg, and started applying them to Kubernetes. Around 2015 is when the 1.0 GA of Kubernetes launched. And shortly after that, we launched the general availability of our fully managed offering, Kubernetes Engine.

So the key message here is that we’ve not only been doing containers for a long time, we’ve been doing Kubernetes for a long time. Ever since Kubernetes hit GA, we’ve had a fully managed solution on top of Google Cloud Platform. And we’ve learned a ton by managing and operating this over that period. There’s a ton of domain knowledge that our SRE team and our dev team have about what it takes to run Kubernetes at scale. The nice features that GKE gives you out of the box, in terms of reliability, come down to this: you can trust Google to run
your cluster for you. We do a lot of things where we take the control plane and the complexities of running that, and hand that over to our SRE team and a lot of the automation we’ve built to run Kubernetes here on GCP. Some of the features we give you are a managed control plane, where you can turn on features like auto-repair and auto-upgrades, which I’ll talk about in a minute. We do a ton of testing on Kubernetes versions and node operating systems to ensure compatibility. And we have a long list of configuration matrices that we’re always testing to make sure that the new versions of Kubernetes we roll out will not break your existing cluster.

One of the really key features of using something like GKE is the deep integration that we have with the [INAUDIBLE] Google Cloud Platform. We have an amazing [INAUDIBLE] system that allows you to boot your cluster faster than any of the other cloud providers. We have deep integration with our networking and VPC stack, which Ines will be talking about shortly. And then you can get features like CI/CD, Container Builder, Container Registry, and logging and monitoring tools in Stackdriver, to just give you that overall confidence in your operational ability to consistently deploy applications to your cluster and then monitor those production apps as they’re out there in your GKE cluster.

But by far, and I mentioned this before, the most important thing is that we
have a set of SREs that are always monitoring your cluster and making changes. Some of the actions that they perform are keeping etcd backed up and restoring it in case of a disaster, monitoring for global outages and making sure that we can respond to those quickly, and then alerting you when there are problems in the cluster that you may be hitting, like scalability limits of Kubernetes, such as too many namespaces or too many services that you’ve created. And all this comes out of the box just by using GKE’s platform.

So now, I want to talk about availability. We recently launched a number of new features around availability. These features are regional clusters, node auto-repair, and regional persistent disks, which just went to beta.

The first feature that I want to cover is regional clusters. We announced the GA of this last week. Regional clusters is an improvement on the zonal clusters that we were offering for GKE. What it does is spread your control plane across three zones in a region. And it also spreads your nodes across three zones in a region to give your app increased availability as well. All of this communication happens through a highly tolerant load balancer system so that even if one of these zones were to go down, you could always shift traffic over to another zone. So the biggest benefit of having this high-availability multi-master setup is that you get an increased SLO, an increased uptime as part of GKE, where you get three and a half nines.
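To put "three and a half nines" in concrete terms, here is a minimal Python sketch that converts an availability target into a downtime budget. It assumes the conventional reading of three and a half nines as 99.95%. The 30-day period, the function name, and the comparison targets are illustrative choices and do not come from the talk.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 30.0) -> float:
    """Return the downtime budget, in minutes, for an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be a percentage between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Assumed targets for comparison; only "three and a half nines" is from the talk.
    targets = [
        ("two nines", 99.0),
        ("three nines", 99.9),
        ("three and a half nines", 99.95),
    ]
    for label, target in targets:
        budget = allowed_downtime_minutes(target)
        print(f"{label} ({target}%): {budget:.1f} minutes per 30 days")
```

Under these assumptions, 99.95% over 30 days leaves a budget of roughly 21.6 minutes, about half of what a 99.9% target allows.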
But by far, the feature that we heard most customers really wanted was this ability to do zero-downtime control plane upgrades. As we start rolling out new versions of Kubernetes, we will actually do a one-by-one upgrade of your regional control plane and upgrade one zone at a time so that your control plane is always able to serve traffic for your applications.

As a quick aside, I do want to talk a little bit about upgrade strategies. These are just general best-practice tips when you’re thinking about, how do I keep my app updated and highly available during an upgrade? I think one of the first tips is using something like regional clusters to make sure you have zero downtime, but let’s cover a few other general best practices here.

Just a brief primer on the architecture: the way Google Kubernetes Engine is set up is that you have a control plane, which is a set of master nodes that are managed by Google. And then you have your nodes, which are controlled by you in your own project and run your workloads. From the control plane aspect, Google manages the upgrades for you. We will automatically upgrade the control plane as new versions of Kubernetes become available. Now, the thing to note is that we don’t always upgrade every cluster to the latest version of Kubernetes. We typically pick a stable version that’s usually one or two releases behind. However, you as the customer always have the option of trying the latest and greatest by clicking the upgrade available link inside the Google Cloud console or running the upgrade from a gcloud command. The most important piece about this is that you have control over when you want to upgrade to the newest versions, but we will always make sure that security patches and other fixes are applied to your cluster for you automatically. And as I mentioned before, if you really want increased availability, use regional clusters for zero downtime during upgrades, as it’s a much better model for
maintaining the uptime of your cluster.

Past the control plane, you need to think about how you’re going to keep your nodes upgraded, and specifically how to make sure that they’re available while you’re serving traffic. There are a couple of features that I wanted to mention. First off, you can choose to turn on auto-upgrade for your cluster. This is what I particularly recommend for customers that want to make sure that their cluster is always kept up-to-date with the latest security patches. Or if you want more control, you can choose to turn off that feature and manually upgrade your cluster as newer versions are supported. One feature that you can use to control when upgrades happen is called maintenance windows. This feature allows you to set a schedule with a four-hour window in which you say upgrades are allowed to happen. That just gives you a little more confidence that your cluster will always be able to serve peak traffic, while we keep things upgraded when less load is hitting the cluster.

In terms of upgrade strategy, there are two common patterns that I’ve seen for keeping nodes up-to-date. The first is just what GKE gives you out of the box, which is a rolling update. If you go and click the button shown in the bottom right-hand corner here, saying that node upgrades are available, it’ll walk you through a wizard where we will upgrade your cluster one node at a time in a rolling upgrade fashion. This is nice just because it’s simple and easy, but it does have one big downside in that you need to make sure that your cluster can operate with one fewer node during that upgrade process. We will constantly be taking a node down, draining it,
and then adding it back with a newer version of Kubernetes. One way to handle this is just to provision extra capacity during your upgrade process. And we hope to make this more automatic and easier in the future.

The second common strategy that I see is migration with node pools. One of the nice things about node pools is that they allow you to create a group of nodes on a newer version of Kubernetes and do a form of traffic shifting over to that new version. What you can do is create a second node pool as part of the same cluster on a newer version and then, one by one, use kubectl to go and cordon the old nodes and then drain them, which will then start to reschedule pods onto the newer version of Kubernetes. This gives you a little more control over the upgrade process and also gives you the ability to roll back in case things start going wrong. So this is a really nice process that you can use to upgrade your clusters to the newest versions.

One other feature that we just announced GA for is auto-repair. Now, the nodes in your Kubernetes cluster can go down for a number of reasons. There can be kernel bugs. They can be unreachable. You can have out-of-memory or out-of-storage conditions. What auto-repair does is constantly monitor and health check your nodes. If it detects a problem, such as a node not responding, it will go and re-create those nodes and keep that cluster healthy. What we’ve found from customers is that this is usually a feature that they love to just have on in the background, constantly keeping those clusters up as problems are introduced.

The other new feature that we just announced is regional persistent disk support for Kubernetes Engine. Now, doing distributed storage replication is typically really challenging. What regional persistent disks give you the ability to do is automatically replicate between two zones in a
region. So rather than having to think about how to shard your application such that it’s automatically replicating across different zones, you can just replicate at the storage layer and have Google handle all of that for you automatically. This is a huge time saver and complexity reducer for those that want the increased availability of having storage replicated across multiple zones in a region.

And we’ve really seen a lot of customers come to depend and rely on the high uptime and availability of GKE, especially around events like Black Friday, where we can see a massive increase in traffic. We have gone through this for many, many years and are able to scale up our clusters in order to meet customer demand. And with that, I will hand it over to Ines.

INES ENVID: Thanks so much, Weston. I want, then, to focus on the operational model for your Kubernetes deployment. The reason why I’ll be focusing on networking, and why this is important, is that enterprises run their Kubernetes deployments as part of the rest of their deployments. We need to make sure that the models we build to fit into the VPC and to scale the current deployments also work for the Kubernetes deployments. So I’m going to be focusing on a couple of very important foundational concepts that we have released and that are going to allow us to really provide you with the tools and the models that you’ll need for your deployment.

I want to back up a little bit and talk about the Google VPC network. The reason this is significant is that it really provides very significant advantages over some other networking models. At Google, we already have significant expertise in building global networks, and we have done so for our own business areas. As we built our public cloud model, we really leveraged that global networking model, exposing through the VPC the possibility for you as a user to build your global piece
of Google network for your enterprise. The reason why that is significant is that it significantly reduces the operational burden of having to [INAUDIBLE] regional connectivity domains, by having that offered to you by default. It also removes the need to connect from your data center, say, if you have on-prem systems, through separate connections, as the Google Cloud network will offer you a single entry point for all your deployments globally. And it also gives you a centralized model for your security and policy constructs. Now, when we’re applying that to Kubernetes, we are really
making sure that everything that we have built for any type of deployment works natively on Kubernetes.

With that, I’m going to introduce the first key construct that you will need for your Kubernetes deployment. It is what we call native VPC Kubernetes Engine clusters with Alias IP. The reason why this is significant is that when we are working with container deployments, the granularity at which we are providing networking functions is no longer the granularity of a VM instance. You need to provide the networking functions at the granularity of IPs, which represent pods, and which in conjunction represent services that get exposed to a user. Now, when we are monitoring the networking functions for those pods and services, things like IP allocation are done through the native VPC GKE clusters. They’re done natively as part of our GCP model. You can allocate, as part of your VPC subnet, a primary range that serves as the IP allocation range from which you are allocating IPs for pods. Then you can have a secondary range that you use for allocating your services’ IPs. You can keep those two separated. But the nice thing is that by doing the allocation as part of your VPC model, you don’t need to have two separate IP allocation systems, one for your pods and your services as part of your GKE deployments, another for your VMs as the underlying infrastructure.

Additionally, the ability to program these pods directly as part of our VPC infrastructure makes sure that we are addressing those pods and those services natively, programmed in our networking stack. That allows us to scale better. It provides us with advantages like security, because we’re able to understand which pods are allocated in which VM instances. And therefore, we’re able to implicitly apply anti-spoofing, making sure that the IPs from the pods are really the ones that have been configured by you as a user. And in general, it really gives us the foundational block to provide every
single networking service, whether we’re talking about load balancing functions or about routing and exposing those routes to the internet. Everything, as part of the GKE clusters, pods, and services, is treated the same as we would treat our VM instance deployments.

Let’s now talk about the operational needs for your Kubernetes Engine. We have seen the following paradigm when deploying Kubernetes Engine clusters. On one side, there is usually network and security administration that needs to have some control, if only to understand what the rules of connectivity are, what the firewalls are, and what the security policies are by which those deployments are governed. So there is a need for a centralized view and a centralized admin for certain very sensitive security policies and routing policies.

On the other hand, you have the paradigm in which DevOps teams need to be agile enough that they have the autonomy to deploy and operate their applications without having a very tight dependency on the infrastructure and those administrative policies. So we see requirements for those DevOps teams to have separate billing, separate quotas, and separate IAM control so they are able to operate their apps without stepping on each other’s toes.

The last is that when you’re deploying your GKE apps, you want to make sure that they are able to communicate privately. The reason why this is important is that when you have a microservices environment and you’re breaking your [INAUDIBLE], it’s very important that those apps are still able to communicate without having to be exposed to the internet. And these requirements are very important when you have CI/CD environments in which you want to promote certain apps from development to test, to staging and production.

So with that, I want to back up a little bit, because the way these problems have been resolved
has been suboptimal. And that is where Google Cloud is stepping up to provide a solution that really meets those requirements and is suitable for your operational needs.

So why have those deployments been suboptimal? For example, one very traditional way of deploying GKE clusters would be to have each one in your own account, in your own project, so that you have your controls for billing, for quota, and for identity and access management policies restricted to that cluster. That gives you the separation: each cluster and each set of developers is going to be separated by these types of controls. And then they communicate through the VPC. But if you are really deploying application A and application B as part of the same cluster, you are losing control over some of those policies. You get the communication through the VPC network, like in this picture, but then you are losing the ability for those developers to have their own quota and their own billing.

The other option is that you separate them fully. You have application A, which is restricted to a cluster, to a single account. Then you have application B, which is restricted to another cluster, to another account. But now the VPCs that you connect those clusters to, which provide you your [INAUDIBLE] connectivity domain, are separated. Those applications are not really able to communicate easily, and you create silos. So you are trading off the communication that you had in the previous picture for the ability to control each of the applications that you gain in this picture. In neither of them do you have the optimal setup, one that separates policies and control for those clusters but keeps private connectivity without creating silos.

That is where the model for the shared VPC comes into place. If you look at the deployment that we have here, what it allows you to do is really have your clusters deployed separately in different projects that will have
separate billing accounts for those applications, separate quotas, and separate identity and access management. Those policies are still completely separated for application A and application B, but there is still a single VPC network that is able to connect both applications. And not only is it able to connect both applications privately, but it’s also able to provide a common space in which there can be a set of routes and a set of firewalls that are globally administered by a security and networking admin, who can then gain control over how these applications are communicating. It provides you with a single place in which you can understand what the communication rules and the security policies are for those applications, without the DevOps teams having to care about what that connectivity or those rules are.

So to recap, when you are deploying your applications as part of your GKE deployment, it is very important that your developers are able to autonomously develop, test, deploy, and operate their applications without stepping on each other’s toes. Now, the developers are really good at doing their job. They’re really good at developing their applications, testing them, and making sure that they perform. They should not be responsible for setting up the underlying networking connectivity to connect to the others. They should not have to be responsible for working out, how do I now make my connectivity available to this other application that I need to communicate with? That is where the shared VPC comes into place. It gives security and networking admins not only a place of control that is very similar to the traditional enterprise models, but also a way to enable your developers by setting up those communication and security policy rules for them, which makes their life really easy. And just to recap, we have seen tremendous traction in the adoption of these enterprise models.
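The trade-off between the three layouts Ines walks through can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: billing, quota, and IAM are treated as scoped to a project, and private connectivity is treated as sharing a VPC network. The class, function, project, and network names are all hypothetical and are not part of any Google Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppDeployment:
    """Hypothetical record of where one application's cluster lives."""
    app: str
    project: str  # assumed scope for billing, quota, and IAM
    network: str  # VPC network the cluster is attached to


def separately_administered(a: AppDeployment, b: AppDeployment) -> bool:
    # Different projects give each team its own billing, quota, and IAM.
    return a.project != b.project


def privately_connected(a: AppDeployment, b: AppDeployment) -> bool:
    # Deployments on the same VPC network can reach each other privately.
    return a.network == b.network


# The three layouts described in the talk, with made-up identifiers.
LAYOUTS = {
    "one project, one VPC": (
        AppDeployment("A", "proj-1", "vpc-1"),
        AppDeployment("B", "proj-1", "vpc-1"),
    ),
    "two projects, two VPCs": (
        AppDeployment("A", "proj-a", "vpc-a"),
        AppDeployment("B", "proj-b", "vpc-b"),
    ),
    "two projects, shared VPC": (
        AppDeployment("A", "proj-a", "shared-vpc"),
        AppDeployment("B", "proj-b", "shared-vpc"),
    ),
}


def evaluate(layouts):
    """Map each layout name to (separately administered, privately connected)."""
    return {
        name: (separately_administered(a, b), privately_connected(a, b))
        for name, (a, b) in layouts.items()
    }


if __name__ == "__main__":
    for name, (admin, private) in evaluate(LAYOUTS).items():
        print(f"{name}: separate admin={admin}, private connectivity={private}")
```

In this simplified model, only the shared VPC layout satisfies both properties at once, which is the point made in the recap above.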

We’ve got a lot of feedback from lots of enterprises for which the shared VPC model is really the choice for their large enterprise deployments. It’s starting to achieve an operational model that doesn’t come with the trade-offs that we have seen in the past. And we have also seen customers really acknowledging the fact that we are providing networking capabilities, when it comes to performance and scale, to Kubernetes deployments natively, the same way that we are doing for the rest of the VM deployments or any other type of deployment on our platform. So we really are going to a model in which Kubernetes and container deployments are first-class citizens for everything on networking, which will allow our customers to get the advantages they want from the performance, the global scale, and the global reachability that the Google network provides. With that, I will wrap up here. Stay tuned for live Q&A, because we are going to be back in less than a minute. Thank you.

WESTON HUTCHINS: All right, we’re back. And we’ll take some questions that we had during the session. The first question is, what are the differences between multi-zone and regional clusters?
So before we launched regional clusters, we had this concept called multi-zone clusters. What it allowed you to do is take a zonal control plane that ran in a single zone and then create node pools in different zones. So if you had a cluster that you wanted to create in us-central1-a, you could then add a node pool that was in us-central1-b or us-central1-c or wherever you wanted to go and create those. With regional clusters, we give you that functionality automatically out of the box. When you go and create a cluster, it automatically spreads your nodes across the different zones for you. The other major difference is that we replicate the control plane. We spread the masters across three different zones in a region and we replicate etcd between those. So multi-zone was kind of the old way of doing it for zonal clusters. And it’s still really good for test clusters or development clusters where you just want to spin one up and add some more capacity to try out service discovery across different zones. Regional clusters is really what you want to be using for the increased availability and zero downtime that you get for running production-grade applications.

All right, so the next question is, do I have to spread my nodes across three zones in a region?
And I think this is probably specifically talking about regional clusters. I mentioned that we automatically create node pools in three zones in a particular region, but I do want to point out that that’s not a requirement. You can control both where the nodes are created and also how many nodes are created in each of those zones. For example, there’s a node locations flag that you can pass in to create a regional cluster whose nodes are actually in a single zone. You might want to do this in order to save on egress costs between different zones if you’re not really sure about how your cluster services are communicating with one another, or if you just want to keep everything local to a zone but have the additional availability of a regional control plane. So it’s really easy to mix and match those two just by specifying the node locations flag. And then if you want to, you can even
specify the number of nodes flag, which allows you to say whether you want 1, 2, 3, or n nodes created in each particular zone.

INES ENVID: All right, so I’ll take the next question. When should I use shared VPC for my GKE deployments? Typically, you want to use shared VPC whenever you have administrative separations for which you want to provide autonomy. That could be when you have, let’s say, a department A and a department B that are running different applications, or even down to the granularity of a specific application developer, so that you have application A and application B separated in their own projects. You can partition your deployments to the granularity that suits you. The shared VPC is going to allow you to connect those applications that you are separating for administrative reasons. We see different use cases, from having two or three big departments, or maybe environments for test, staging, and production, that are separated and communicate through a shared VPC, down to the granularity of different developers managing different applications that are separated and connected through a shared VPC. So every time you want to provide network communication that is private and that is provided to your developers as a service that they have available to them, where they just need to connect, and you want to keep them administratively isolated, with their own billing, quota, and IAM so that they are owners of their own resources, that’s where your shared VPC model comes into place.

I’ll take the next question as well. If we expose services inside the VPC, can they just stay internal to the enterprise? Absolutely. This is actually one of the most prevalent models in enterprise deployments. Through the configuration of the Alias
IP that we just talked about, you configure your infrastructure VMs, and you configure your pods and your services with private IPs, all as part of the same VPC subnet. Then you decide which services are going to be exposed. Now, the fact that you’re exposing those services doesn’t necessarily mean that you are exposing them publicly. You can expose them within your VPC. And because you now have a shared VPC, you really have a model in which those services communicate privately within your enterprise across your different cluster deployments.

And the last one, why are native VPC clusters more secure? One of the big things that native VPC clusters enable is the fact that we, as a networking stack, understand where the pods and each of their services are really located as part of our infrastructure. What that means is that we are able to understand which applications are hosted on a specific node, on a specific host of our infrastructure. In doing so, we are then able to apply security checks, like anti-spoofing. When we’re allowing traffic to a specific node, we are allowing only traffic for applications that have been configured as part of that node. So if I have, let’s say, pod one with an IP, then I’m going to check that the traffic destined to that node is destined to a pod IP that it’s hosting there. And if not, I’m not going to allow that traffic to come in. So we really natively ensure that the traffic is legitimate when it’s communicating through our infrastructure, because we understand where applications are hosted.

WESTON HUTCHINS: If you have any other questions, we’ll be online afterwards to help answer them. You can use the platform, type in your questions, and we’ll jump on with anything else that you have. Now, stay tuned for the next session, coming live from Sunnyvale, “Why APIs are products, not one-off projects.” Thank you very much.

[MUSIC PLAYING]
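The check described in the final answer, admitting traffic only when it targets a pod IP hosted on the receiving node, can be sketched conceptually in Python with the standard `ipaddress` module. This is a toy model, not how Google's networking stack is implemented. The `Node` class, the function names, and the 10.4.x.x ranges are hypothetical, and it assumes each node is assigned one contiguous pod IP range.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical model of a VM instance and the pod IP range assigned to it."""
    name: str
    pod_range: ipaddress.IPv4Network


def traffic_allowed(node: Node, destination_ip: str) -> bool:
    """Admit traffic only if it targets a pod IP hosted on this node."""
    return ipaddress.ip_address(destination_ip) in node.pod_range


def source_is_legitimate(node: Node, source_ip: str) -> bool:
    """Anti-spoofing: traffic leaving the node must carry one of its pod IPs."""
    return ipaddress.ip_address(source_ip) in node.pod_range


if __name__ == "__main__":
    # Made-up range for illustration only.
    node_a = Node("node-a", ipaddress.ip_network("10.4.0.0/24"))
    print(traffic_allowed(node_a, "10.4.0.17"))         # True: pod IP on node-a
    print(traffic_allowed(node_a, "10.4.1.17"))         # False: belongs elsewhere
    print(source_is_legitimate(node_a, "192.168.0.5"))  # False: not a pod IP of node-a
```

The design point is that both checks need only the mapping from node to pod range, which is the knowledge the talk says the VPC gains once pod IPs are allocated natively.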