Kubernetes by Google


Kubernetes is a system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications. Its APIs are intended to serve as the foundation for an open ecosystem of tools, automation systems, and higher-level API layers.

Kubernetes uses Docker to package, instantiate, and run containerized applications. While Docker itself works with individual containers, Kubernetes provides higher-level organizational constructs in support of common cluster-level usage patterns. These constructs are currently focused on service applications, but could be extended to batch and test workloads in the future.

It is primarily targeted at applications composed of multiple containers, such as elastic, distributed microservices. It is also designed to facilitate migration of non-containerized application stacks to Kubernetes. It therefore includes abstractions for grouping containers in both loosely coupled and tightly coupled formations, and provides ways for containers to find and communicate with each other in relatively familiar ways.

Kubernetes enables users to ask a cluster to run a set of containers. The system automatically chooses hosts to run those containers on. Kubernetes is intended to run on multiple cloud providers, as well as on physical hosts. A single Kubernetes cluster is not intended to span multiple availability zones. Instead, it is recommended to build a higher-level layer that replicates complete deployments of highly available applications across multiple zones. Kubernetes is not currently suitable for use by multiple users.

Through the open source container management project Kubernetes, Google is working continuously with companies such as IBM, HP, Red Hat, Mesosphere, Microsoft, CoreOS, and VMware to make sure that Kubernetes works well for everyone. Kubernetes can currently run on:

  • Google Compute Engine (GCE)
  • Vagrant Fedora (Ansible)
  • Fedora (Manual)
  • Locally
  • Microsoft Azure
  • Rackspace
  • CoreOS
  • VMware vSphere

Key Concepts

A pod is a relatively tightly coupled group of containers that are scheduled onto the same host. It models an application-specific “virtual host” in a containerized environment. Pods serve as units of scheduling, deployment, and horizontal scaling/replication, share fate, and share some resources, such as storage volumes and IP addresses.

Labels are used to specify identifying metadata, and to convey the semantic purposes/roles of pods of containers. Examples of typical pod label keys include service, environment (e.g., with values dev, qa, or production), tier (e.g., with values frontend or backend), and track (e.g., with values daily or weekly), but you are free to develop your own conventions.
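As a rough illustration of how labels can identify a group of pods, the sketch below picks out pods whose labels contain a given set of key/value pairs. The pod names, the `matches` helper, and the equality-only matching rule are assumptions made for this example; they are not taken from the Kubernetes API.

```python
def matches(labels: dict, selector: dict) -> bool:
    """Return True when every key/value pair in the selector appears in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods, each described only by a name and its labels.
pods = [
    {"name": "web-1", "labels": {"service": "shop", "environment": "production", "tier": "frontend"}},
    {"name": "web-2", "labels": {"service": "shop", "environment": "qa", "tier": "frontend"}},
    {"name": "db-1", "labels": {"service": "shop", "environment": "production", "tier": "backend"}},
]

# Select the production frontend pods by their labels alone.
selector = {"environment": "production", "tier": "frontend"}
selected = [pod["name"] for pod in pods if matches(pod["labels"], selector)]
print(selected)  # ['web-1']
```

Because selection depends only on labels, the same pods can be regrouped along a different dimension (for example, everything with tier backend) without renaming or redeploying them.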

Architecture of the System

The system can be broken down into services that run on the worker nodes and services that make up the cluster-level control plane. The Kubernetes node has the services necessary to run Docker containers and to be managed from the master systems. Besides Docker itself, each node runs a component called the kubelet. The kubelet is the logical successor of the Container Agent that is part of the Compute Engine image.

Each node also runs a simple network proxy. This reflects services as defined in the Kubernetes API on each node and can do simple TCP and UDP stream forwarding (round robin) across a set of backends.
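The round-robin behaviour can be sketched in a few lines. This shows only the backend selection step, not the actual TCP or UDP forwarding; the `RoundRobinProxy` class and the backend addresses are hypothetical and do not reflect how the real proxy is implemented.

```python
import itertools


class RoundRobinProxy:
    """Hands out the backends of one service in strict rotation."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # Copy the list so later changes by the caller do not affect rotation.
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        """Return the backend that should receive the next connection."""
        return next(self._cycle)


proxy = RoundRobinProxy(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
picks = [proxy.next_backend() for _ in range(5)]
print(picks)
# ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.3:8080', '10.0.0.1:8080', '10.0.0.2:8080']
```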

The Kubernetes control plane is split into a set of components, but they all run on a single master node. These work together to provide a unified view of the cluster.

All persistent master state is stored in an instance of etcd. This provides a great way to store configuration data reliably. With watch support, coordinating components can be notified very quickly of changes.
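To make the idea of watch support concrete, here is a minimal in-memory sketch of a key-value store that notifies registered callbacks when a key changes. It is not etcd and does not use etcd's API; the `WatchableStore` class, its method names, and the key paths are all invented for illustration.

```python
from collections import defaultdict


class WatchableStore:
    """Toy key-value store that calls watchers whenever a watched key is set."""

    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)

    def watch(self, key, callback):
        """Register callback(key, old_value, new_value) for changes to key."""
        self._watchers[key].append(callback)

    def set(self, key, value):
        """Store the value, then notify every watcher of that key."""
        old = self._data.get(key)
        self._data[key] = value
        for callback in self._watchers[key]:
            callback(key, old, value)

    def get(self, key):
        return self._data.get(key)


events = []
store = WatchableStore()
store.watch("/pods/web-1", lambda key, old, new: events.append((key, old, new)))

store.set("/pods/web-1", "pending")
store.set("/pods/web-1", "running")
store.set("/pods/db-1", "pending")  # no watcher on this key, so no event

print(events)
# [('/pods/web-1', None, 'pending'), ('/pods/web-1', 'pending', 'running')]
```

The point of the pattern is that a coordinating component does not need to poll: it registers interest once and is told about each change as it happens.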

The API server validates and configures data for three types of objects: pods, services, and replicationControllers.

The scripts and data in the cluster/ directory automate creating a set of Google Compute Engine VMs and installing all of the Kubernetes components. There is a single master node and a set of worker nodes (called minions). config-default.sh has a set of tweakable definitions/parameters for the cluster.

The heavy lifting of configuring the VMs is done by SaltStack.

As there is currently no security built into the apiserver, the salt configuration installs nginx. nginx is configured to serve HTTPS with a self-signed certificate. HTTP basic auth is used from the client to nginx. nginx then forwards the request on to the apiserver over plain HTTP. As part of cluster spin-up, ssh is used to download both the public cert for the server and a client cert pair. These are used for mutual authentication to nginx. All communication within the cluster (worker nodes to the master, for instance) occurs on the internal virtual network and should be safe from eavesdropping.

The password is generated randomly as part of the kube-up.sh script and stored in ~/.kubernetes_auth.
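For readers unfamiliar with HTTP basic auth, the sketch below shows how a client could build the Authorization header that nginx would check. The `basic_auth_header` helper and the example credentials are hypothetical; the sketch does not read ~/.kubernetes_auth, because the layout of that file is not described here.

```python
import base64


def basic_auth_header(user: str, password: str) -> dict:
    """Build an HTTP basic auth header: base64 of 'user:password'."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials; a real client would use the generated password.
headers = basic_auth_header("admin", "example-password")
print(headers["Authorization"].startswith("Basic "))  # True
```

Basic auth only encodes the credentials and does not encrypt them, which is why the client-to-nginx hop is served over HTTPS.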