“Service” is one of the most powerful and, as a result, most complex abstractions in Kubernetes. It is also a heavily overloaded term, which makes it even more confusing for people approaching Kubernetes for the first time. This chapter provides a high-level overview of the different types of Services, their goals, and how they relate to other cluster elements and APIs.
Many of the ideas and concepts in this chapter are based on numerous talks and presentations on this topic. It is difficult to make concrete attributions; however, most of the credit goes to members of the Network Special Interest Group.
A good starting point to understand a Kubernetes Service is to think of it as a distributed load-balancer. Similar to traditional load-balancers, its data model can be reduced to the following two components:
All Services implement the above functionality, but each does so in its own way, designed for its own use case. To understand the various Service types, it helps to view them as a “hierarchy”: starting from the simplest, with each subsequent type building on top of the previous one. The table below is an attempt to explore and explain this hierarchy:
|Type|Description|
|---|---|
|Headless|The simplest form of load-balancing, involving only DNS. Nothing is programmed in the data plane and no load-balancer VIP is assigned; however, a DNS query returns the IPs of all backend Pods. The most typical use case is stateful workloads (e.g. databases), where clients need a stable and predictable DNS name and can handle loss of connectivity and failover on their own.|
|ClusterIP|The most common type. It assigns a unique ClusterIP (VIP) to a set of matching backend Pods. A DNS lookup of the Service name returns the allocated ClusterIP. All ClusterIPs are configured in the data plane of each node as DNAT rules: the destination ClusterIP is translated to one of the PodIPs. These NAT translations always happen on the egress (client-side) node, which means that Node-to-Pod reachability must be provided externally (by a CNI plugin).|
|NodePort|Builds on top of the ClusterIP Service by allocating a unique static port in the root network namespace of each Node and mapping it (via port translation) to the port exposed by the backend Pods. Incoming traffic can hit any cluster Node and, as long as the destination port matches the NodePort, it is forwarded to one of the healthy backend Pods.|
|LoadBalancer|Attracts external user traffic to a Kubernetes cluster. Each LoadBalancer Service instance is assigned a unique, externally routable IP address, which is advertised to the underlying physical network via BGP or gratuitous ARP. This Service type is implemented outside of the main kube controller, either by the underlying cloud as an external L4 load-balancer or by a cluster add-on like MetalLB, Porter or kube-vip.|
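To make the differences between the first three types more concrete, here is a minimal sketch of how they might be declared. All names, labels and port numbers below are hypothetical and chosen only for illustration; the only parts that distinguish the types are `clusterIP: None` for a headless Service and the `type` field (with an optional `nodePort`) for the others.

```yaml
# Headless: no VIP is allocated, DNS returns the backend Pod IPs directly.
apiVersion: v1
kind: Service
metadata:
  name: db-headless          # hypothetical name
spec:
  clusterIP: None            # this is what makes the Service headless
  selector:
    app: db                  # hypothetical label
  ports:
    - name: postgres
      port: 5432
---
# ClusterIP: the default type, a VIP is allocated from the Service CIDR.
apiVersion: v1
kind: Service
metadata:
  name: web-internal         # hypothetical name
spec:
  type: ClusterIP            # may be omitted, ClusterIP is the default
  selector:
    app: web                 # hypothetical label
  ports:
    - name: http
      port: 80               # port exposed on the ClusterIP
      targetPort: 8080       # port the backend Pods listen on
---
# NodePort: same as ClusterIP, plus a static port opened on every Node.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport         # hypothetical name
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - name: http
      port: 80
      targetPort: 8080
      nodePort: 30080        # optional; assumes the default 30000-32767 range
```

If `nodePort` is left out, a free port is allocated automatically from the configured NodePort range.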
One Service type that doesn’t fit with the rest is ExternalName. It instructs the cluster DNS add-on (e.g. CoreDNS) to respond with a CNAME, redirecting all queries for this Service’s domain name to an external FQDN, which can simplify interacting with external services (for more details see the Design Spec).
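As an illustration, an ExternalName Service could look roughly like the sketch below. The Service name and the target FQDN are placeholders, not values taken from this guide.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-api         # hypothetical name
  namespace: default
spec:
  type: ExternalName
  # Queries for the Service's cluster-internal DNS name are answered
  # with a CNAME pointing at this FQDN (placeholder value).
  externalName: api.example.com
```

No selector, ClusterIP or data plane rules are involved here; the redirection happens purely at the DNS level.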
The following diagram illustrates how different Service types can be combined to expose a stateful application:
Although not directly connected, most Services rely on Deployments and StatefulSets to create the required number of Pods with a unique set of labels.
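The sketch below shows one way a Deployment might create Pods that a Service can later select. The name, image and replica count are assumptions made for the example; the important part is that the Pod template carries a label (`app: nginx` here) that a Service selector can match.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx                # hypothetical name
spec:
  replicas: 3                # number of backend Pods to create
  selector:
    matchLabels:
      app: nginx             # must match the Pod template labels below
  template:
    metadata:
      labels:
        app: nginx           # the label a Service selector would match on
    spec:
      containers:
        - name: nginx
          image: nginx       # illustrative image
          ports:
            - containerPort: 80
```

The Deployment and the Service never reference each other by name; the shared label is the only link between them.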
Services have a relatively small and simple API. At the very least they expect the following to be defined:
```yaml
kind: Service
apiVersion: v1
metadata:
  name: service-example
spec:
  ports:
    - name: http
      port: 80
      targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer
```
Some Services may not have any label selectors, in which case the list of backends can still be constructed manually. This is often used to interconnect with services outside of the Kubernetes cluster while still relying on the internal mechanisms of service discovery.
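A minimal sketch of this pattern, assuming a hypothetical external database reachable at a documentation-range IP address: the Service is defined without a selector, and a manually created Endpoints object with the same name supplies the backend list.

```yaml
# Service without a selector: no Endpoints are generated automatically.
apiVersion: v1
kind: Service
metadata:
  name: external-db          # hypothetical name
spec:
  ports:
    - name: postgres
      port: 5432
      targetPort: 5432
---
# Manually maintained backend list for the Service above.
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db          # must match the Service name
subsets:
  - addresses:
      - ip: 192.0.2.10       # placeholder address of the external backend
    ports:
      - name: postgres       # must match the Service port name
        port: 5432
```

Clients inside the cluster can then use the regular Service DNS name, while traffic ends up at the manually listed address.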
A Service’s internal architecture consists of two loosely-coupled components:
- A control plane process running inside the kube-controller-manager binary that reacts to API events and builds an internal representation of each Service instance. This internal representation is a special Endpoints object that gets created for every Service instance and contains a list of healthy backend endpoints (PodIP + port).
- A data plane agent on each node, most commonly kube-proxy, with various competing implementations from third-party Kubernetes networking providers like Cilium, Calico, kube-router and others.
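For reference, the Endpoints object built by the control plane for a Service might look roughly like the following. The Pod IPs are made up for the example; `notReadyAddresses` is where backends that are not currently ready would be listed, so that the data plane can skip them.

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: service-example      # same name as the Service it belongs to
subsets:
  - addresses:               # ready backends, eligible to receive traffic
      - ip: 10.244.1.5       # illustrative Pod IPs
      - ip: 10.244.2.7
    notReadyAddresses:       # backends that are currently not ready
      - ip: 10.244.3.9
    ports:
      - name: http
        port: 80
        protocol: TCP
```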
Another less critical, but nonetheless important, component is DNS. Internally, the DNS add-on is just a Pod running in the cluster that caches Endpoints objects and responds to incoming queries according to the DNS-Based Service Discovery specification, which defines the format for incoming queries and the expected structure for responses.