I haven’t blogged here for over 2 years. It’s not that I had nothing to say; rather, every time I started writing a new post I never pushed myself to finish it, so most of the drafts ended up rotting in my private GitHub gists. Although my interests have expanded way beyond the Linux container space, my professional life has remained tied to it.
Over the past two years I have been quite heavily involved in the Kubernetes (K8s) community. I helped to start the Kubernetes London Meetup as well as Kubecast, a podcast about all things K8s. It’s been amazing to see the community [not only in London] grow so fast in such a short time.
More and more companies are jumping on board to orchestrate their container deployments and address the container Jevons paradox. This is great for the project, but it’s not a free lunch for newcomers. New and shiny things often make newcomers anxious, especially when there are a lot of new concepts to grasp before becoming fully productive. Changing one’s mindset is often the hardest thing to do.
Over the past few months I have noticed one thing in particular. K8s abstracts away a lot of infrastructure through its API, much like other modern platforms such as Cloud Foundry and the like. Hiding things away makes “traditional” Ops teams feel uneasy; the idea of “not caring” what’s going on underneath the K8s roof is unsettling. This is hardly surprising, as most of us polished our professional skills whilst debugging all kinds of crazy OS and hardware issues (hello, Linux on the desktop!), so we naturally tend to dig deep when new tech comes along. It’s good to be prepared when disaster strikes.
Quite a few people have asked me recently, both in person and via Twitter DMs, about what’s going on underneath K8s when an HTTP request arrives in the cluster from external traffic, i.e. traffic originating outside the K8s cluster. People want to know how all the pieces such as services, ingress and DNS work together inside the cluster. If you are one of the curious folks, this post might be for you. We will put service requests through an X-ray!
Cluster Setup
This post assumes we run our own “bare metal” standalone K8s installation. If you don’t know how to get K8s running on your own infrastructure, check out the Kubernetes the hard way guide by Kelsey Hightower, which can easily be translated to your own environment.
We will assume we have both the control plane and 3 worker nodes up and running:
$ kubectl get componentstatuses
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}
$ kubectl get nodes
NAME      STATUS    AGE
worker0   Ready     18m
worker1   Ready     16m
worker2   Ready     16m
We will also assume that we have the DNS and K8s dashboard add-ons set up in the cluster. As for DNS, we will use the off-the-shelf kube-dns. [Remember, the add-ons run as services in the kube-system namespace]:
$ kubectl get svc --namespace=kube-system
NAME                   CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
kube-dns               10.32.0.10    <none>        53/UDP,53/TCP   21m
kubernetes-dashboard   10.32.0.148   <nodes>       80:31711/TCP    1m
No services [and thus no pods] are running in the default K8s namespace except for the kubernetes API service:
$ kubectl get svc
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.32.0.1    <none>        443/TCP   5h
$ kubectl get pods
No resources found.
Cluster state and configuration
A K8s cluster stores all of its internal state in an etcd cluster. The idea is that you should interact with K8s only via its API, which is provided by the API service. The API service abstracts away all the K8s cluster state manipulation by reading from and writing into the etcd cluster. Let’s explore what’s stored in the etcd cluster after a fresh installation:
$ etcdctl --ca-file=/etc/etcd/ca.pem ls
/registry
The /registry key is where all the magic happens in K8s. If you are familiar with K8s at least a bit, listing the contents of this key will reveal a tree structure referencing keys named after familiar K8s concepts:
$ etcdctl --ca-file=/etc/etcd/ca.pem ls registry
/registry/services
/registry/events
/registry/secrets
/registry/minions
/registry/deployments
/registry/clusterroles
/registry/ranges
/registry/namespaces
/registry/replicasets
/registry/pods
/registry/clusterrolebindings
/registry/serviceaccounts
Let’s have a look at what’s hiding underneath the /registry/services key, which is what we are interested in in this blog post. We will list the services key space recursively, sort of like running ls -lR on the command line:
$ etcdctl --ca-file=/etc/etcd/ca.pem ls /registry/services --recursive
/registry/services/specs
/registry/services/specs/default
/registry/services/specs/default/kubernetes
/registry/services/specs/kube-system
/registry/services/specs/kube-system/kubernetes-dashboard
/registry/services/specs/kube-system/kube-dns
/registry/services/endpoints
/registry/services/endpoints/default
/registry/services/endpoints/default/kubernetes
/registry/services/endpoints/kube-system
/registry/services/endpoints/kube-system/kube-scheduler
/registry/services/endpoints/kube-system/kube-controller-manager
/registry/services/endpoints/kube-system/kube-dns
/registry/services/endpoints/kube-system/kubernetes-dashboard
The output of this command uncovers a wealth of information. For starters, we can see the two service namespaces, default and kube-system, under the specs key. We can assume that each particular service’s configuration is stored in the value of the key named after that service.
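For example, you can dump the stored spec of the kubernetes service straight out of etcd (a quick sketch using the same etcdctl flags as before; the value is the raw JSON the API service wrote):
$ etcdctl --ca-file=/etc/etcd/ca.pem get /registry/services/specs/default/kubernetes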
Another important key in the output above is endpoints. I’ve noticed in the community that not a lot of people are familiar with the K8s endpoints API resource. This is because normally you don’t need to interact with it directly, at least not when doing the usual K8s work like deploying and managing apps via kubectl. But you do need to be familiar with it when debugging malfunctioning services or building ingress controllers or custom load balancers.
Endpoints are a crucial concept for K8s services. They represent a list of IP:PORT mappings created automatically (unless you are using headless services) when you create a new K8s service. A K8s service selects a particular set of pods and maps them into endpoints.
In the context of a K8s service, endpoints are basically service traffic routes. A K8s service must keep an eye on its endpoints at all times: it watches its particular endpoints key, which notifies it in case some pods in its list have been terminated or rescheduled on another host (in which case they most likely get a new IP:PORT allocation). The service then routes traffic to the new endpoint instead of the old [dead] one. In other words, K8s services are K8s API watchers.
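If you want to see this watch mechanism in action yourself, you can watch the relevant key directly, either via etcdctl or via the K8s API (a quick sketch; the flags follow the etcdctl v2 syntax used throughout this post):
$ etcdctl --ca-file=/etc/etcd/ca.pem watch /registry/services/endpoints/default/kubernetes
$ kubectl get endpoints kubernetes --watch
Either command will block and print the endpoints object every time it changes.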
In our cluster we only have the kubernetes service running right now, in the default namespace. Let’s check its endpoints using kubectl:
$ kubectl get endpoints kubernetes
NAME         ENDPOINTS                                             AGE
kubernetes   10.240.0.10:6443,10.240.0.11:6443,10.240.0.12:6443   7h
We can find the same list of endpoints by displaying the contents of the particular etcd key:
$ etcdctl --ca-file=/etc/etcd/ca.pem get /registry/services/endpoints/default/kubernetes
{
    "kind": "Endpoints",
    "apiVersion": "v1",
    "metadata": {
        "name": "kubernetes",
        "namespace": "default",
        "selfLink": "/api/v1/namespaces/default/endpoints/kubernetes",
        "uid": "918dc93c-e61c-11e6-8119-42010af00015",
        "creationTimestamp": "2017-01-29T12:15:05Z"
    },
    "subsets": [{
        "addresses": [{
            "ip": "10.240.0.10"
        }, {
            "ip": "10.240.0.11"
        }, {
            "ip": "10.240.0.12"
        }],
        "ports": [{
            "name": "https",
            "port": 6443,
            "protocol": "TCP"
        }]
    }]
}
Now that we have scrutinized K8s services a bit, let’s move on and create our own K8s service and try to route some traffic to it from outside the cluster.
Services, kube-proxy and kube-dns
We will create a simple service which runs two replicas of nginx, and we will scrutinize the request flow within the K8s cluster. The following command creates a K8s deployment of 2 replicas of nginx servers running in separate pods:
$ kubectl run nginx --image=nginx --replicas=2 --port=80
deployment "nginx" created
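As an aside, the kubectl run one-liner above is roughly equivalent to submitting a deployment manifest like this (a sketch, not a dump of the live object; kubectl run labels the pods with run=nginx by default, as the describe output later in this post confirms):
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80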
The command we ran actually created a K8s deployment which bundles a K8s replica set consisting of two pods. We can verify that very easily:
$ kubectl get deployments,rs,pods
NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/nginx   2         2         2            2           17s

NAME                  DESIRED   CURRENT   READY   AGE
rs/nginx-3449338310   2         2         2       17s

NAME                        READY     STATUS    RESTARTS   AGE
po/nginx-3449338310-140d3   1/1       Running   0          17s
po/nginx-3449338310-qc77h   1/1       Running   0          17s
We can also check that the K8s cluster keeps track of the new pods in etcd under the pods key:
$ etcdctl --ca-file=/etc/etcd/ca.pem ls /registry/pods/default --recursive
/registry/pods/default/nginx-3449338310-qc77h
/registry/pods/default/nginx-3449338310-140d3
We could equally check the contents of the /registry/deployments and /registry/replicasets keys, but let’s skip that for now. The next step is to turn the nginx deployment into a service. We will call it nginx-svc and expose it on port 8080 inside the cluster:
$ kubectl expose deployment nginx --port=8080 --target-port=80 --name=nginx-svc
service "nginx-svc" exposed
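Again, for reference, the kubectl expose one-liner is roughly what this Service manifest would give you (a sketch; the run=nginx selector is inherited from the deployment we exposed):
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  selector:
    run: nginx
  ports:
  - port: 8080
    targetPort: 80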
This will, as expected, create a new K8s service which has two endpoints:
$ kubectl get svc,endpoints
NAME             CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
svc/kubernetes   10.32.0.1     <none>        443/TCP    9h
svc/nginx-svc    10.32.0.191   <none>        8080/TCP   26m

NAME            ENDPOINTS                                             AGE
ep/kubernetes   10.240.0.10:6443,10.240.0.11:6443,10.240.0.12:6443   9h
ep/nginx-svc    10.200.1.5:80,10.200.2.4:80                          26m
We could also query etcd and see that the K8s API service has taken care of creating the particular service and endpoints keys and populated them with the correct information about connection mappings.
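If you want to double check, the following sketch shows how you could read both keys back (same etcdctl flags as before):
$ etcdctl --ca-file=/etc/etcd/ca.pem get /registry/services/specs/default/nginx-svc
$ etcdctl --ca-file=/etc/etcd/ca.pem get /registry/services/endpoints/default/nginx-svc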
Now, here comes the first newcomer “gotcha”. When the service is created, it is assigned a Virtual IP (VIP). Many people try to ping this IP and fail miserably, which leads them to do all kinds of debugging until they get frustrated and give up. The service VIP is only really useful in combination with the service port (we will get back to this later in the post); pinging the VIP will get you nowhere. Accessing the service endpoints from any pod in the cluster, however, works perfectly fine, as we will see later on.
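A quick illustration from any pod in the cluster (a hypothetical session; 10.32.0.191 is the VIP our service was assigned above):
root@client:/# ping -c 2 10.32.0.191   # times out - the VIP is virtual, nothing answers ICMP
root@client:/# curl -I 10.32.0.191:8080   # works, as we will see below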
If you don’t specify a type for the service, K8s by default uses the ClusterIP option, which means that the new service is exposed only within the cluster. It’s kind of like an internal K8s service, so it’s not particularly useful if you want to accept external traffic:
$ kubectl describe svc nginx-svc
Name: nginx-svc
Namespace: default
Labels: run=nginx
Selector: run=nginx
Type: ClusterIP
IP: 10.32.0.191
Port: <unset> 8080/TCP
Endpoints: 10.200.1.5:80,10.200.2.4:80
Session Affinity: None
No events.
Our service consists of two nginx servers, so we can try to curl it from a pod that has curl installed, using the service VIP in combination with the exposed service port 8080:
$ kubectl run -i --tty client --image=tutum/curl
root@client-1141742176-c2o9i:/# curl -I 10.32.0.191:8080
HTTP/1.1 200 OK
Server: nginx/1.11.9
Date: Mon, 30 Jan 2017 10:36:19 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Jan 2017 14:02:19 GMT
Connection: keep-alive
ETag: "58875e6b-264"
Accept-Ranges: bytes
Let’s move on to more interesting service exposure options now. If you want to expose your service to the outside world, you can use either the NodePort or the LoadBalancer type. Let’s have a look at a NodePort service first. We will delete the service we created earlier:
$ kubectl delete service nginx-svc
service "nginx-svc" deleted
$ kubectl delete deployment nginx
deployment "nginx" deleted
$ kubectl get pods,svc,deployments
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.32.0.1    <none>        443/TCP   9h
We will recreate the same service, but this time we will use the NodePort type option:
$ kubectl run nginx --image=nginx --replicas=2 --port=80
deployment "nginx" created
$ kubectl expose deployment nginx --port=8080 --target-port=80 --name=nginx-svc --type=NodePort
service "nginx-svc" exposed
$ kubectl get deployments,svc,pods
NAME           DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/nginx   2         2         2            2           1m

NAME             CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
svc/kubernetes   10.32.0.1     <none>        443/TCP          9h
svc/nginx-svc    10.32.0.156   <nodes>       8080:30299/TCP   53s

NAME                        READY     STATUS    RESTARTS   AGE
po/nginx-3449338310-9ffk6   1/1       Running   0          1m
po/nginx-3449338310-wrp9x   1/1       Running   0          1m
The NodePort type, according to the documentation, opens a service port on every worker node in the K8s cluster. Now, here comes another newcomer gotcha. What a lot of people ask me is: “ok, but how come I can’t see the service port listening on any of the worker nodes?” Often, people simply run netstat -ntlp and grep for the exposed service port; in our case that would be port 8080. Well, the bad news is, they won’t find any service listening on port 8080. This is where the magic of kube-proxy happens: the service port is instead mapped to a different port on the node, the NodePort. You can find the NodePort by describe-ing the service:
$ kubectl describe svc nginx-svc
Name: nginx-svc
Namespace: default
Labels: run=nginx
Selector: run=nginx
Type: NodePort
IP: 10.32.0.156
Port: <unset> 8080/TCP
NodePort: <unset> 30299/TCP
Endpoints: 10.200.1.6:80,10.200.2.6:80
Session Affinity: None
No events.
So if you now grep for processes listening on port 30299 on any worker node, you should see kube-proxy listening on the NodePort. See the output below from one of the worker nodes:
containerops@worker0:~$ sudo netstat -ntlp|grep 30299
tcp6 0 0 :::30299 :::* LISTEN 16080/kube-proxy
Again, as far as service accessibility goes, nothing changes: the service remains perfectly accessible through the VIP:PORT combination, just as in the ClusterIP case:
$ kubectl run -i --tty client --image=tutum/curl
root@client-1141742176-c2o9i:/# curl -I 10.32.0.156:8080
HTTP/1.1 200 OK
Server: nginx/1.11.9
Date: Mon, 30 Jan 2017 10:50:56 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Jan 2017 14:02:19 GMT
Connection: keep-alive
ETag: "58875e6b-264"
Accept-Ranges: bytes
Now that you have a port open on every node, you can configure your external load balancer or edge router to route traffic to any of the K8s worker nodes on the NodePort. Simples! And indeed, this is what we had to do in the past, before ingress was introduced.
The “problem” with the NodePort type is that the load balancer (or proxy) routing the traffic to the worker nodes has to balance between the K8s cluster nodes, which in turn load balance the traffic across the pod endpoints. There is also no easy way of adding TLS or more sophisticated traffic routing. This is what the Ingress API resource addresses, but let’s talk about kube-proxy first, as it’s the most crucial component with regards to K8s services and also a bit of a source of confusion for newcomers.
kube-proxy
kube-proxy is a special daemon (application) running on every worker node. It can run in two modes [configurable via the --proxy-mode command line switch]:
- userspace
- iptables
In userspace mode, kube-proxy runs as a userspace process i.e. a regular application. It terminates all incoming service connections and creates a new connection to a particular service endpoint. The advantage of userspace mode is that, because the connections originate from a userspace process, kube-proxy can retry a different endpoint if the connection fails.
In iptables mode, the traffic routing is done entirely in kernel space via some quite complex iptables kung-fu - feel free to check the iptables rules on each node. This is way more efficient than moving packets from the kernel to userspace and back again, so you get higher throughput and better latency. The downside is that services can be more difficult to debug, because you need to inspect the iptables rules and maybe do some tcpdump-ing and whatnot.
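If you are curious, here is a sketch of how you might peek at these rules on a worker node (kube-proxy organizes them into chains named KUBE-SERVICES, KUBE-SVC-* and KUBE-SEP-*, and annotates them with the service name; the exact output varies between versions):
containerops@worker0:~$ sudo iptables -t nat -L KUBE-SERVICES -n | grep nginx-svc
containerops@worker0:~$ sudo iptables -t nat -S | grep KUBE-SEP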
The moral of the story is: there will always be a kube-proxy running on the worker nodes, regardless of the mode it runs in. The difference is that in userspace mode it acts as a TCP proxy, intercepting and forwarding the traffic itself, whilst in iptables mode it merely configures the iptables rules and the traffic forwarding is done by iptables automagically.
kube-dns
Now, kube-proxy is just one piece of the K8s service puzzle. Another one is kube-dns, which is responsible for DNS service discovery. If the kube-dns add-on has been set up properly, you can access K8s services using their names directly; you don’t need to remember the VIP:PORT combination, the name of the service will suffice. How is this possible? Well, when you use kube-dns, K8s “injects” certain name service lookup configuration into new pods that allows them to query the DNS records in the cluster. Let’s have a look at our familiar tutum/curl pod we created to test services.
First, let’s check the DNS resolution configuration:
$ kubectl run -i --tty client --image=tutum/curl
root@client-1141742176-c2o9i:/# cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.kube-blog.internal
nameserver 10.32.0.10
options ndots:5
You can see that the IP address of the kube-dns service (check earlier in the post that this is indeed the kube-dns VIP) has been injected into the new pod along with some lookup domains. kube-dns creates an internal cluster DNS zone which is used for DNS and service discovery. This means that we can access services from inside pods via their service names directly:
root@client-1141742176-c2o9i:/# curl -I nginx-svc:8080
HTTP/1.1 200 OK
Server: nginx/1.11.9
Date: Mon, 30 Jan 2017 11:05:40 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Jan 2017 14:02:19 GMT
Connection: keep-alive
ETag: "58875e6b-264"
Accept-Ranges: bytes
Since our nginx-svc has been created in the default namespace (check the etcd queries shown earlier in this post), we can access it using the cluster-internal DNS name, too:
root@client-1141742176-c2o9i:/# curl -I nginx-svc.default.svc.cluster.local:8080
HTTP/1.1 200 OK
Server: nginx/1.11.9
Date: Mon, 30 Jan 2017 11:06:12 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 24 Jan 2017 14:02:19 GMT
Connection: keep-alive
ETag: "58875e6b-264"
Accept-Ranges: bytes
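We can also query the cluster DNS zone directly (a quick sketch, assuming the tutum/curl image ships a DNS lookup utility; 10.32.0.10 is the kube-dns VIP we saw in resolv.conf):
root@client-1141742176-c2o9i:/# nslookup nginx-svc.default.svc.cluster.local 10.32.0.10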
Ok, so this is really handy. No more remembering IP addresses, no more crafting and hacking our own /etc/hosts files within the pods - this almost feels like “No-Traditional-Ops” (oops) ;-)
We won’t talk about the LoadBalancer type in this post, as it’s only handy when running your cluster with one of the supported cloud providers, and like I said, this post is about running K8s on bare metal - we have no luxury of ELBs and the like! Either way, we should now be well equipped to take the next step and look into the magic of ingress.
Services and ingresses
Let’s talk about the Ingress resource and how it can address the “shortcomings” of the NodePort service type. Don’t forget to check the documentation for the Ingress API resource; here I will try to summarize the most important bits and show you how it works underneath.
Ingress is an API resource which represents a set of traffic routing rules that map external traffic to K8s services. Ingress allows external traffic to land in the cluster at a particular service. Ingress on its own is just one part of the puzzle, though - it merely creates the traffic route maps. We need one more piece to make this work: ingress controllers, which are responsible for the actual traffic routing. So we need to:
- Create Ingress (API object)
- Run Ingress controller
In practice we actually do this in the opposite order. First you create the ingress controller which will handle the traffic, and wait until it’s ready. Then you create the ingress and “open” the route for the incoming traffic. This order makes sense: you need to have your traffic controller ready to handle the traffic before you “open the door”.
Ingress
Let’s create a simple ingress to route traffic to the nginx-svc service we created earlier. Before that, we need to create a default backend. A default backend is a special service endpoint which handles any traffic that arrives at the ingress and does not match any of the configured routes in the ingress route map. It is sort of like a default “fail over” host known from various application and HTTP servers. We will use the default-http-backend available in the official documentation and expose it as a new service:
$ kubectl create -f https://raw.githubusercontent.com/kubernetes/contrib/master/ingress/controllers/nginx/examples/default-backend.yaml
replicationcontroller "default-http-backend" created
$ kubectl expose rc default-http-backend --port=80 --target-port=8080 --name=default-http-backend
service "default-http-backend" exposed
$ kubectl get svc
NAME                   CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
default-http-backend   10.32.0.227   <none>        80/TCP           8s
kubernetes             10.32.0.1     <none>        443/TCP          23h
nginxsvc               10.32.0.156   <nodes>       8080:31826/TCP   10h
Now, before we create the Ingress API object, we need to create an ingress controller. We don’t want to be caught off guard by exposing the service before we are ready to handle its traffic. You are spoilt for choice here: you can use the Rancher one, the NGINX Inc. one, or build your own more specialized controller. In this guide we will stick with the basic nginx ingress controller available in the Kubernetes contrib repo. So let’s create it now:
$ kubectl create -f https://raw.githubusercontent.com/kubernetes/contrib/master/ingress/controllers/nginx/examples/default/rc-default.yaml
replicationcontroller "nginx-ingress-controller" created
$ kubectl get pods
NAME                             READY     STATUS    RESTARTS   AGE
default-http-backend-lb96h       1/1       Running   0          14m
nginx-3449338310-64h52           1/1       Running   0          1h
nginx-3449338310-8pcq5           1/1       Running   0          1h
nginx-ingress-controller-k3d2s   0/1       Running   0          18s
Notice that the nginx-ingress-controller we have created is just a simple application running in a K8s pod which has some special powers, as we will see later on. What ingress controllers do underneath is, they first register themselves into the list of controllers via the API service and store some configuration there. We can list the available controllers in the cluster by listing our familiar etcd registry; in this case we are interested in the /registry/controllers/default key (default implies the default K8s namespace):
$ etcdctl --ca-file=/etc/etcd/ca.pem ls /registry/controllers/default
/registry/controllers/default/default-http-backend
/registry/controllers/default/nginx-ingress-controller
Great, so both default-http-backend and nginx-ingress-controller have registered themselves correctly. We should now be ready to create some ingress rules and bring external traffic into the cluster. For the purpose of this post I will use the following ingress:
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-nginx-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: foobar.com
    http:
      paths:
      - path:
        backend:
          serviceName: nginx-svc
          servicePort: 8080
What this will do is create an Ingress API resource which maps all incoming requests that have the HTTP Host header set to foobar.com to our nginx-svc service. All the other requests arriving at this ingress point will be routed to the default-http-backend. Please note that we are mapping only the root URL here; you also have the option to create maps for particular URL paths, as the sketch below shows.
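For example, a path-based variant could look roughly like this (a sketch; api-svc and web-svc are made-up service names for illustration):
spec:
  rules:
  - host: foobar.com
    http:
      paths:
      - path: /api
        backend:
          serviceName: api-svc
          servicePort: 8080
      - path: /
        backend:
          serviceName: web-svc
          servicePort: 8080
With that aside out of the way, let’s go ahead and create our ingress now: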
$ kubectl create -f foobar.yaml
ingress "my-nginx-ingress" created
$ kubectl get ing -o wide
NAME               HOSTS        ADDRESS   PORTS     AGE
my-nginx-ingress   foobar.com   X.X.X.X   80        21s
Excellent! The ingress has now been created as per the yaml config shown earlier. As is always the case, every ingress stores its configuration in the etcd cluster, so let’s have a look there:
$ etcdctl --ca-file=/etc/etcd/ca.pem ls /registry/ingress/default
/registry/ingress/default/my-nginx-ingress
Let’s see what the contents of the /registry/ingress/default/my-nginx-ingress key are:
{
    "kind": "Ingress",
    "apiVersion": "extensions/v1beta1",
    "metadata": {
        "name": "my-nginx-ingress",
        "namespace": "default",
        "selfLink": "/apis/extensions/v1beta1/namespaces/default/ingresses/my-nginx-ingress",
        "uid": "f2e57376-e6e5-11e6-9d09-42010af0000c",
        "generation": 1,
        "creationTimestamp": "2017-01-30T12:16:37Z",
        "annotations": {
            "kubernetes.io/ingress.class": "nginx"
        }
    },
    "spec": {
        "rules": [{
            "host": "foobar.com",
            "http": {
                "paths": [{
                    "backend": {
                        "serviceName": "nginx-svc",
                        "servicePort": 8080
                    }
                }]
            }
        }]
    },
    "status": {
        "loadBalancer": {
            "ingress": [{
                "ip": "X.X.X.X"
            }]
        }
    }
}
We can see that the ingress maps all the traffic for foobar.com to our nginx-svc service as expected, and that it is available through an external IP address, which has been redacted in this post. This is the IP you would point your DNS records at, and it routes to the externally exposed ingress controller.
Why is this etcd key so important? Well, ingress controllers are actually K8s API watcher applications: they watch particular /registry/ingress keys for changes in order to keep an eye on particular K8s service endpoints. That was a mouthful! The key takeaway is: ingress controllers monitor service endpoints, i.e. they don’t route traffic to the service itself, but rather to the actual service endpoints, i.e. the pods. There is a good reason for this behavior.
Imagine one of your service pods dies. Until the K8s service notices it’s dead, it won’t remove it from its list of endpoints. A K8s service, as we should know by now, is just another API watcher: it simply watches /registry/endpoints, which is updated by the K8s controller-manager. And even once the controller-manager picks up the endpoints change, there is no guarantee that kube-proxy has picked it up and updated the iptables rules accordingly - kube-proxy can still be routing traffic to the dead pods. So it’s safer for ingress controllers to watch the endpoints themselves and update their routing tables as soon as the controller-manager updates the list of endpoints.
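You can observe this endpoint churn yourself (a sketch; the pod name comes from the kubectl get pods output earlier and will differ in your cluster):
$ kubectl get endpoints nginx-svc --watch
Then, from a second terminal, kill one of the service’s pods:
$ kubectl delete pod nginx-3449338310-9ffk6
The replica set will schedule a replacement pod, and the watch above will show the old IP:PORT disappear from the list and a new one appear.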
Now, what is all the buzz around ingress controllers about? Well, for starters they can terminate the traffic and load balance it across service endpoints. The load balancing can be quite sophisticated, and it’s entirely up to the ingress controller’s design. Finally, you can have them do the SSL/TLS termination heavy lifting and relieve the actual services of doing so; the TLS configuration can indeed be done through the secrets API.
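A minimal sketch of what that TLS wiring might look like (foobar-tls and the certificate file names are made up for illustration): first store the certificate and key in a secret, then reference it from the ingress spec:
$ kubectl create secret tls foobar-tls --cert=tls.crt --key=tls.key
spec:
  tls:
  - hosts:
    - foobar.com
    secretName: foobar-tls
The ingress controller then terminates TLS for foobar.com using that certificate before routing the traffic to the service endpoints.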
Now that we have both the ingress and the ingress controller in place, we should be able to curl our nginx-svc directly from outside the cluster, as long as we set the foobar.com Host header. Let’s try that:
$ curl -I -H "Host: foobar.com" X.X.X.X
HTTP/1.1 200 OK
Server: nginx/1.11.3
Date: Sat, 28 Jan 2017 00:41:51 GMT
Content-Type: text/html
Content-Length: 612
Connection: keep-alive
Last-Modified: Tue, 24 Jan 2017 14:02:19 GMT
ETag: "58875e6b-264"
Accept-Ranges: bytes
We could easily verify that the traffic is being routed to the particular pods by tailing the logs of the ingress controller and of the nginx-svc pods. I will leave that as an exercise for you :-)
Finally, we should now have a better mental model of how all the pieces come together when a request arrives in the K8s cluster through the ingress [controller]:
(traffic from outside K8s cluster) -> (ingress-controller) -> (kube-dns lookup for particular service) -> (kube-proxy/iptables) -> service endpoint (pod)
Conclusion
Thanks for staying with me until the end! Let’s quickly summarize what we learnt in this post. Everything that happens in a K8s cluster goes through the API service. API resources are implemented as API watchers that watch particular sets of keys by registering watches with the K8s API service. The K8s API service in turn reads from and writes into the etcd cluster, which stores all of the cluster’s internal state.
K8s service traffic is routed directly to the service’s pods, which are the service’s endpoints, either via sophisticated iptables kung-fu set up by kube-proxy or via kube-proxy itself. Service name addressing is handled by DNS discovery done by a DNS add-on; in the simplest case this is kube-dns, but you are spoilt for choice, so pick the one that suits you best.
Ingress is an API resource which allows you to map external network traffic to K8s services via ingress controllers. Ingress controllers are applications deployed in pods that work as K8s API watchers, monitoring the /registry/ingress key through the K8s API service and updating their routes based on service endpoint changes.