
What I learnt about Kubernetes Controllers

If you are a Kubernetes Controller, you know that your main duty is to react to changes in the world’s desired state and actual state, doing whatever you can to update the latter so that it matches the former.

When I think about my early steps with Kubernetes, two things related to Controllers come to mind:

That said, I was sure I knew everything about controllers myself, but I realized I had never had the opportunity to learn what they actually do underneath until recently, when my interest peaked after opening issue #67342, titled “Storage: devicePath is empty while WaitForAttach in StatefulSets”.

While trying to reproduce it, I encountered a chain of function calls going through files with very explanatory names:

actual_state_of_world.go:616 ->
  reconciler.go:238 ->
    operation_executor.go:712 ->
      operation_generator.go:437 -> error on line 496

This looked very similar to the definition of Controller I found in the Kubernetes “Standardized Glossary”:

A control loop that watches the shared state of the cluster through the apiserver and makes changes attempting to move the current state towards the desired state. Examples of controllers that ship with Kubernetes today are the replication controller, endpoints controller, namespace controller, and serviceaccounts controller.

Nice, so the VolumeManager is not really a controller, but conceptually it behaves in a very similar way since it has a loop, a reconciler, a desired state and an actual state.

At this point I started looking at all the projects, both private and public, that I had touched, and among the public ones I recognized a very interesting pattern: they all had a cache.ListWatch and a cache.SharedInformer.

The interesting part was that most of them also had a workqueue.Interface, like the etcd operator controller, the NGINX ingress controller and the OpenFaaS Operator Controller. It turns out they use it because it’s a key component in ensuring that the state stays consistent and that all of the controller’s workers agree on a shared set of elements to be processed under certain constraints (which looks very close to the Glossary’s definition above!).
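To make the pattern concrete, here is a minimal sketch of my own (not lifted from any of those projects) of how the three pieces are usually wired together with client-go; the buildController name and the already configured clientset parameter are just assumptions for the example.

package main

import (
  "time"

  v1 "k8s.io/api/core/v1"
  "k8s.io/apimachinery/pkg/fields"
  "k8s.io/client-go/kubernetes"
  "k8s.io/client-go/tools/cache"
  "k8s.io/client-go/util/workqueue"
)

// buildController wires together a ListWatch that talks to the apiserver,
// a SharedIndexInformer that keeps a local cache of Pods, and a rate
// limited workqueue that decouples event delivery from processing.
func buildController(clientset *kubernetes.Clientset) (cache.SharedIndexInformer, workqueue.RateLimitingInterface) {
  // Watch Pods in the "default" namespace, with no field selector.
  podListWatcher := cache.NewListWatchFromClient(
    clientset.CoreV1().RESTClient(), "pods", v1.NamespaceDefault, fields.Everything())

  // The informer maintains a local, indexed cache of the watched Pods.
  informer := cache.NewSharedIndexInformer(
    podListWatcher, &v1.Pod{}, 30*time.Second, cache.Indexers{})

  // The workqueue holds the keys of objects that still need to be reconciled.
  queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())

  return informer, queue
}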

While writing this post I was tempted to write a full-length example, but I found an exhaustive example already available in the kubernetes repo, so I will just go through the simplest self-contained example I can write.

Scroll to the end of the controller example to read about it.

The main components of this Controller are a SharedInformer, a Workqueue and an Indexer.

In the SharedInformer I define handlers to deal with Add, Delete and Update, but instead of acting on them directly I push what they receive into a workqueue with queue.Add:

AddFunc: func(obj interface{}) {
  // Derive the "namespace/name" key for the object and enqueue it.
  key, err := cache.MetaNamespaceKeyFunc(obj)
  if err == nil {
    queue.Add(key)
  }
},
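For completeness, this is roughly how the full handler registration could look in my example, with Update and Delete enqueueing keys the same way; DeletionHandlingMetaNamespaceKeyFunc also copes with tombstone objects. The informer and queue variables are the ones from the sketch above.

informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
  AddFunc: func(obj interface{}) {
    if key, err := cache.MetaNamespaceKeyFunc(obj); err == nil {
      queue.Add(key)
    }
  },
  UpdateFunc: func(old, new interface{}) {
    // Enqueue the key of the new version of the object.
    if key, err := cache.MetaNamespaceKeyFunc(new); err == nil {
      queue.Add(key)
    }
  },
  DeleteFunc: func(obj interface{}) {
    // This key function also handles DeletedFinalStateUnknown tombstones.
    if key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj); err == nil {
      queue.Add(key)
    }
  },
})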

The Workqueue is a structure that lets you queue changes for a specific resource and process them later in multiple workers, with the guarantee that no more than one worker will be working on a specific item at the same moment. The elements are processed in runWorker, and more workers are started by increasing the threadiness parameter of the Controller’s Run method.
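To illustrate, here is a sketch of how Run and runWorker could look, loosely modelled on the upstream workqueue example rather than copied from it; the Controller struct layout is my own assumption, wrapping the informer and queue from the earlier sketches, and processNextItem is shown a bit further down.

// Needs "k8s.io/apimachinery/pkg/util/wait" in addition to the imports above.

// Controller bundles the queue and the informer built earlier
// (a hypothetical layout, not the upstream example verbatim).
type Controller struct {
  queue    workqueue.RateLimitingInterface
  informer cache.SharedIndexInformer
}

// Run starts the informer and then `threadiness` workers draining the queue.
func (c *Controller) Run(threadiness int, stopCh chan struct{}) {
  defer c.queue.ShutDown()

  // Start the ListWatch against the apiserver and fill the local cache.
  go c.informer.Run(stopCh)

  // Don't process anything until the cache has been populated once.
  if !cache.WaitForCacheSync(stopCh, c.informer.HasSynced) {
    return
  }

  // The workqueue guarantees that a given key is never handed to two of
  // these workers at the same time.
  for i := 0; i < threadiness; i++ {
    go wait.Until(c.runWorker, time.Second, stopCh)
  }

  <-stopCh
}

// runWorker keeps processing items until the queue is shut down.
func (c *Controller) runWorker() {
  for c.processNextItem() {
  }
}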

This way I can end up in syncToStdout and be sure I am the only one processing that item, while knowing that if the current attempt returns an error the operation will be retried, up to a hardcoded limit of 5 times, as defined in handleErr.
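Continuing the same hypothetical Controller, a sketch of that wiring: processNextItem takes one key off the queue, calls syncToStdout and hands any error to handleErr, which re-enqueues the key with a rate limit until the budget of 5 retries is used up (only the standard library log package is needed on top of the earlier imports).

// processNextItem takes one key off the queue, processes it and tells the
// worker loop whether to keep going.
func (c *Controller) processNextItem() bool {
  key, quit := c.queue.Get()
  if quit {
    return false
  }
  // Done tells the queue we are finished with this key; until then the
  // queue will not hand the same key to another worker.
  defer c.queue.Done(key)

  err := c.syncToStdout(key.(string))
  c.handleErr(err, key)
  return true
}

// handleErr retries a failing key up to 5 times, then drops it.
func (c *Controller) handleErr(err error, key interface{}) {
  if err == nil {
    // Forget resets the failure counter and the backoff for this key.
    c.queue.Forget(key)
    return
  }

  if c.queue.NumRequeues(key) < 5 {
    // Re-enqueue after a delay chosen by the rate limiter.
    c.queue.AddRateLimited(key)
    return
  }

  // Too many failures: give up on this key.
  c.queue.Forget(key)
  log.Printf("dropping %v out of the queue: %v", key, err)
}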

Every item also gets an exponential backoff rate limit, so failures are not retried immediately but after a calculated delay that grows with each failure (I used DefaultControllerRateLimiter here, but it’s very easy to create your own with the parameters you prefer).
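As an illustration, this is one way such a custom rate limiter could be assembled; the newCustomQueue name and the chosen delays are arbitrary, the point is just to show the knobs that DefaultControllerRateLimiter also combines internally.

// Uses "golang.org/x/time/rate" and "k8s.io/client-go/util/workqueue".

// newCustomQueue builds a workqueue with a hand-rolled rate limiter:
// a per-item exponential backoff combined with an overall token bucket.
func newCustomQueue() workqueue.RateLimitingInterface {
  // First retry after 1s, doubling on every failure up to 5 minutes.
  perItem := workqueue.NewItemExponentialFailureRateLimiter(1*time.Second, 5*time.Minute)

  // At most 10 requeues per second overall, with a burst of 100.
  overall := &workqueue.BucketRateLimiter{Limiter: rate.NewLimiter(rate.Limit(10), 100)}

  // MaxOf picks the longest delay suggested by any of its limiters.
  return workqueue.NewRateLimitingQueue(workqueue.NewMaxOfRateLimiter(perItem, overall))
}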

This rate limiting mechanism can be very helpful if, for example, we called an external API every time we are informed about a Pod. That external API might rate limit our calls, so a request that fails right now may be perfectly fine after retrying in a while.

The Indexer and the Informer are also key components for processing workqueue elements here, because we want to be informed about events occurring for the resource Kind we are interested in (in this case: Pod) and we want to have an index where we can look up the final Pod object.

But hey, since we used the SharedInformer we don’t need to provide an indexer ourselves, because our beloved informer already contains one, exposed by GetStore().
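Here is a minimal sketch of what syncToStdout could do with that store, again continuing the hypothetical Controller from above: look the key up via GetStore(), handle the case where the Pod has disappeared in the meantime, and otherwise just print its name (the fmt package covers the output).

// syncToStdout is where the business logic would live; here it just prints
// something about the Pod identified by the key.
func (c *Controller) syncToStdout(key string) error {
  // GetStore() is the informer's built-in indexer; GetByKey looks the
  // object up by its "namespace/name" key.
  obj, exists, err := c.informer.GetStore().GetByKey(key)
  if err != nil {
    return fmt.Errorf("fetching %q from store failed: %w", key, err)
  }

  if !exists {
    // The Pod was deleted after the event was enqueued: nothing to do.
    fmt.Printf("Pod %s does not exist anymore\n", key)
    return nil
  }

  pod := obj.(*v1.Pod)
  fmt.Printf("Sync/Add/Update for Pod %s\n", pod.GetName())
  return nil
}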

Another aspect of using the SharedInformer here is that we are guaranteed that the element we get from its internal indexer is at least as fresh as the event we received.

Wow, I don’t think I know everything about controllers now, but my interest is still at its peak, so I will probably follow up with more material on the topic.