Develop on Kubernetes Series — Operator Dev — The Introduction
Hi everyone! Containers are being picked up almost everywhere, by almost everyone, for deploying production workloads, especially for applications built on a service-oriented architecture (which is massive). In turn, Kubernetes is being adopted to an insane extent for automating the management of such containerized workloads.
And one of the major reasons Kubernetes is so actively picked up is because of how pluggable it is. By pluggable, I mean that it is not rigid in its features and capabilities but actually extremely extensible. You can plug your Kubernetes clusters with custom functionalities, coded by you or someone else, in a reliable and well-designed manner, without the hassle of writing any bloatware or any piece of code NOT associated with your core functionality. These custom functionalities, in Kubernetes lingo, are called Operators.
I had my own struggles in the past learning how to write operators: I found the official docs quite scattered, and plunging directly into the practical aspects like SDKs, client libraries, etc. wasn't the right way. I felt a lack of direction while learning operator dev, and motivated by that struggle, I am writing the next few articles for you folks to explore Kubernetes Operator Dev the right way, starting from a theoretical introduction and moving smoothly into the practical aspects by coding your own operator.
In this first article of the series, I am going to cover the introduction, the motivation, and the mechanics behind operator dev.
So let’s dive in!
Kubernetes is a container orchestration tool that aims to automate the scheduling and management of your workloads deployed in the form of containers. This is the shortest definition of Kubernetes I could come up with :P. For a more detailed one, do check out https://kubernetes.io/
Honest suggestion: Please get familiar with the basic (only the fundamental) concepts of Kubernetes, like Pods, Deployments, the Kubernetes architecture, etc., to make the rest of this article series easier to follow.
Controllers in Kubernetes
Kubernetes, inherently as a system, is declarative in nature, meaning that it is provided with a desired state/specification by the user and it tries to act upon the current reality to match the desired state. And this hard work of reconciling the drift between the current state and the desired state is done by components called Controllers.
These controllers run "reconciliation loops" (also called "control loops") which constantly look at the current state of the cluster and the desired state, and whenever they encounter a drift between the two, they take actions to bring the current state back to the desired state.
For example, I hope you are aware of the Kubernetes resource called Deployment. One of its key features is to run your workload/application over a specified number of Pods (replicas), and it ensures that the specified number of Pods always keeps running. So, say a Deployment is supposed to run 5 replicas under its umbrella, and you delete two of its Pods. As a part of the reconciliation loop, the Deployment Controller will notice this drift between the current state (3 Pods) and the desired state (5 Pods), and it will take the action of creating two new Pods (because 3 + 2 = 5) to match the current state to the desired state.
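To make the idea concrete, here is a minimal, self-contained Go sketch of the comparison at the heart of a reconciliation loop. This is not the real Deployment controller (which uses client-go, informers, and the API server); it only isolates the drift-correction logic described above:

```go
package main

import "fmt"

// reconcile compares the desired state against the current state and
// returns the corrective action: how many Pods to create (positive)
// or delete (negative) to eliminate the drift.
func reconcile(desiredReplicas, currentReplicas int) int {
	return desiredReplicas - currentReplicas
}

func main() {
	// The scenario from the article: 5 replicas desired, 2 Pods deleted.
	desired, current := 5, 3
	diff := reconcile(desired, current)
	switch {
	case diff > 0:
		fmt.Printf("creating %d new Pod(s)\n", diff) // creating 2 new Pod(s)
	case diff < 0:
		fmt.Printf("deleting %d Pod(s)\n", -diff)
	default:
		fmt.Println("current state matches desired state")
	}
}
```

A real control loop runs this comparison continuously on events from the API server rather than once, but the core "observe, diff, act" shape is the same.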
You can even code your own custom controllers which take certain actions upon certain changes in the cluster to accomplish certain purposes. For example, you might have a cluster so critical that you want to be notified whenever a Pod is created, updated, or deleted. If you think like a controller, the observed change is a Pod being created, updated, or deleted, and the corresponding action is raising a Slack notification to you.
And you can easily program it, package it as an image and deploy it over your cluster to do the above job :)
You might be aware of resources like ConfigMap, etc. These are native resources of Kubernetes. But you can even have custom resources: say, a MyCustomConfigMap with the "group" demo.yash.com and version v1beta1. Meaning you can literally kubectl create a resource like the following one:
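The original embedded manifest isn't shown here, so below is a sketch of what such a custom resource could look like; the metadata name and the spec fields are my own illustrations:

```yaml
apiVersion: demo.yash.com/v1beta1
kind: MyCustomConfigMap
metadata:
  name: my-custom-configmap
  namespace: default
spec:
  # Hypothetical payload: a simple string-to-string map,
  # mirroring what a native ConfigMap carries.
  data:
    greeting: "hello"
    environment: "dev"
```

Note the apiVersion: it is simply the group (demo.yash.com) and version (v1beta1) joined together, exactly as with native resources like apps/v1.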
But how can you define this custom resource in the first place in your Kubernetes cluster?
Well enter Custom Resource Definitions!
Custom Resource Definitions
A Custom Resource Definition (aka CRD) is analogous to the schema of a custom resource in your cluster. With it, you define and tell your Kubernetes cluster how your custom resource will look, what name it will have, etc.
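The article's original CRD manifest isn't reproduced here, so here is a sketch of what it could look like for MyCustomConfigMap; the schema under spec (the data map) is my own assumption:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # The name must be <plural>.<group>.
  name: mycustomconfigmaps.demo.yash.com
spec:
  group: demo.yash.com
  scope: Namespaced
  names:
    kind: MyCustomConfigMap
    singular: mycustomconfigmap
    plural: mycustomconfigmaps
  versions:
    - name: v1beta1
      served: true    # this version is enabled on the API server
      storage: true   # this version is persisted in etcd
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                # Hypothetical field matching the resource sketch above.
                data:
                  type: object
                  additionalProperties:
                    type: string
```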
Once you apply this YAML manifest, your Kubernetes cluster becomes aware of the MyCustomConfigMap resource and knows what structure of YAML to expect whenever a user tries to apply a manifest of kind: MyCustomConfigMap.
Operators in Kubernetes
Kubernetes has a plethora of features, but still, while managing a real system, one might end up expecting Kubernetes to manage something on the cluster in a very custom, stateful manner. And for obvious reasons, Kubernetes won't have that feature built in. But Kubernetes does offer the capability to code a custom add-on on your own, which you can easily inject and reliably deploy into your cluster to serve your custom use case of watching certain events happening in your cluster and taking actions accordingly.
Your operator is just like a controller in the sense that it watches for events (such as creations, updates, deletions) happening to your custom resources and takes actions according to its functionality.
What's the difference between a controller and an operator? A controller watches and is associated with native Kubernetes resources, whereas an operator watches and is associated with custom resources.
An operator essentially resembles an actual SRE or human operator, automating a manual, repetitive aspect of their job which might originally have involved a lot of toil and susceptibility to human error.
Let’s understand with an example
Say, I define a custom resource named “postgres-writer” and kind “PostgresWriter” on my cluster.
And say I kubectl apply the following YAML manifest.
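The manifest from the original article isn't shown here, so this is a sketch of what a PostgresWriter resource could look like; the group, version, and spec fields are my own guesses, kept consistent with the students table mentioned below:

```yaml
apiVersion: demo.yash.com/v1
kind: PostgresWriter
metadata:
  name: sample-student
  namespace: default
spec:
  # Hypothetical fields: the table to write to and the column values
  # for the row to be inserted.
  table: students
  name: "John"
  age: 23
  country: "India"
```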
What I expect to happen is:
- This YAML will be successfully applied.
- Once it gets applied, then behind the scenes, a new row will automatically be inserted into my Postgres DB (sitting remotely, far away) in the table students, and the inserted row will have the values specified in the manifest's spec.
To make this happen, I'll first apply a CRD to make my cluster aware of this new custom resource postgres-writer and its structure (the schema). Then, I will package and deploy an operator in my cluster which watches for the creation of PostgresWriter resources and performs the real magic of writing a new row (in near real-time) to my Postgres DB as soon as it captures a new PostgresWriter resource being created.
To ensure the uniqueness of each inserted row, the operator will use <namespace in which the resource is created>/<name of the resource> as the primary key of each row, because the namespace/name combination for a given resource kind in Kubernetes is always unique.
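As a small sketch, deriving such a primary key in the operator's language (Go) could look like this; the function name is my own:

```go
package main

import "fmt"

// primaryKey builds a cluster-unique row key from a resource's
// namespace and name, following the <namespace>/<name> scheme:
// Kubernetes guarantees this pair is unique per resource kind.
func primaryKey(namespace, name string) string {
	return fmt.Sprintf("%s/%s", namespace, name)
}

func main() {
	fmt.Println(primaryKey("default", "sample-student")) // default/sample-student
}
```

This is the same convention Kubernetes itself uses when referring to namespaced objects (e.g. in kubectl output and client libraries).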
The URL and the credentials to access my Postgres DB will be configured in the operator at the time of deploying it, in the form of environment variables.
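For illustration, such configuration might look like the following snippet from the operator's Deployment; every name here (the image, the env var names, the Secret) is hypothetical:

```yaml
containers:
  - name: postgres-writer-operator
    image: example.com/postgres-writer-operator:v0.1.0
    env:
      # Connection URL passed in plainly; credentials pulled
      # from a Kubernetes Secret rather than hardcoded.
      - name: POSTGRES_URL
        value: "postgres.example.com:5432"
      - name: POSTGRES_USER
        valueFrom:
          secretKeyRef:
            name: postgres-creds
            key: username
      - name: POSTGRES_PASSWORD
        valueFrom:
          secretKeyRef:
            name: postgres-creds
            key: password
```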
In the next article…
In the next part of this article series, I will walk you through using kubebuilder to write your own postgres-writer operator, from developing it to running it locally.
Thanks a lot for making it this far. I hope this article was clear. In case of any doubts, feel free to raise them as a comment or reach out to me :)
Twitter — https://twitter.com/yashkukreja98
Until next time, Adios!