Develop on Kubernetes Series— Operator Dev— Coding out the Custom Resource & Reconciliation Loop

Yashvardhan Kukreja
13 min read · Oct 17, 2021

Hi everyone! This is the third part of the N-part series on Kubernetes Operator Dev. In this article, we’re going to explore how to code a custom resource and the reconciliation loop of an operator/controller.

I’d strongly suggest that you quickly go through the previous part, because it covers kubebuilder, which is a suggested prerequisite for this article. In that part, I also presented and bootstrapped an example operator called PostgresWriter, and in this article we’re going to code out that operator’s reconciliation loop.

But if you feel lazy, I’ll summarize the PostgresWriter operator in the next section, so that you have enough context without depending on the previous article completely :)

So let’s begin :D


Let’s set up some context

Feel absolutely free to skip this section if either you already know about Reconciliation Loops and the PostgresWriter custom resource/operator which I explained in the previous part, or you have already gone through the first part and second part of this article series.

Our example operator i.e. PostgresWriter

Again, feel free to skip this if you already went through the second part (previous part) of this article series.

The operator which we will be building is going to be called “PostgresWriter”.

The idea is pretty straightforward. Say, we have a Postgres DB somewhere sitting at the corner of the world.

Our cluster would have a custom resource called “postgres-writer”.
And a manifest associated with the “postgres-writer” resource would look like this (the same manifest appears in full as sample.yaml in the “Let’s run it” section):

Whenever a manifest like this is applied to a Kubernetes cluster, our operator is going to capture that event and do the following three things:

  • Parse the spec of the incoming “postgres-writer” resource being created and recognize the fields table, name, age and country.
  • Form a unique id corresponding to the above incoming “postgres-writer” resource in the format <namespace of the incoming postgres-writer resource>/<name of the incoming postgres-writer resource> (default/sample-student in this case).
    This works because, in Kubernetes, the namespace/name combination is unique across the cluster for a given resource kind (in our case PostgresWriter).
  • Insert a new row in our Postgres DB in the table named spec.table, with the spec.name, spec.age and spec.country fields as its values, and with the unique id formed above (namespace/name of the incoming resource) as the primary key.
High level flow of our operator and custom-resource in action
A slightly deeper flow of our operator in action

Also, whenever a PostgresWriter resource like the above is deleted, our operator will DELETE the row corresponding to that resource from our PostgresDB, so as to keep the rows of our PostgresDB and the PostgresWriter resources present on our cluster consistent with each other.

With respect to the above example, if we were to kubectl delete the above sample-student PostgresWriter resource, then our operator would delete the row with the id default/sample-student as a consequence.

This will ensure that for every PostgresWriter resource in our cluster, there’s one row in our PostgresDB, nothing more, nothing less.
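To make the "one resource, one row" idea concrete, here is a minimal sketch in plain Go. It is not the operator's real code: the table type is a hypothetical in-memory stand-in for a Postgres table, and the names rowID, onCreate and onDelete are made up for illustration.

```go
package main

// rowID builds the primary key described above: "<namespace>/<name>".
func rowID(namespace, name string) string {
	return namespace + "/" + name
}

// student mirrors the values written for each PostgresWriter resource.
type student struct {
	Name    string
	Age     int
	Country string
}

// table is an in-memory stand-in for a Postgres table, keyed by row id.
type table map[string]student

// onCreate records a row for a newly created resource.
func (t table) onCreate(namespace, name string, s student) {
	t[rowID(namespace, name)] = s
}

// onDelete removes the row that belongs to a deleted resource.
func (t table) onDelete(namespace, name string) {
	delete(t, rowID(namespace, name))
}
```

Creating default/sample-student adds exactly one entry under the key default/sample-student, and deleting the same resource removes it again, so the map and the set of resources never drift apart.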

Reconciliation Loops

Kubernetes, inherently as a system, is declarative in nature, meaning that it is provided with a desired state/specification by the user and it tries to act upon the current reality to match the desired state. And this hard work of reconciling the drift between the current state and the desired state is done by components called Controllers.

These controllers have “Reconciliation loops” (also called “Control loops”) which constantly run and compare the current state of the cluster with the desired state. Whenever they encounter a drift between the two, they take actions to bring the current state back to the desired state.

In the eyes of Kubernetes, our operator is just another controller. It observes a custom resource called PostgresWriter, captures the incoming resource spec (desired state) and does some magic (writes rows in a Postgres DB) to match the current reality with the desired state.

For example: I hope you are aware of the Kubernetes resource called Deployment. One of its key features is to run your workload/application over a specified number of Pods (replicas) and to keep that number of Pods running at all times. So, say a Deployment is supposed to run 5 replicas under its umbrella and you delete two of its Pods. As a part of the reconciliation loop, the Deployment Controller is going to notice this drift between the current state (3 pods) and the desired state (5 pods), and it is going to create two new pods (because 3+2 = 5) to reconcile the current state with the desired state.
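The arithmetic in that Deployment example can be written down as a tiny function. This is only a sketch of the idea of a reconciliation step, not how the real Deployment Controller is implemented. The action type and reconcileReplicas are hypothetical names.

```go
package main

// action describes what a controller must do to remove the drift.
type action struct {
	Create int // number of pods to create
	Delete int // number of pods to delete
}

// reconcileReplicas compares the desired and current replica counts and
// returns the action that brings the current state to the desired one.
func reconcileReplicas(desired, current int) action {
	switch {
	case current < desired:
		return action{Create: desired - current}
	case current > desired:
		return action{Delete: current - desired}
	default:
		return action{} // no drift, nothing to do
	}
}
```

With desired = 5 and current = 3, the function asks for two pods to be created, and once the states match it returns an empty action. A real controller keeps running this comparison every time something changes.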

What are we going to do in this article?

We bootstrapped our PostgresWriter operator with some boilerplate code with the help of kubebuilder in the previous article. So, in this article, we’ll do the following things:

  • We’ll go to api/v1/postgreswriter_types.go and set up the structs corresponding to our PostgresWriter resource.
  • We’ll write all the helper functions around connecting with a PostgresDB and inserting/deleting rows from it because that’s the core functionality our operator will perform.
  • Finally, we’ll dive into the controllers/postgreswriter_controller.go file and edit the Reconcile() method to code out the main logic behind our operator.
  • IMPORTANT: In this article, I won’t be coding the reconciliation loop to detect deletions of PostgresWriter resources and DELETE the corresponding rows from our Postgres DB. I will dedicate the next article to that, because it requires an in-depth explanation of predicates, finalizers and attaching multiple reconcilers to the same controller/operator, and I want to cover those topics properly in a separate article. So, bear with me ❤

Let’s dive in!

Anatomy of our custom resource

First of all, let’s lay out everything we expect from our custom resource.
Our custom-resource’s instance would look like this:

So, these are the requirements:

  • Its “Group” is demo.yash.com and its “Version” is v1.
  • Its “Kind” is PostgresWriter.
  • It has a required section spec under it.
  • And the spec section has four required fields under it: table (string), name (string), age (integer) and country (string).

I explained GVK (Group-Version-Kind) in the previous (second) part of this article series. Feel free to refer to it if you would like to understand what GVK means :)

Defining the structure of our custom resource

Go over to api/v1/postgreswriter_types.go.
This file will contain all the structs describing the structure our custom resource is expected to follow.

Let’s go top-down. Let’s first define the required spec field in the root of our custom resource.

Now, let’s define that under the spec we’ll have the required fields table, name, age and country.

All these comments that look like //+kubebuilder:validation... are called CRD validation markers. Kubebuilder recognizes them while generating our CRD and applies the corresponding validation rules to the final CRD. To check out all the other CRD validation markers, refer to: https://book.kubebuilder.io/reference/markers/crd-validation.html
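As a rough sketch of what such structs and markers can look like, here is a trimmed-down version that compiles with the standard library alone. Treat it as an approximation, not the exact contents of the file: the kubebuilder-generated root type also embeds metav1.TypeMeta and metav1.ObjectMeta, which I have left out, and the validate() method is a hypothetical helper that only mimics locally what the generated CRD schema enforces on the API server side.

```go
package main

import "errors"

// PostgresWriterSpec is the desired state carried under "spec".
type PostgresWriterSpec struct {
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Type=string
	Table string `json:"table"`

	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Type=string
	Name string `json:"name"`

	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Type=integer
	// +kubebuilder:validation:Minimum=0
	Age int `json:"age"`

	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Type=string
	Country string `json:"country"`
}

// PostgresWriter is the root type of the custom resource.
// Assumption: the real file also embeds metav1.TypeMeta and
// metav1.ObjectMeta here; they are omitted to keep this self-contained.
type PostgresWriter struct {
	// +kubebuilder:validation:Required
	Spec PostgresWriterSpec `json:"spec"`
}

// validate is a hypothetical local check that approximates the rules the
// markers above produce in the CRD schema. It treats an empty string as a
// missing field, which is stricter than "required" in the schema.
func (s PostgresWriterSpec) validate() error {
	if s.Table == "" || s.Name == "" || s.Country == "" {
		return errors.New("table, name and country are required")
	}
	if s.Age < 0 {
		return errors.New("age must be >= 0")
	}
	return nil
}
```

The markers are ordinary Go comments, so they have no effect at runtime. They only matter when the CRD is generated from these types.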

We have defined the structure of our PostgresWriter custom resource, so let’s generate our CRD.

Run the following command to generate the CRD.

make manifests

This will generate your CRD at config/crd/bases/

You will notice that the generated CRD marks spec as required, along with age, country, name and table under it. It also defines the types of those individual fields correctly, and minimum: 0 is specified under the age field. All thanks to Kubebuilder’s CRD validation markers :)

Coding out the Postgres stuff

Essentially, at its core our operator is performing two tasks:

  • Running an INSERT query on our PostgresDB when an instance of PostgresWriter gets created.
  • Running a DELETE query on our PostgresDB when an instance of our PostgresWriter gets deleted.

To make it capable of performing Postgres-related work, let’s create a pkg/psql/psql.go file and code out all the core functionalities associated with Postgres there.

Let’s start off by declaring the package and importing all the relevant libraries.

Let’s define a struct called PostgresDBClient whose instance will be the point of communication between our operator and the DB.

Now, all the attributes are self-explanatory except DbConnection. This object (a pointer to the sql.DB object, actually) is going to be used to talk to our DB and perform queries on it.
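A struct along these lines would fit that description. Only the DbConnection name comes from this article. The other field names and the connectionString() helper are my assumptions for this sketch, and the key=value connection string format assumes a libpq-style driver such as github.com/lib/pq.

```go
package main

import (
	"database/sql"
	"fmt"
)

// PostgresDBClient holds the connection settings and the live DB handle.
// Field names other than DbConnection are assumptions for this sketch.
type PostgresDBClient struct {
	host     string
	port     int
	dbname   string
	user     string
	password string

	// DbConnection is the database/sql handle used to run queries.
	// It stays nil until a connection has been set up.
	DbConnection *sql.DB
}

// connectionString renders the settings as a key=value connection string.
// Assumption: sslmode=disable is only suitable for a demo setup.
func (c *PostgresDBClient) connectionString() string {
	return fmt.Sprintf(
		"host=%s port=%d dbname=%s user=%s password=%s sslmode=disable",
		c.host, c.port, c.dbname, c.user, c.password,
	)
}
```

Keeping the settings next to the handle means the client can always re-create its connection from its own fields, without needing anything from the caller.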

But for setting it up in the first place, and for returning it (DbConnection) whenever we want, let’s define a method setupAndReturnDbConnection().

This method is idempotent in the sense that it doesn’t blindly create a new DB connection every time it’s called. It checks the existing DbConnection attribute of the PostgresDBClient object, and only if that connection doesn’t exist or is unhealthy does it set up and return a new one. Hence, even calling it repeatedly won’t cause multiple connections to be created unnecessarily.
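Here is one way such a "reuse if healthy, otherwise reconnect" method could be written with database/sql. It is a sketch under assumptions: the struct is cut down to a single dsn field, and "postgres" is the driver name that a driver such as github.com/lib/pq registers. The driver must be blank-imported in the real program, and I left it out so that the sketch compiles with the standard library alone.

```go
package main

import "database/sql"

// PostgresDBClient is trimmed to what this sketch needs.
type PostgresDBClient struct {
	dsn          string  // connection string, assumed to be assembled elsewhere
	DbConnection *sql.DB // nil until the first successful setup
}

// setupAndReturnDbConnection returns the existing connection when it is
// healthy, and opens a new one only when there is none or the ping fails.
func (c *PostgresDBClient) setupAndReturnDbConnection() (*sql.DB, error) {
	if c.DbConnection != nil {
		if err := c.DbConnection.Ping(); err == nil {
			return c.DbConnection, nil // healthy, so reuse it
		}
		// The old handle is unusable: release it before replacing it.
		c.DbConnection.Close()
		c.DbConnection = nil
	}

	// sql.Open returns an error if no driver named "postgres" is registered.
	db, err := sql.Open("postgres", c.dsn)
	if err != nil {
		return nil, err
	}
	// sql.Open does not connect by itself, so verify with a ping.
	if err := db.Ping(); err != nil {
		db.Close()
		return nil, err
	}
	c.DbConnection = db
	return db, nil
}
```

Calling the method twice on a healthy client returns the same *sql.DB both times. Note that *sql.DB is itself a connection pool, so one handle is enough for the whole operator.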

Now, let’s define a method for performing the INSERT query.

And let’s define a method for performing the DELETE queries.
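The queries behind those two methods could be built roughly like this. The column names (id, name, age, country) follow the fields described earlier but are still my assumption, and insertQuery / deleteQuery are hypothetical helper names. Since a table name cannot be passed as a bound query parameter, this sketch validates it before putting it into the SQL text.

```go
package main

import (
	"fmt"
	"regexp"
)

// validIdentifier accepts plain SQL identifiers only. The table name comes
// from user input (spec.table), so it is checked before interpolation.
var validIdentifier = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// insertQuery returns a parameterized INSERT for the given table.
func insertQuery(table string) (string, error) {
	if !validIdentifier.MatchString(table) {
		return "", fmt.Errorf("invalid table name %q", table)
	}
	return fmt.Sprintf(
		"INSERT INTO %s (id, name, age, country) VALUES ($1, $2, $3, $4)",
		table,
	), nil
}

// deleteQuery returns a parameterized DELETE for the given table.
func deleteQuery(table string) (string, error) {
	if !validIdentifier.MatchString(table) {
		return "", fmt.Errorf("invalid table name %q", table)
	}
	return fmt.Sprintf("DELETE FROM %s WHERE id = $1", table), nil
}
```

The resulting strings can be handed to the Exec method of *sql.DB along with the values, for example db.Exec(query, id, name, age, country), so the values themselves never get concatenated into the SQL.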

Finally, let’s define a constructor (kind of) which will return a fully-fledged PostgresDBClient object, given the necessary arguments.

And let’s also define a Close() method which will be used to gracefully clean up the PostgresDBClient object in order to avoid any potential memory leaks.

Now that we are done with setting up the core functionalities around Postgres, we can start coding our operator. But before that, let’s quickly understand the existing bootstrapped code around the operator (only the useful parts).
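A constructor and a Close() method in that spirit might look like the sketch below. The argument list of NewPostgresDBClient is an assumption, the struct is reduced to the connection handle, and as before the "postgres" driver name assumes a driver such as github.com/lib/pq is blank-imported in the real program.

```go
package main

import (
	"database/sql"
	"fmt"
)

// PostgresDBClient is reduced to the connection handle for this sketch.
type PostgresDBClient struct {
	DbConnection *sql.DB
}

// NewPostgresDBClient opens a connection, verifies it with a ping and
// returns a ready-to-use client. The parameter list is an assumption.
func NewPostgresDBClient(host string, port int, dbname, user, password string) (*PostgresDBClient, error) {
	dsn := fmt.Sprintf(
		"host=%s port=%d dbname=%s user=%s password=%s sslmode=disable",
		host, port, dbname, user, password,
	)
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, fmt.Errorf("opening postgres connection: %w", err)
	}
	if err := db.Ping(); err != nil {
		db.Close()
		return nil, fmt.Errorf("pinging postgres: %w", err)
	}
	return &PostgresDBClient{DbConnection: db}, nil
}

// Close releases the underlying connections. It is safe to call on a
// client that has no connection, and safe to call more than once.
func (c *PostgresDBClient) Close() error {
	if c.DbConnection == nil {
		return nil
	}
	err := c.DbConnection.Close()
	c.DbConnection = nil
	return err
}
```

A typical pattern is to call the constructor once at startup and defer client.Close() right after checking the error, so the connections are released when the program exits.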

Understanding our bootstrapped controller/operator

Let’s explore controllers/postgreswriter_controller.go and some parts of the main.go file to see how our operator gets instantiated and executed.

Let’s begin by diving into controllers/postgreswriter_controller.go

You’ll notice the PostgresWriterReconciler struct. This struct essentially denotes our operator. It is the one which implements the Reconciler interface and the reconciliation loop that gets triggered whenever something happens with a PostgresWriter custom resource. It will also be capable of talking to our PostgresDB and running INSERT or DELETE queries accordingly. It can talk to the cluster, and access and even act on resources as per the permissions it possesses.

  • client.Client: This attribute will be used by our operator (PostgresWriterReconciler) whenever it wants to talk to our Kubernetes cluster and perform any CRUD operation on it. So, anything like getting/updating/deleting/creating a resource will be done through the methods already available under client.Client.
  • Scheme: This attribute is used to register the struct/type of PostgresWriter to its corresponding GVK (Group: demo.yash.com, Version: v1, Kind: PostgresWriter) via the manager. It’s mainly used, behind the scenes, by different clients and caches to associate Go types/structs with their corresponding GVKs.

Now, proceed further and you’ll notice the Reconcile() method. This is the method which will contain the source code behind the reconciliation loop of our operator. This is where we’ll program our operator to run the PostgresDB queries whenever required. This method gets triggered whenever anything like create/update/delete happens with the PostgresWriter custom resource.

But who/what ensures that this method is called whenever something happens with a PostgresWriter resource in our cluster?
That’s where the SetupWithManager method comes in.

SetupWithManager(mgr ctrl.Manager): Look carefully at how this is defined in the code. It is almost self-explanatory. It ensures that whenever anything happens with a PostgresWriter resource (For(&demov1.PostgresWriter{})), the Reconcile() method of our operator/PostgresWriterReconciler is called (Complete(r), where r is *PostgresWriterReconciler).
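If the wiring still feels abstract, the toy model below may help. To be clear, this is not controller-runtime code. toyManager, register and dispatch are names I made up to imitate the idea of "for this kind, call that reconciler". The one detail it shares with the real thing is that the reconciler only receives a namespace/name request, not the type of event that triggered it.

```go
package main

// request identifies the object to reconcile by namespace and name.
type request struct {
	Namespace string
	Name      string
}

// reconciler is anything that has a Reconcile method.
type reconciler interface {
	Reconcile(req request) error
}

// toyManager maps a resource kind to the reconciler registered for it.
type toyManager struct {
	handlers map[string]reconciler
}

// register plays the role of For(kind) followed by Complete(r).
func (m *toyManager) register(kind string, r reconciler) {
	if m.handlers == nil {
		m.handlers = map[string]reconciler{}
	}
	m.handlers[kind] = r
}

// dispatch is invoked for every event on an object of the given kind.
// The event type is deliberately not forwarded to the reconciler.
func (m *toyManager) dispatch(kind, event string, req request) error {
	r, ok := m.handlers[kind]
	if !ok {
		return nil // nobody watches this kind
	}
	return r.Reconcile(req)
}

// recorder is a reconciler that just remembers the requests it received.
type recorder struct {
	seen []request
}

func (r *recorder) Reconcile(req request) error {
	r.seen = append(r.seen, req)
	return nil
}
```

Dispatching a create, an update and a delete for the same object leaves three identical requests in the recorder, which shows that the reconciler cannot tell those events apart from the request alone.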

We’re done with controllers/postgreswriter_controller.go, so let’s dive into a small section of main.go

See, our controller/operator (PostgresWriterReconciler) is getting instantiated and set up with the manager.

So, whenever the manager starts, our operator will be spun up as a part of it as well.

Strategy to set up the PostgresDB Client with our operator

From an efficiency perspective, what we want is that, for one instance of our operator, only one instance of the PostgresDBClient object is created and used to handle all the communication with our PostgresDB, instead of blindly creating a PostgresDBClient and connection for every reconciliation loop.
So, let’s begin by modifying the bootstrapped operator’s code so that one Postgres client is instantiated whenever the operator gets instantiated.

There’s a whole lot of further tuning you can perform for Postgres in your code, such as managing the connection pool, connection timeouts, idle timeouts, etc. But we won’t go down that rabbit hole. This article focuses on Kubernetes operator dev :)

Modifying our bootstrapped controller/operator

We’ll be making some modifications to the above two files to make this operator fit our use case. We want our operator to be tied to one single PostgresDB connection whenever it’s spun up. We don’t want it to create/destroy/re-create a Postgres DB connection every time it reconciles something.

First of all, let’s go to controllers/postgreswriter_controller.go and edit the PostgresWriterReconciler struct to also hold a PostgresDBClient object as an attribute, which will be instantiated only once, when our operator starts. This ensures that whenever the reconciler of our operator (controller) runs, it uses the existing PostgresDBClient object to talk to our PostgresDB host, instead of creating a new PostgresDBClient object and connection every time, which can lead to slower response times and potential memory leaks.

Now, let’s go back to main.go where our operator is instantiated and let’s augment it to define and attach the PostgresDBClient to our operator’s (PostgresWriterReconciler’s) instance.

Firstly, let’s parse the Postgres related config from environment variables and capture them.
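A small sketch of that parsing step is below. The environment variable names are the ones exported later in this article (PSQL_HOST, PSQL_PORT, PSQL_DBNAME, PSQL_USER, PSQL_PASSWORD), while the postgresConfig type and the function names are hypothetical. The lookup function is passed in as a parameter so that the parsing logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// postgresConfig groups the settings read from the environment.
type postgresConfig struct {
	Host     string
	Port     int
	DBName   string
	User     string
	Password string
}

// loadPostgresConfig reads all PSQL_* settings through the given lookup
// function and fails fast when one of them is missing or malformed.
func loadPostgresConfig(lookup func(string) (string, bool)) (postgresConfig, error) {
	values := map[string]string{}
	keys := []string{"PSQL_HOST", "PSQL_PORT", "PSQL_DBNAME", "PSQL_USER", "PSQL_PASSWORD"}
	for _, key := range keys {
		v, ok := lookup(key)
		if !ok || v == "" {
			return postgresConfig{}, fmt.Errorf("environment variable %s is not set", key)
		}
		values[key] = v
	}
	port, err := strconv.Atoi(values["PSQL_PORT"])
	if err != nil {
		return postgresConfig{}, fmt.Errorf("PSQL_PORT must be a number: %w", err)
	}
	return postgresConfig{
		Host:     values["PSQL_HOST"],
		Port:     port,
		DBName:   values["PSQL_DBNAME"],
		User:     values["PSQL_USER"],
		Password: values["PSQL_PASSWORD"],
	}, nil
}

// loadPostgresConfigFromEnv is the variant main.go would call.
func loadPostgresConfigFromEnv() (postgresConfig, error) {
	return loadPostgresConfig(os.LookupEnv)
}
```

Failing at startup on a missing variable is a deliberate choice here: it is easier to debug than an operator that starts fine and then fails on its first reconcile.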

Now, hop to the section in the main() function where our PostgresWriterReconciler is instantiated and set up with the manager. We will edit it to contain the PostgresDBClient object as well.

Phew! The messy part is done!
Now, let’s code the main chunk of our operator :D

Coding our operator’s heart (or brain?)

Before we move on, I would like to mention again that in this article, the part of the operator which I am about to code will only deal with running the INSERT query on our PostgresDB. Capturing “Create” and “Delete” events separately and running different pieces of code for them would require me to explain the concepts of predicates, finalizers and attaching multiple reconcilers, and I think this article must already be feeling quite heavy for you folks :P

But don’t worry, I got your back. I will be covering all of that in full depth in the next article of this series :)

So, if you look at the controllers/postgreswriter_controller.go file and at the code under its SetupWithManager method, you’ll notice that it ensures that whenever anything happens with a PostgresWriter resource, including it being created, the Reconcile() method of PostgresWriterReconciler (our operator) will be called.

Awesome! Now, with that ensured, let’s just write the Reconcile() method with the functionality which needs to be performed whenever a PostgresWriter resource gets created, i.e.

  • Fetch the table, name, age and country fields from the incoming PostgresWriter resource.
  • Form a unique id corresponding to that incoming resource: the namespace/name of that resource.
  • Send the above variables to the PostgresDBClient.Insert() method to run the INSERT query on our DB.
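The three steps above can be sketched without any Kubernetes dependency as follows. Everything here is a simplified stand-in: the real Reconcile() receives a context and a request and first fetches the resource through client.Client, whereas this sketch is handed the object directly. The Insert signature is my assumption, and memoryDB is a hypothetical in-memory replacement for the Postgres client that rejects duplicate ids the way a primary key would.

```go
package main

import "fmt"

// postgresWriterSpec and postgresWriter are simplified stand-ins for the
// custom resource types.
type postgresWriterSpec struct {
	Table   string
	Name    string
	Age     int
	Country string
}

type postgresWriter struct {
	Namespace string
	Name      string
	Spec      postgresWriterSpec
}

// inserter is the only capability this sketch needs from the DB client.
// Assumption: the real Insert method may take different arguments.
type inserter interface {
	Insert(id, table, name string, age int, country string) error
}

// reconcilerSketch holds the DB client that is created once at startup.
type reconcilerSketch struct {
	db inserter
}

// reconcile performs the three steps for one incoming resource.
func (r *reconcilerSketch) reconcile(obj postgresWriter) error {
	spec := obj.Spec                     // step 1: read table, name, age, country
	id := obj.Namespace + "/" + obj.Name // step 2: form the unique id
	// step 3: hand everything to the DB client
	if err := r.db.Insert(id, spec.Table, spec.Name, spec.Age, spec.Country); err != nil {
		return fmt.Errorf("inserting row %q into %q: %w", id, spec.Table, err)
	}
	return nil
}

// memoryDB is an in-memory inserter that rejects duplicate ids.
type memoryDB struct {
	rows map[string]string
}

func (m *memoryDB) Insert(id, table, name string, age int, country string) error {
	if m.rows == nil {
		m.rows = map[string]string{}
	}
	if _, exists := m.rows[id]; exists {
		return fmt.Errorf("duplicate key %q", id)
	}
	m.rows[id] = fmt.Sprintf("%s|%s|%d|%s", table, name, age, country)
	return nil
}
```

Running reconcile once for default/sample-student stores one row. Running it a second time for the same resource returns a duplicate key error in this sketch, because the code always inserts regardless of what kind of event triggered it.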

I know, I know that I made a blunder here

I know that the way the SetupWithManager and Reconcile methods are coded currently, our operator will blindly capture any sort of event (create, update or delete) on a PostgresWriter resource and run the Reconcile() method in response, which currently runs the INSERT operation only. I am aware that this is absolutely horrible, because the same INSERT operation will run even on updates and deletes of any PostgresWriter resource.

Ideally, our operator should be smart enough to capture only the “Create” event of a PostgresWriter resource and run the INSERT query in our DB. And it should be smart enough to recognize the “Delete” event of a PostgresWriter resource and run a DELETE query in our DB accordingly, to keep things consistent between our cluster’s PostgresWriter resources and our Postgres DB (as I mentioned in the beginning of this article).

But implementing these capabilities would involve dealing with predicates, finalizers and attaching multiple reconcilers. And as I mentioned in the previous section, I’d love to dedicate the next article to that (and release it super soon). So, do stay tuned for it, it’s gonna be really fun :D

Let’s run it 🤞

First of all, set up all the environment variables in accordance with your Postgres DB’s config, such as the host, port, DB name, username and password.

In my case, I have hosted a PostgresDB called postgres on 54.166.146.81, which can be accessed with the user postgres and the password password (secure, ikr!)

export PSQL_HOST=54.166.146.81
export PSQL_PORT=5432
export PSQL_DBNAME=postgres
export PSQL_USER=postgres
export PSQL_PASSWORD=password

Now, although we have run the following command before, run it again (just in case you missed it) to regenerate the CRDs from the latest state of our code.

make manifests

Now, let’s deploy our CRDs.

make install

Finally, let’s run our operator.

make run

You will start seeing logs like:

To test, let’s write the following sample.yaml

apiVersion: demo.yash.com/v1
kind: PostgresWriter
metadata:
  name: sample-student
  namespace: default
spec:
  table: students
  name: alex
  age: 23
  country: canada

Upon kubectl apply-ing this resource YAML, our operator’s Reconcile() method gets triggered and inserts a row in our Postgres DB, in the table students, with the id default/sample-student and the provided name, age and country.

So, before applying the above sample.yaml, this is how the students table in my DB looks:

no rows in “students” table

Now, I’ll run

kubectl apply -f sample.yaml

Now, if I check my DB

The row inserted by our operator is visible with the right values

Yay! It works!

End note

Thanks a lot for making it this far. I know this was a long article, but well, that’s Kubernetes Operator dev :’(

In the next article, we will complete the development of the “PostgresWriter” operator by adding the capability to capture and handle the “Delete” events of PostgresWriter resources as well. We will explore the concepts of finalizers, predicates and attaching multiple reconcilers to one controller/operator.

I really hope you understood this article. If not, please feel free to raise any questions/doubts or even reach out to me on any of my social media handles.

Twitter — https://twitter.com/yashkukreja98

Github — https://github.com/yashvardhan-kukreja

LinkedIn — https://www.linkedin.com/in/yashvardhan-kukreja

Until then,
Adios!

Yashvardhan Kukreja

Software Engineer @ Red Hat | Masters @ University of Waterloo | Contributing to Openshift Backend and cloud-native OSS