Develop on Kubernetes Series — Operator Dev Finale — Dive into the Finalizers

Hi folks! In the last article, we wrote a very basic operator that blindly watched every event happening with its custom resource and reconciled each of them as if it were a “create” event. As a result, it never handled the “delete” events associated with our custom resource gracefully.

In this article, we’ll see how we can gracefully control the deletion lifecycle of our custom resource by using “Finalizers” in our reconciliation loop to take the necessary pre-delete steps whenever our custom resource gets deleted.

So let’s dive in!


A quick walkthrough of our operator

Feel absolutely free to skip this section if you have already gone through the last article :)

So the idea was to create an operator that watches a custom resource called PostgresWriter and, depending on the events happening with this custom resource (did it get created? did it get deleted?), inserts or deletes rows in a Postgres DB sitting somewhere far out there.
At startup, our operator is fed the Postgres DB’s host IP, DB name, username, password, etc. so that it can perform any CRUD operation on that Postgres DB depending on the use case.

For example, when a PostgresWriter resource with the table students and the values alex, 23 and canada is applied/created in a cluster, a new row is INSERT-ed into the students table of that Postgres DB with the values alex, 23 and canada for the columns name, age and country respectively.

Also, whenever this PostgresWriter resource is deleted from our cluster, the row corresponding to it in the Postgres DB is DELETE-d accordingly, to keep the PostgresWriter resources and the rows in our Postgres DB consistent with each other.
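To make the example concrete, here is a minimal sketch of what such a resource’s spec could look like as Go types. The field names Table, Name, Age and Country are assumptions inferred from the columns mentioned above (only spec.table is named explicitly later in this article), and the sketch uses a plain struct rather than the real Kubebuilder-generated types.

```go
package main

// PostgresWriterSpec is a hypothetical stand-in for the spec of the
// PostgresWriter custom resource. The field names are assumed for
// illustration and are not copied from the actual project.
type PostgresWriterSpec struct {
	Table   string `json:"table"`
	Name    string `json:"name"`
	Age     int    `json:"age"`
	Country string `json:"country"`
}

// exampleSpec returns the spec described in the text: a row for the
// students table holding alex, 23 and canada.
func exampleSpec() PostgresWriterSpec {
	return PostgresWriterSpec{
		Table:   "students",
		Name:    "alex",
		Age:     23,
		Country: "canada",
	}
}
```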

What did we do in the last article?

Feel absolutely free to skip this section if you have already gone through the last article :)

We wrote the first prototype of our operator which was actually pretty dumb :P
The operator just watched over every event happening with any PostgresWriter resource in the cluster and just triggered the Reconcile() method for every such event. The Reconcile() method just performed an INSERT row operation on our Postgres DB.

So basically, our operator was INSERT-ing a row in the Postgres DB for every event happening with any PostgresWriter resource in the cluster.

Well, that’s bad!

What do we want then?

Ideally, we want our operator to synchronize the “deletes” of the PostgresWriter custom resource as well. That is, whenever a PostgresWriter custom resource is deleted, the row corresponding to it in our Postgres DB should be deleted too, so that the rows in our DB and the PostgresWriter resources in our cluster stay consistent with each other.

But Yash! That sounds pretty easy. Whenever the reconcile function is called due to a delete event associated with a PostgresWriter resource, just run a DELETE query on your DB! Ez!

Well, it’s not that straightforward, at least in our situation.

To delete a row in the DB, we need to know two things:

  • the namespace/name combination of the PostgresWriter resource getting deleted because, if you remember, the namespace/name of a PostgresWriter resource acts as the primary ID of the row corresponding to it in our Postgres DB.
  • the spec.table field of the PostgresWriter resource getting deleted because, obviously, we need to know the name of the table from which the row is supposed to be deleted.

The first requirement is easily met with the req ctrl.Request parameter of the Reconcile(ctx context.Context, req ctrl.Request) method in our postgreswriter_controller.go file, because req contains the name and namespace of the resource for which Reconcile() is called.

The problem lies in fetching spec.table. If you think about it, Reconcile() gets triggered “after” the PostgresWriter resource is deleted, which means that we can’t simply read the spec.table field anymore because it went away with the deleted PostgresWriter resource.

So, we want to capture a field of a resource just before it’s about to get deleted.
But how can we make our operator do so when it kicks off the Reconcile() method only “after” an event (like deletion) happens with our PostgresWriter resource?
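As a rough illustration of that first requirement, the sketch below builds the namespace/name key from a request. The namespacedName type and the rowID helper are hypothetical stand-ins written with plain Go so the snippet stays self-contained. They are not the operator’s actual code.

```go
package main

// namespacedName is a simplified stand-in for the namespace/name pair that
// controller-runtime carries in ctrl.Request (as req.NamespacedName).
type namespacedName struct {
	Namespace string
	Name      string
}

// rowID builds the "namespace/name" key that identifies the row belonging
// to a PostgresWriter resource. It is a hypothetical helper.
func rowID(n namespacedName) string {
	return n.Namespace + "/" + n.Name
}
```

In a real controller, req.NamespacedName.String() produces a string in the same namespace/name format, so a helper like this is usually not needed.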

DeletionTimestamp!

When you run the kubectl delete command for a Kubernetes resource that carries finalizers (more on those in a moment), the resource doesn’t get deleted instantly. Instead, the first step is actually an Update event where that resource’s metadata.deletionTimestamp gets set to the current time (by the kube-apiserver). A non-nil metadata.deletionTimestamp on any Kubernetes resource indicates that the resource is marked for deletion.

This Update event can be captured and recognized by a controller/operator, which can then take its pre-delete steps and allow the object to continue getting deleted. It’s that simple.
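Here is a minimal sketch of that check. It uses a plain objectMeta struct as a simplified stand-in for the real metav1.ObjectMeta, and markForDeletion is a hypothetical helper that only mimics what the API server does, so treat it as an illustration rather than the operator’s code.

```go
package main

import "time"

// objectMeta mirrors only the two metadata fields this article cares about.
// It is a simplified stand-in for metav1.ObjectMeta.
type objectMeta struct {
	DeletionTimestamp *time.Time
	Finalizers        []string
}

// isBeingDeleted reports whether the object has been marked for deletion,
// i.e. whether deletionTimestamp has been set.
func isBeingDeleted(m objectMeta) bool {
	return m.DeletionTimestamp != nil && !m.DeletionTimestamp.IsZero()
}

// markForDeletion mimics what the API server does on delete when finalizers
// are present: it stamps deletionTimestamp instead of removing the object.
func markForDeletion(m objectMeta) objectMeta {
	now := time.Now()
	m.DeletionTimestamp = &now
	return m
}
```

With the real Kubernetes types, the same check is commonly written as obj.ObjectMeta.DeletionTimestamp.IsZero(), which also returns true for a nil timestamp.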

But how can we indicate to our operator which pre-delete tasks to do and when to do them?

Enter the finalizers!

Finalizers are, basically, a slice of strings defined under the metadata of a resource in Kubernetes. Each string/element in the finalizers slice corresponds to a pre-delete task to be performed by the controller/operator watching that resource before the deletion of that resource gets executed. Kubernetes only removes the resource once this slice is empty, so it is the controller’s job to remove each finalizer after finishing the corresponding task.

In our case, every PostgresWriter resource’s finalizers slice will contain one element, finalizers.postgreswriters.demo.yash.com/cleanup-row, and this finalizer will correspond to the task of DELETE-ing the row corresponding to this resource from our Postgres DB.

Look at metadata.finalizers

By the way, you can choose any other name for this finalizer as well, such as yeet-row, although a domain-qualified name like the one above is the recommended convention because it avoids clashes with finalizers owned by other controllers.

And we will programme our operator in such a way that whenever it captures and recognizes a PostgresWriter with a non-nil metadata.deletionTimestamp, it simply starts executing the pre-delete task of DELETE-ing the row from the Postgres DB and then cleans up the finalizers.

In all the other cases/events, it will ensure that the PostgresWriter has only the right finalizers and no gibberish defined under it (as someone might add that mischievously and break the deletion process).

So, let’s begin with the code!

First, let’s define the slice of finalizers globally in the postgreswriter_controller.go file.

Let’s proceed with defining a method called cleanupRowFinalizerCallback(), which will contain the code to perform the DELETE query on our Postgres DB and clean up the finalizer.
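A minimal sketch of such a global definition could look like the following. Only the finalizer string comes from this article. The identifiers cleanupRowFinalizer and finalizers are assumed names chosen for illustration.

```go
package main

// cleanupRowFinalizer is the finalizer name used throughout this article.
// The constant's identifier is an assumption.
const cleanupRowFinalizer = "finalizers.postgreswriters.demo.yash.com/cleanup-row"

// finalizers lists every finalizer the operator expects to find on a
// PostgresWriter resource. Right now that is a single entry.
var finalizers = []string{cleanupRowFinalizer}
```

Keeping the expected finalizers in one place means both the “ensure” and the “cleanup” paths of the reconciler can refer to the same source of truth.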

Now, moving on to our Reconcile() method where we will make use of the above method.

Firstly, we’ll define an “if” block to ensure that if the event is NOT related to a “delete event” of a PostgresWriter resource (when metadata.deletionTimestamp is nil/zero), then the corresponding resource has only the right set of finalizers under it.

Secondly, we’ll define another “if” block to check whether the event is related to a “delete event” of a PostgresWriter resource (when metadata.deletionTimestamp is non-nil) and, if so, use the cleanupRowFinalizerCallback() method we defined above to take the pre-delete steps and proceed further.

And that’s it! Your deletion flow is ready.
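A minimal sketch of that first check follows, written with plain slices so it runs on its own. The names expectedFinalizers, sameFinalizers and ensureFinalizers are hypothetical and only illustrate the idea of comparing the current finalizers against the one set the operator accepts.

```go
package main

// expectedFinalizers is the only set of finalizers the operator accepts.
var expectedFinalizers = []string{"finalizers.postgreswriters.demo.yash.com/cleanup-row"}

// sameFinalizers reports whether got holds exactly the same entries as want,
// in the same order.
func sameFinalizers(got, want []string) bool {
	if len(got) != len(want) {
		return false
	}
	for i := range got {
		if got[i] != want[i] {
			return false
		}
	}
	return true
}

// ensureFinalizers returns the finalizers the resource should carry and
// whether the resource needs an update to get there.
func ensureFinalizers(current []string) (desired []string, needsUpdate bool) {
	desired = append([]string(nil), expectedFinalizers...)
	return desired, !sameFinalizers(current, desired)
}
```

When needsUpdate is true, a real reconciler would set the desired finalizers on the resource and persist the change through the client’s update call. Skipping the update when nothing changed avoids triggering a pointless extra reconcile.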
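The delete branch can be sketched as below. This is a simplified model, not the operator’s real Reconcile(): resource is a plain stand-in for a PostgresWriter object, deleteRow stands in for the DB call, and newDeletingResource is a hypothetical helper that mimics a resource after kubectl delete has stamped its deletionTimestamp.

```go
package main

import "time"

const cleanupRowFinalizer = "finalizers.postgreswriters.demo.yash.com/cleanup-row"

// resource is a simplified stand-in for a PostgresWriter object.
type resource struct {
	DeletionTimestamp *time.Time
	Finalizers        []string
	Table             string
}

// newDeletingResource mimics a resource right after a delete request: the
// finalizer is still present and deletionTimestamp has been set.
func newDeletingResource(table string) *resource {
	now := time.Now()
	return &resource{
		DeletionTimestamp: &now,
		Finalizers:        []string{cleanupRowFinalizer},
		Table:             table,
	}
}

// reconcileDelete handles the "delete" branch. It reports false when the
// resource is not being deleted, so the caller can continue as usual.
func reconcileDelete(r *resource, deleteRow func(table string) error) (bool, error) {
	if r.DeletionTimestamp == nil || r.DeletionTimestamp.IsZero() {
		return false, nil // not a delete event
	}
	kept := make([]string, 0, len(r.Finalizers))
	found := false
	for _, f := range r.Finalizers {
		if f == cleanupRowFinalizer {
			found = true
			continue
		}
		kept = append(kept, f)
	}
	if !found {
		return true, nil // cleanup already happened, nothing left to do
	}
	if err := deleteRow(r.Table); err != nil {
		return true, err // finalizer stays, so the cleanup is retried
	}
	r.Finalizers = kept
	return true, nil
}
```

Note that spec.table is still readable here precisely because the finalizer keeps the resource around until the cleanup is done, which is the whole point of this approach.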

Let’s look at it in action

Firstly, let’s start the operator

Initial state of the DB and the students table

Open another terminal session and apply the following PostgresWriter yaml manifest

YAML to apply
kubectl apply-ing the resource and checking it
PostgresDB “students” table after kubectl apply-ing the above resource

Now, let’s proceed with deleting the resource

kubectl delete-ing the resource
logs corresponding to the deletion and finalizers

Look at the DB now, the row got deleted as well! Yay!

That’s it!

Wanna check out the entire project?

Head over to https://github.com/yashvardhan-kukreja/postgres-writer-operator to check out the whole code associated with this PostgresWriter operator.

Note: The operator we wrote currently has a lot of flaws around idempotency, testing, packaging and deployment. But this series was always meant to lay the beginner-friendly foundations of Kubernetes operator dev without going down the rabbit holes of more advanced topics, plus this series was getting super long anyway :P
Nonetheless, I look forward to dedicating separate articles, apart from this series, to these advanced topics, so no worries :)

What next?

Thanks a lot for making it this far! This is the end of the last part of the “operator dev” series. But is that it for operator dev? Heck no! Kubernetes operator dev is an extremely deep field in itself, and I have seen even senior engineers say they are not pros at it. The possibilities for learning more in this area are countless. This series was meant to give you a fundamental push in the area of Kubernetes operator dev.

From here, you can proceed with exploring the more advanced aspects of Kubernetes operator dev, such as controller testing with envtest, setting up a multi-version API with the Hub & Spoke model, status reporting in operators, and so much more!

And I will keep coming up with new articles covering a bunch of topics (including the above) around Kubernetes dev. So, stay tuned!

In case of any questions about this article, feel free to reach out to me on my social media handles:

Twitter — https://twitter.com/yashkukreja98

Github — https://github.com/yashvardhan-kukreja

LinkedIn — https://www.linkedin.com/in/yashvardhan-kukreja (expect slow replies here as I mostly stay inactive on it :P)

Until next time,
Adios!
