Develop on Kubernetes Series — Operator Dev Finale — Dive into the Finalizers
--
Hi folks! In the last article, we wrote a very basic operator which blindly watched all the events happening with its custom resource and reconciled every one of them as if it were a “create” event. As a result, it never gracefully handled the “delete” events associated with our custom resource.
In this article, we’ll see how we can gracefully control the deletion lifecycle of our custom resource by using “Finalizers” in our reconciliation loop to take the necessary pre-delete steps whenever our custom resource gets deleted.
So let’s dive in!
A quick walkthrough of our operator
Feel absolutely free to skip this section if you have already gone through the last article :)
So the idea was to create an operator which watched a custom resource called PostgresWriter and, depending on the events which happened with this custom resource (did it get created? did it get deleted?), inserted/deleted rows in a Postgres DB sitting somewhere far out there.
Our operator, at the time of startup, would be fed with Postgres DB’s Host IP, DbName, username, password, etc. so that it could perform any CRUD operation on that PostgresDB depending on the use-case.
For example, when a PostgresWriter resource is applied/created in a cluster, a new row is INSERT-ed into the students table of that Postgres DB with the values alex, 23 and canada for the columns name, age and country respectively.
Also, whenever this PostgresWriter resource is deleted from our cluster, the row corresponding to it in the Postgres DB is DELETE-ed accordingly, to keep the PostgresWriter resources and the rows in our Postgres DB consistent with each other.
What did we do in the last article?
Feel absolutely free to skip this section if you have already gone through the last article :)
We wrote the first prototype of our operator which was actually pretty dumb :P
The operator just watched over every event happening with any PostgresWriter resource in the cluster and triggered the Reconcile() method for every such event. The Reconcile() method just performed an INSERT row operation on our Postgres DB.
So basically, our operator was INSERT-ing a row in the Postgres DB for every event happening with any PostgresWriter resource in the cluster.
Well, that’s bad!
What do we want then?
Ideally, we want our operator to also synchronize the “deletes” of the PostgresWriter custom resource. Meaning that whenever a PostgresWriter custom resource is deleted, the row corresponding to it in our Postgres DB is deleted too, so as to keep the rows in our DB and the PostgresWriter resources in our cluster consistent with each other.
But Yash! That sounds pretty easy. Whenever the reconcile function is called due to a delete event associated with a PostgresWriter resource, just run a DELETE query on your DB! Ez!
Well, it’s not that straightforward, at least in our situation.
To delete a row in the DB, we need to know two things:
- the namespace/name combination of the PostgresWriter resource getting deleted because, if you remember, the namespace/name of a PostgresWriter resource acts like the primary id of the row corresponding to it in our Postgres DB.
- the spec.table field of the PostgresWriter resource getting deleted because, obviously, we need to know the name of the table from which the row is supposed to be deleted.
The first requirement is easily met by the req ctrl.Request parameter of the Reconcile(ctx context.Context, req ctrl.Request) method in our postgreswriter_controller.go file, because req contains the name and namespace of the resource for which Reconcile() is called.
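To make the first requirement concrete, here is a tiny sketch of deriving the row id from the request. The request struct is a hypothetical stand-in for ctrl.Request so that the snippet runs with the standard library alone, and rowID is a name made up for illustration. In the real controller, req.NamespacedName.String() yields the same namespace/name form.

```go
package main

// request is a hypothetical stand-in for ctrl.Request, reduced to the two
// fields this article cares about.
type request struct {
	Namespace string
	Name      string
}

// rowID builds the namespace/name key that acts as the primary id of the
// row belonging to a PostgresWriter resource.
func rowID(req request) string {
	return req.Namespace + "/" + req.Name
}
```

So a resource named sample-student (an example name) in the default namespace maps to the row id default/sample-student.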
The problem occurs with fetching spec.table because, if you think about it, Reconcile() gets triggered “after” the PostgresWriter resource is deleted, which means that we can’t simply get the spec.table field anymore because it went away with the deleted PostgresWriter resource.
So, we want to capture a field of a resource just before it’s about to get deleted.
But how can we make our operator do that when it kicks in the Reconcile() method only “after” an event (like deletion) happens with our PostgresWriter resource?
DeletionTimestamp!
When you run the kubectl delete command for a resource in Kubernetes, the resource doesn’t necessarily get deleted instantly. If something still has to happen before the deletion (which is exactly what finalizers, covered next, are for), the first step is actually an Update event where that resource’s metadata.deletionTimestamp gets set to the current time (by the kube-apiserver). A non-nil metadata.deletionTimestamp on any Kubernetes resource indicates that the resource is scheduled to be deleted.
And this Update event can be captured and recognized by a controller/operator, which can then take its pre-delete steps and allow the object to continue getting deleted. It’s that simple.
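Here is a minimal sketch of how a controller could tell such an Update event apart from a regular one. The objectMeta struct and both helper names are hypothetical stand-ins, trimmed down so that the snippet runs with the standard library alone. On a real Kubernetes object, the equivalent check is usually written as obj.GetDeletionTimestamp().IsZero().

```go
package main

import "time"

// objectMeta is a hypothetical, trimmed-down stand-in for the metadata of a
// Kubernetes object; only the field relevant to deletion is kept.
type objectMeta struct {
	Name              string
	DeletionTimestamp *time.Time // nil until a delete has been requested
}

// isBeingDeleted reports whether a delete has been requested for the object.
func isBeingDeleted(m objectMeta) bool {
	return m.DeletionTimestamp != nil && !m.DeletionTimestamp.IsZero()
}

// markForDeletion mimics the first step of a delete: the object is stamped
// with the current time instead of being removed. It returns a stamped copy
// and leaves the original untouched.
func markForDeletion(m objectMeta) objectMeta {
	now := time.Now()
	m.DeletionTimestamp = &now
	return m
}
```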
But how can we indicate to our operator which pre-delete tasks to do and when to do them?
Enter the finalizers!
Finalizers are, basically, a slice of strings defined under the metadata of a resource in Kubernetes. Each string/element in the finalizers slice corresponds to a pre-delete task to be performed by the controller/operator watching that resource before the deletion of that resource gets executed.
In our case, every PostgresWriter resource’s finalizers slice will contain one element, namely finalizers.postgreswriters.demo.yash.com/cleanup-row, and this finalizer will correspond to the task of DELETE-ing the row corresponding to this resource from our Postgres DB.
By the way, you can choose any other name for this finalizer as well, such as yeet-row or anything else.
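To see why a finalizer holds back a deletion, here is a toy, in-memory model of that behaviour. It is only a simplified sketch: the store type and its methods are hypothetical and merely imitate what the API server does, they are not its real implementation.

```go
package main

import "time"

// resource is a hypothetical stand-in for a Kubernetes object's metadata.
type resource struct {
	Finalizers        []string
	DeletionTimestamp *time.Time
}

// store is a toy in-memory stand-in for the API server's storage,
// keyed by namespace/name.
type store map[string]*resource

// requestDelete imitates a delete request: with finalizers pending, the
// object is only stamped; without any, it is removed right away.
func (s store) requestDelete(key string) {
	r, ok := s[key]
	if !ok {
		return
	}
	if len(r.Finalizers) == 0 {
		delete(s, key)
		return
	}
	if r.DeletionTimestamp == nil {
		now := time.Now()
		r.DeletionTimestamp = &now
	}
}

// removeFinalizer drops one finalizer. Once none remain on an object that
// is already marked for deletion, the object finally disappears.
func (s store) removeFinalizer(key, name string) {
	r, ok := s[key]
	if !ok {
		return
	}
	kept := r.Finalizers[:0]
	for _, f := range r.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	r.Finalizers = kept
	if r.DeletionTimestamp != nil && len(r.Finalizers) == 0 {
		delete(s, key)
	}
}
```

The point of the model: the resource stays readable (spec.table included) between the delete request and the removal of the last finalizer, and that window is where the pre-delete task runs.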
And we will program our operator in such a way that whenever it captures and recognizes a PostgresWriter with a non-nil metadata.deletionTimestamp, it simply starts executing the pre-delete task of DELETE-ing the row from the Postgres DB and cleaning up the finalizers.
In all the other cases/events, it will ensure that the PostgresWriter has only the right finalizers and no gibberish defined under it (as someone might add that mischievously and break the deletion process).
So, let’s begin with the code!
First, let’s define the slice of finalizers globally in the postgreswriter_controller.go file.
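A minimal sketch of what that global definition could look like. The identifier names cleanupRowFinalizer and finalizers are assumptions made for illustration, and the package clause is main only so that the snippet stands alone; only the finalizer string itself comes from this article.

```go
package main

// cleanupRowFinalizer is the finalizer name used throughout this article.
// The constant's identifier is an assumption made for this sketch.
const cleanupRowFinalizer = "finalizers.postgreswriters.demo.yash.com/cleanup-row"

// finalizers is the complete set of finalizers every PostgresWriter
// resource is expected to carry.
var finalizers = []string{cleanupRowFinalizer}
```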
Let’s proceed with defining a method called cleanupRowFinalizerCallback() which will actually contain the piece of code to perform the DELETE query on our Postgres DB and clean up the finalizer.
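The following is only a sketch of the idea behind such a callback, not the exact code from the project. The rowDeleter interface, the fakeDB type and the trimmed-down postgresWriter struct are all hypothetical stand-ins, so that the snippet runs without a cluster or a database. In the real controller, the modified resource would additionally have to be written back with the client's Update call.

```go
package main

import "fmt"

const cleanupRowFinalizer = "finalizers.postgreswriters.demo.yash.com/cleanup-row"

// rowDeleter is a hypothetical abstraction over the Postgres client.
type rowDeleter interface {
	DeleteRow(table, id string) error
}

// postgresWriter is a hypothetical, trimmed-down stand-in for the resource.
type postgresWriter struct {
	Namespace  string
	Name       string
	Table      string // corresponds to spec.table
	Finalizers []string
}

// cleanupRowFinalizerCallback deletes the row belonging to pw and, only if
// that succeeded, removes the cleanup finalizer from pw.
func cleanupRowFinalizerCallback(db rowDeleter, pw *postgresWriter) error {
	id := pw.Namespace + "/" + pw.Name
	if err := db.DeleteRow(pw.Table, id); err != nil {
		// Keep the finalizer so that the deletion is retried later.
		return fmt.Errorf("deleting row %q from table %q: %w", id, pw.Table, err)
	}
	kept := make([]string, 0, len(pw.Finalizers))
	for _, f := range pw.Finalizers {
		if f != cleanupRowFinalizer {
			kept = append(kept, f)
		}
	}
	pw.Finalizers = kept
	return nil
}

// fakeDB is an in-memory stand-in for the database: table -> row id -> present.
type fakeDB map[string]map[string]bool

// DeleteRow removes the row if the table exists and errors out otherwise.
func (f fakeDB) DeleteRow(table, id string) error {
	rows, ok := f[table]
	if !ok {
		return fmt.Errorf("unknown table %q", table)
	}
	delete(rows, id)
	return nil
}
```

The ordering is the important design choice here: the finalizer is dropped only after the DELETE succeeded, so a failed query leaves the resource around for another attempt instead of orphaning the row.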
Now, moving on to our Reconcile() method, where we will make use of the above method.
Firstly, we’ll define an “if” block to ensure that if the event is NOT related to a “delete event” of a PostgresWriter resource (when metadata.deletionTimestamp is nil/zero), then the corresponding resource has only the right set of finalizers under it.
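A rough sketch of the check behind this first “if” block. The helper names sameFinalizers and ensureFinalizers are made up for illustration, and plain string slices are used instead of the real resource type. In the actual Reconcile(), a changed result would be followed by updating the resource through the client.

```go
package main

// finalizers is the set every PostgresWriter is expected to carry.
var finalizers = []string{"finalizers.postgreswriters.demo.yash.com/cleanup-row"}

// sameFinalizers reports whether a and b hold the same names in the same order.
func sameFinalizers(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// ensureFinalizers returns the finalizer list the resource should carry and
// whether it differs from the current one (that is, whether an update is needed).
func ensureFinalizers(current []string) (want []string, changed bool) {
	if sameFinalizers(current, finalizers) {
		return current, false
	}
	// Return a copy so that callers can never modify the global slice.
	return append([]string(nil), finalizers...), true
}
```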
Secondly, we’ll define another “if” block to check whether the event is related to a “delete event” of a PostgresWriter resource (when metadata.deletionTimestamp is non-nil) and, if so, make use of the cleanupRowFinalizerCallback() method we defined above to take the pre-delete steps and proceed further.
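Putting the two “if” blocks together, the branching inside Reconcile() could be sketched roughly like this. Everything here (the step type, nextStep, markDeleted and the trimmed-down postgresWriter struct) is a hypothetical illustration that only decides which branch applies. The real method would go on to call the client and the cleanup callback.

```go
package main

import "time"

const cleanupRowFinalizer = "finalizers.postgreswriters.demo.yash.com/cleanup-row"

// step names the branch of the reconcile flow that applies to a resource.
type step string

const (
	stepEnsureFinalizers step = "ensure-finalizers" // not being deleted
	stepCleanupRow       step = "cleanup-row"       // being deleted, finalizer pending
	stepNothingLeft      step = "nothing-left"      // being deleted, cleanup already done
)

// postgresWriter is a hypothetical, trimmed-down stand-in for the resource.
type postgresWriter struct {
	DeletionTimestamp *time.Time
	Finalizers        []string
}

// nextStep mirrors the two "if" blocks described in the article.
func nextStep(pw postgresWriter) step {
	if pw.DeletionTimestamp == nil || pw.DeletionTimestamp.IsZero() {
		return stepEnsureFinalizers
	}
	for _, f := range pw.Finalizers {
		if f == cleanupRowFinalizer {
			return stepCleanupRow
		}
	}
	return stepNothingLeft
}

// markDeleted returns a copy of pw stamped as if a delete had been requested.
func markDeleted(pw postgresWriter) postgresWriter {
	now := time.Now()
	pw.DeletionTimestamp = &now
	return pw
}
```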
And that’s it! Your deletion flow is ready.
Let’s look at it in action
Firstly, let’s start the operator
Initial state of the DB and the students table
Open another terminal session and apply the following PostgresWriter yaml manifest
Now, let’s proceed with deleting the resource
Look at the DB now, the row got deleted as well! Yay!
That’s it!
Wanna check out the entire project?
Head over to https://github.com/yashvardhan-kukreja/postgres-writer-operator to check out the whole code associated with this PostgresWriter operator.
Note: The operator we wrote currently has a lot of flaws around idempotency, testing, packaging and deployment. But this series was always meant to lay the beginner-friendly foundations of Kubernetes operator dev without going down the rabbit holes of more advanced topics. Plus, this series was getting super long anyways :P
Nonetheless, I look forward to dedicating totally separate individual articles apart from this series to these advanced topics, so, no worries :)
What next?
Thanks a lot for reading this far! This is the end of the last part of the “operator dev” series. But is that it with Operator dev? Heck no! Kubernetes Operator Dev is an extremely in-depth field in itself and I have seen even senior engineers say they are not pros at it. The possibilities for learning more in this area are countless. This series was meant to give you a fundamental push in the area of Kubernetes operator dev.
From here, you can explore the more advanced aspects of Kubernetes operator dev such as controller testing with envtest, setting up a multi-version API with the Hub & Spoke model, status reporting in operators, and so much more!
And I will keep coming up with new articles covering a bunch of topics (including the above) around Kubernetes Dev. So, stay tuned!
In case of any doubts around this article, feel free to reach out to me on my social media handles
Twitter — https://twitter.com/yashkukreja98
Github — https://github.com/yashvardhan-kukreja
LinkedIn — https://www.linkedin.com/in/yashvardhan-kukreja (expect slow replies here as I mostly stay inactive on it :P)
Until next time,
Adios!