Day 8 — CS Fundamentals December — CAP Theorem

Initially, there were centralized systems where one computer performed processing for one dedicated task.

There came a time when it was required to move to centralized systems to distributed systems because there is a limit to which you can make centralized system powerful.

But you can achieve immensely higher processing power with a distributed system at the cost of slight and negligible network latency.

Distributed system is a network of multiple computers which are meant to accomplish a task or a bunch of task with their aggregated and combined processing power.

So, distributed systems came in to provide reliability. But guess what, like everything in life, if something provides a greater amount of reliability, it has to sacrifice or sabotage on some certain considerable things.

In case of designing distributed systems, the sacrifices and trade offs required to be made are highlighted by the CAP Theorem and it is an extremely important theorem to consider whenever you step into designing distributed systems.

Let’s dive in!

So, a distributed system is basically a network of computers placed separately but communicating and coordinating tasks and data between them so as to solve a shared problem.

So, while designing a distributed system, the three pillars or fundamental promises to consider for usage of it are:

  • Consistency — Basically, if you WRITE to computer 1 in the network and then, read from computer 2 in the network, the computer 2 will also return the newly WRITTEN value on computer 1. This means that computer 1 is consistent with computer 2.
  • Availability — Basically, availability means that if you talk/comunicate/request to a running computer in the network (distributed system), it WILL respond, meaning that it is AVAILABLE.
  • Partition Tolerance — Partition Tolerance means if the network is partitioned, then the other promises/pillars will still persist and work. The partitions can be individual computers, laptops and even data center.

For example, if you developed the system so as to maintain consistency and then, you apply partition tolerance then, this would mean that even if the network between computers is partitioned, the system will still stay consistent amongst the different partitions.

What is CAP Theorem then?

CAP Theorem says that while designing a distributed system, you have sacrifice atleast one of the promises or pillars i.e. Consistency, Availability and Partition Tolerance. Meaning that you can have only a max. of two of those pillars for your distributed system.

Proof

Consider two computers in a distributed system — Computer 1 and Computer 2 both of them having a variables x = 0 and x = 0 in their own.

For Consistency and Availability but NOT Partition Tolerance

They are connected through a network.

Here, let’s say, a WRITE happens to computer 1 and x becomes 1 (x=1) in computer 1.

Now, through the network, the computer 1 can communicate and tell computer 2 to update its own x =0 to x = 1 for consistency.

So see, due to the network/connection being present, they were able to communicate instantly maintain consistency and were able to stay available throughout the time with zero downtime.

So, hence, the system here is Consistent and Available. But we can’t say if its Partition Tolerant because it needed and used network connection between computers to maintain consistency and availability.

Now, to see what happens, if we try to test Partition Tolerance.

For that, let’s break the network/communication point between the computers.

So, now, the computers are network-ly separated meaning that they won’t be able to communicate no matter what.

For Consistency and Partition Tolerance but NOT Availability

When the x variable is updated in computer 1, if you try to read the value from computer 2, it will say,

Dude, you know what! I don’t know if computer 1 has updated the ‘x’ or not, so, I won’t give provide you the value of ‘x’ until we get connected and confirm consistent value of ‘x’

Here, computer 2 is NOT being available but it is making sure to stay consistent basically by making sure that no matter what happens, it would NOT provide inconsistent data (wrong/old value of x) at any risk.

Hence, here the system is Consistent (no return-ing of wrong value of x by any computer) and Partition Tolerant (no connection) but it is NOT available (not being able to provide service some of the times).

For Availability and Partition Tolerance but NOT Consistency

When the x variable is updated in computer 1, if you try to read the value from computer 2, it will say,

“Bro, you know what! I don’t know if computer 1 has updated the ‘x’ or not but what I can do is I can provide the latest value of ‘x’ which I have

Here, computer 2 is willing to sacrifice consistency by providing a wrong/older value of x just to make sure it is always AVAILABLE and providing atleast some value of ‘x’

Hence, here the system is Available (being able to provide the service all the time) and Partition Tolerant (no connection) but it is NOT consistent (not guaranteeing latest and right value of ‘x’).

Conclusion

So clearly, we saw that when we try to attain Partition Tolerance, we have to sacrifice either consistency or availability.

And if we want to attain both Consistency and Availability, then we have to give Partition Tolerance.

That’s it!

Thank you for reaching till here :)

Hope you understood this article and significance of CAP theorem in designing distributed systems.

Find me on

LinkedInhttps://www.linkedin.com/in/yashvardhan-kukreja-607b24142/

GitHubhttps://www.github.com/yashvardhan-kukreja

Email — yash.kukreja.98@gmail.com

Adios!

Site Reliability Engineer @ Red Hat | ex-Grofers | Contributing to {Cloud, Kubernetes}-native OSS