Consistency, Availability, Partition Tolerance: You can have any two of them in a database system. This is by far the second most stupid thing I have encountered in a very long time. Seconded only by flat earth theory. The problem is, it seems intuitive yet it could not be farther from the truth.
If the title didn't make it clear already, and this not to clickbait or anything, I absolutely hate the way CAP Theorem is interpreted by most developers in my experience. And as a consequence, with no disrespect to Eric Brewer, I believe CAP Theorem in general is stupid.
This interpretation is mostly done by inexperienced developers, and there definitely is a sampling bias on the internet, most people who have had enough experience to work with databases, aren't writing daily tech blogs or posting graphic strips on instagram. And I completely hold these "unproductive" influencers (subtle Rick and Morty reference) accoundable for sperading the misinformation.
What is CAP Theorem (as interpreted)?
Consistency: A certain read must contain the most recently written data.
Availability: The system must respond with data (if available)
Partition Tolerance: The system can function with partitions, i.e. the data is stored across multiple nodes.
CAP Theorem states that you can have any two of the above traits in a database system.
What's wrong with CAP Theorem?
I don't exactly have a problem with CAP Theorem, per se, but rather the way it is interpreted and propagated. There are companies who conduct interviews, and expect candidates to answer this way, and then expect the hired developers (or worse, DBAs) to understand databases. Ironic.
The frustration of understanding CAP Theorem
The biggest problem of CAP Theorem is that it fails to communicate the context. Under which circumstances does it hold up? Is it a general rule? I used to believe that to be the case. CAP Theorem holds up only in the worst case scenario. And if that is a revelation to you, welcome to the club.
And then comes the interesting part, you can have any two out of the three. Like some matrix level paradox of choice or something.
Going by the common interpretation, we come to the following conclusions about DB systems:
- CP -> Consistent and Partition tolerant, but unavailable. What use is a DB that is unavailable?
- CA -> Consistent and Available, but not partition tolerant. So, does it mean you must run it on a single node?
- AP -> Available and Partition Tolerant, but inconsistent. That kind of makes sense, I write some data, and until that is written and propagated, I might get "inconsistent" reads.
And you see the problem. Two of three combinations does not hold up.
Let's reinterpret using our newfound knowledge of the worst case scenario
- CP -> If there is a partition failure, the system will not return any data, citing possible inconsistency.
- CA -> This still does not makes sense.
- AP -> In case of partition failure, it will return stale data (aka soft state) instead of not returning any data.
Interpretation of CAP Theorem that makes sense
- Satisfying C, A and P is possible. Which means, a distributed system can be in a consistent and available state even in presence of partitions.
- P is never an option, either you have Partitions or you don't, in case you have a single node running, you don't need any theorem, if your DB system does not crash, it is Consistent AND available.
- Only if you have a network Partition, and something fails (in the network), only in that case you have to choose between Consistency and Availability.
- If you are fetching someone's bank balance, you'd rather live with an error than reading a false data, choose consistency
- If you are fetching the number of likes, you'd rather live with stale data than not showing any data, choose availability
While the above interpretation makes sense, it does not make CAP Theorem any better. For the following reasons
- Lack of clarity: CAP Theorem, as stated, essentially creates a false image. And everyone who knows the correct interpretation, has been corrected. This is less of a science and more of a religious voodoo.
- Aha! I get it: It is easy to understand CAP theorem, and have a wrong understanding. Which gives a false impression of confidence. Contributing to CAP theorem's growing misunderstanding
- Incomplete: What if there is no network failure, and still a system returns stale data? For the sake of latency. Such as Cassandra!
- Backwards: PAC would be a better albeit less rhetorical and charming name. PAC would communicate clearly that, Given P, choose A or C.
PACELC for the win
PACELC is simply a better framework to design and evaluate distributed systems.
PACELC is simply an extension of CAP Theorem. It states that, Given a Partition failure, one can priotize Consistency or Availability, Else, a system has to prioritize Consistency or Latency.
I get it, PACELC doesn't have the same ring. But it is a much clear, complete and useful tool to design or evaluate a database system.
In PACELC there are two parts
- PAC: Given a Partition failure, you can prioritize Availability or Consistency
- ELC: Else, normally, you can prioritize Latency or Consistency.
As you can see, while CAP only applies in the worst case, PACELC is a much more general representation and gives a clearer picture.
How to understand PACELC names.
PACELC names, usually have two components, delimited by a slash. for example, Cassandra is PA/EL while MySQL is PC/EC.
The first part, PA or PC defines how the system prioritises partition failures.
- PA: This systems prioritizes availability by maintaining soft state.
- PC: This systems prioritizes consistency by not returning any data
The second part, EC or EL defines how the system prioritizes if there's no partition failure present
- EL: These systems prioritizes Latency, fast response times are their bread and butter
- EC: These systems prioritizes Consistency, they can respond after sometime.
Examples of each type of DB Systems
- PA/EL: Cassandra, DynamoDB, CosmosDB
- PA/EC: MongoDB
- PC/EC: MySQL
- PC/EL: PNUTS
As you can see, PACELC is a much better, practical and actually useful way to categorize database systems in a meaningful way. And CAP theorem, on the other hand, I have never seen anyone actually cite it outside of lectures, exams or interviews.
CAP Theorem sucks because it is thoroughly misinterpreted, kinda meaningless and obsolete.
PACELC is an awesome framework to keep in mind when designing or evaluating a database for a certain use-case. use PACELC, sleep better.
To Conclude the rant
I think I have justified my rant at CAP Theorem. It is kind of useless outside any examination or interview. I have not seen anyone ever name it. And if companies keep asking this question, engineers will keep mugging this up, propagating misinformation along the way.
CAP Theorem needs to be forgotten, purged and kept as a relic of computer science history. Just like PERL.
A Critique of CAP Theorem from the University of Cambridge
And here is a blog from IBM that, on the surface defines the CAP Theorem as such, without highlighting its weaknesses, to promote its cloud. As they have a lot of internal links, they should have totally educated potential developers about PACELC Before selling their cloud to them