Understanding CAP theorem: what you sacrifice and why it matters in system design


In a system design interview there is a step when you should decide: should your system prioritize consistency or availability?

There is a bit of theory to know before answering to this question.

The CAP theorem states that in a distributed data store, it is impossible to simultaneously guarantee all three of the following properties: Consistency, Availability, and Partition Tolerance.

Classical visualisation of the CAP theorem

Understanding the Components of the CAP Theorem

Consistency: Every read receives the most recent write or an error. This means that all nodes in the system return the same data at the same time, ensuring that any changes made to the data are immediately visible to all clients.

Example of a highly consistent application is a finance application. Customers should not be able to overspend due to delayed updates of their balance.

Availability: Every request received by a non-failing node must result in a response, regardless of whether it contains the most recent data. This ensures that the system remains operational and responsive, even if some nodes are down.

Example of a highly available application is social media: you get some updates, even if not all of them. It is enough to support your engagement with the app.

Partition Tolerance: The system continues to operate despite arbitrary network partitions. This means that the system can still function even if there are communication breakdowns between nodes.

The Trade-offs

According to the CAP theorem, a distributed system can only provide two of the three guarantees at any given time. This leads to several possible configurations:

  • CP (Consistency and Partition Tolerance): The system prioritizes consistency and will sacrifice availability during network partitions. It may return errors or timeouts if it cannot guarantee the most recent data.
  • AP (Availability and Partition Tolerance): The system prioritizes availability and will return responses even if they may not be the most recent data. This can lead to temporary inconsistencies.
  • CA (Consistency and Availability): This configuration is theoretically possible only when there are no network partitions, which is often not practical in real-world distributed systems.

Therefore, it is quite safe to assume there is always Partition Tolerance in the distributed systems. Also Hello Interview platform supports this conclusion by stating that “partition tolerance is a given in distributed systems”.

What to choose?

The CAP theorem forces you to prioritize either Consistency or Availability. Which characteristic of the system is most important is based on the nature of your application. Questions like “what is important for the user: speed or truthfulness of the data?” or “which situations the system should avoid?” should help you decide.

There exist systems that may require both Availability and Consistency, but for the different parts of the system. For example, in the ticket booking system you would prioritize Availability when searching for the events, and Consistency when make a money transaction to buy a ticket.


📚 References & further reading

Video from ByteByteGo

Hello Interview CAP theorem