CAP Theorem and blockchain

In theoretical computer science, the CAP theorem states that it is impossible for a distributed data store (such as a blockchain network) to simultaneously provide more than two out of the three guarantees: Consistency, Availability & Partition tolerance.

CAP Theorem – the parts

  • C: Consistency – At any given time, all nodes in the network have exactly the same (most recent) value.
  • A: Availability – Every request to the network receives a response, though without any guarantee that returned data is the most recent.
  • P: Partition tolerance – The network continues to operate, even if an arbitrary number of nodes are failing.

Due to the nature of distributed data stores (such as blockchain), Partition tolerance is a given fact; there will always be failing/unreachable nodes in the network (not least because of the unstable nature of the internet). CAP Theorem states that one has to choose between C (Consistency) or A (Availability) when in the presence of P (Partition):

Availability over Consistency (A + P)

Every request to the network receives a response, even if the network cannot guarantee it is up to date due to network partitioning (failing nodes).

Choosing Availability over Consistency for a world-wide distributed system will make it highly available, but its data will be out of date for 99.99% of the time. Furthermore, no-one will be able to guarantee that the data returned is in fact the most recent. The best example is Facebook, as it's much more important to get into the network, than to see the latest updates of a particular person (depends who you ask, of course!).

Consistency over Availability (C + P)

The system will return an error or a time-out if particular information cannot be guaranteed to be up to date due to network partitioning (failing nodes). Choosing Consistency over Availability for a world-wide distributed system will make it highly accurate, but it will most likely be unavailable for 99.99% of the time.

Blockchain

As blockchain networks (such as Bitcoin) are distributed systems, they have to deal with Partition tolerance. Out of the two options that remain, they choose Availability over Consistency. If, for example, Bitcoin would instead choose consistency, it would mean that in the event of a connectivity problem, or any failing node/component (which happens all the time), you wouldn’t be able to use/send/receive Bitcoin. Blockchain networks in general use other means of becoming eventually consistent over time: through mining and blocks.

So, which one does Bitcoin sacrifice? Availability, consistency, or partition tolerance?

  • If Bitcoin sacrifices partition tolerance, this means that Bitcoin can only be run on networks that reliably deliver messages.
  • If Bitcoin sacrifices availability, this means that the Bitcoin network pauses when transactions can’t be committed. So, if there’s a connectivity problem, you can’t read or write Bitcoin data at all.
  • If Bitcoin sacrifices consistency, this means that new Bitcoin transactions aren’t guaranteed to be committed. So, you send a transaction and it may or may not be received by other parties.

So, what’s the verdict?

Well, Bitcoin is “eventually consistent,” which is a polite way to say that it is not consistent. That is, if you send a Bitcoin transaction, there’s no guarantee that it will be received. Sounds bad? Maybe.

First, remember that the CAP theorem applies to all financial systems just as it applies to Bitcoin. A credit card is a good example of an “eventually consistent” system: typically, you pay a bill only once a month. This makes it easy to violate the constraints of the system by spending more money than you have. Of course, this risk is (theoretically) mitigated because you know that the credit card companies will ruin your credit score if you do it.

Eric Brewer, the originator of the CAP theorem, pointed out in 2012 that the CAP theorem only prohibits a tiny fraction of the design space of distributed systems; it’s still possible to create many useful and robust distributed systems. Although Bitcoin doesn’t provide firm guarantees that your transactions have been committed, in practice it does a pretty good job. The longer you wait, the more confident you can be that a transaction has gone through successfully:

In a matter of seconds, the transaction will be relayed around the world. At this point, it will be difficult but possible to reverse.

After an average of five minutes, the transaction will be inserted into a block. About 999 in 1000 blocks make it into the blockchain, so you can feel pretty good at this point. 1 in 1000 blocks is an orphan block; if the transaction ends up in an orphan block, you just ignore it and wait until the transaction goes into a non-orphan block.

The longer you wait, the more blocks will pile up on top of the block containing your transaction, and the more certain you can be that your transaction won’t be rolled back. In practice, the longest ever fork in the blockchain was 24 blocks in length, and even this was a one-time event caused by a software bug. To date, no transaction more than five hours old has ever been rolled back.

To read about ways to intentionally exploit Bitcoin’s consistency guarantees, wait for the double spending article or find it on the Bitcoin Wiki.

Comments

Popular posts from this blog

Length extension attack

Contract upgrade anti-patterns