The Herlihy-Wing paper defining linearizability (https://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf) is a good read and explains the difference with good examples, although it might be too mathy/formal for most people's taste. An interesting result in that paper is that implementations of what Abadi calls isolation notions (which Herlihy and Wing call multi-object correctness criteria) are not modular, at least for (strict) serializability. For example, if two systems implement serializability using two-phase locking and multi-version concurrency control respectively, they can't be modularly composed to get a serializable global system for transactions that span objects from both systems.
> For many years, database users did not have to simultaneously understand the concept of isolation levels and consistency levels
This statement might technically be true, but it hasn't been the case for many years, including much of the time that "CAP Theorem" has been a part of popular discourse in industry. If you take a traditional RDBMS with ACID transactions and ANSI-defined isolation levels, such as Postgres without extensions (single master), you can still have inconsistent reads if your architecture includes async streaming replicas used for distributing reads. Streaming replication was added to Postgres in version 9.0, 10 years ago. Microsoft SQL Server had it at least a decade before that.
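To make the replica point concrete, here is a toy sketch in Python. It is not how Postgres implements streaming replication; the `Primary` and `AsyncReplica` classes and the `catch_up` method are made up for illustration. The only thing it shows is that a committed write on the primary is invisible to a lagging replica until the replication log is applied, whatever isolation level the primary uses.

```python
class Primary:
    """Hypothetical single-master store that records committed writes in a log."""

    def __init__(self):
        self.data = {}
        self.log = []  # committed writes, in commit order

    def commit(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class AsyncReplica:
    """Hypothetical read replica that applies the primary's log only when asked."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def read(self, key):
        return self.data.get(key)

    def catch_up(self):
        # Apply every log entry the replica has not seen yet.
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)


primary = Primary()
replica = AsyncReplica(primary)

primary.commit("balance", 100)     # committed on the primary
stale = replica.read("balance")    # replica has not applied the log yet
replica.catch_up()                 # replication lag ends
fresh = replica.read("balance")

print(stale, fresh)
```

A client that writes to the primary and then reads from the replica sees the old state in between, which is a consistency problem, not an isolation one.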
While interviewing candidates at my last gig, I was taken aback by how many of them confused consistency with isolation in databases. Most who came from a Mongo background did not understand isolation at all and confused it with consistency.
Concepts such as optimistic locking and pessimistic locking are not well known any more either, much to my dismay :(
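For anyone who hasn't run into the two terms, a rough in-memory sketch in Python follows. All the names (`PessimisticCounter`, `VersionedRow`, `optimistic_increment`, `run_concurrently`) are invented for this example, and the small lock inside `VersionedRow` is only an assumption standing in for the atomicity a database gives a single conditional update. Pessimistic locking takes the lock before touching the data; optimistic locking reads a version, does its work, and retries if the version changed in the meantime.

```python
import threading


class PessimisticCounter:
    """Pessimistic: hold a lock for the whole read-modify-write."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1


class VersionedRow:
    """Row with a version number, as used for optimistic concurrency checks."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        # Assumption: stands in for the atomicity of one conditional update.
        self._guard = threading.Lock()

    def read(self):
        with self._guard:
            return self.value, self.version

    def compare_and_write(self, new_value, expected_version):
        with self._guard:
            if self.version != expected_version:
                return False  # someone else wrote first; caller must retry
            self.value = new_value
            self.version += 1
            return True


def optimistic_increment(row):
    """Optimistic: no lock while working; validate the version at write time."""
    while True:
        value, version = row.read()
        if row.compare_and_write(value + 1, version):
            return


def run_concurrently(fn, threads=4, times=100):
    def worker():
        for _ in range(times):
            fn()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


counter = PessimisticCounter()
run_concurrently(counter.increment)

row = VersionedRow(0)
run_concurrently(lambda: optimistic_increment(row))

print(counter.value, row.read())
```

Both approaches end up with no lost updates; the difference is whether conflicts are prevented up front or detected and retried afterwards.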
> "For many years, database users did not have to simultaneously understand the concept of isolation levels and consistency levels. Either a database system provided a correctness/performance tradeoff using isolation levels, or it provided a correctness/performance tradeoff using consistency levels, but never both. This resulted in a blurring of these concepts to the point that many people --- even experts in the field --- confuse isolation levels with consistency levels and vice versa. For example, this talk by a PhD student at Berkeley (incorrectly) refers to causal consistency as an “isolation level”. And this paper from well-known researchers at MIT and Harvard --- including a Turing Award laureate --- (incorrectly) call snapshot isolation and serializability “consistency” levels. I am confident that all these well-known researchers know the difference between isolation levels and consistency levels. However, as long as isolation and consistency could not be tuned within the same system, there has been little necessity for precision in the parlance around these terms."
So I wouldn't say it's uncommon or unexpected of non-experts to confuse the two concepts...
Haha, but seriously. I didn't mean why you were taken aback given this article; I meant why you were taken aback given most developers do not understand this distinction, seldom have the real need to, and even experts are fuzzy on the details.
Maybe I misread your tone, but to me it was like "programmers don't know how to program" (like in the FizzBuzz anecdotes).
Were these for candidates who had previously claimed to know databases (or were applying for positions that assumed such knowledge)? Or did you expect this of all candidates?
Knowledge of SQL databases at which level? DBA level? Programmer level?
The distinction is important because when recruiters ask "do you know SQL databases?", 99% of the time they mean "have you used one, and can you write a query with joins?" or "do you know how to create indexes?". They almost never mean "do you understand how a database is implemented?" or "do you know the difference between consistency and isolation levels?".
The article throws some light on consistency and isolation levels in modern distributed relational databases (aka NewSQL)