Cache Coherence

Here's a problem that we neatly sidestepped while talking about caches: what happens if a bus transaction modifies memory while the data is in the cache? And what happens if data is in the cache and we attempt a DMA transfer of that data from memory out to a device? The problem of handling these cases is called maintaining cache consistency or cache coherence.

This same problem exists when we have multiple processors sharing data, but it is harder to solve, since we don't have the same degree of control over other processors that we do over I/O devices. For DMA devices, we can go a long way toward resolving the problem simply by using cache invalidate instructions to make sure the situation doesn't arise: with I/O devices, we know which memory locations they're going to access (after all, we told them to do it). With other processors, the accesses are essentially unpredictable, so we really can't do the same thing.

The most intuitive model of memory consistency is called ``sequential consistency.'' In sequential consistency, a single global ordering is imposed on all memory accesses, and every processor gets the same results as if all memory accesses took place, in that order, against one large global memory with no caches present.
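To make this concrete, here is a small illustrative sketch (the variable names and the two-processor program are the classic litmus test, not something from the text): P1 runs x=1 then r1=y, while P2 runs y=1 then r2=x. Enumerating every interleaving that preserves each processor's program order shows exactly which outcomes sequential consistency allows; in particular, r1 = r2 = 0 is impossible under SC.

```python
from itertools import permutations

# P1: x = 1 ; r1 = y        P2: y = 1 ; r2 = x
# Enumerate all interleavings that preserve each processor's program
# order, run each against a single shared memory, and collect the
# (r1, r2) outcomes that sequential consistency permits.

ops = ["x=1", "r1=y", "y=1", "r2=x"]

def run(order):
    mem = {"x": 0, "y": 0}
    regs = {}
    for op in order:
        if op == "x=1":
            mem["x"] = 1
        elif op == "y=1":
            mem["y"] = 1
        elif op == "r1=y":
            regs["r1"] = mem["y"]
        elif op == "r2=x":
            regs["r2"] = mem["x"]
    return (regs["r1"], regs["r2"])

outcomes = set()
for order in permutations(ops):
    # Program order: x=1 before r1=y, and y=1 before r2=x.
    if order.index("x=1") < order.index("r1=y") and \
       order.index("y=1") < order.index("r2=x"):
        outcomes.add(run(order))
```

Running this yields the outcomes (0,1), (1,0), and (1,1), but never (0,0): under SC, at least one of the two reads must see the other processor's write.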

The simplest way to implement a sequentially consistent memory model is to use a ``snoopy cache.'' A snoopy cache works by analogy with your snoopy next-door neighbor, who is always watching to see what you're doing, and interfering with your life. In the case of the snoopy cache, all the caches watch the bus for transactions that affect blocks currently in the cache. The analogy breaks down here; the snoopy cache only does something if your actions actually affect it, while the snoopy neighbor is always interested in what you're up to.

The text gives one possible snoopy cache protocol (in chapter 9, on the CD). As the authors note, there are other possibilities, but we'll go ahead and discuss theirs. It's based on the same cache block states as the writeback caches we discussed earlier: invalid, clean, and dirty. For purposes of the protocol, we'll call these invalid, shared, and modified, respectively. A few extra bus signals are needed, though; these signals are generated whenever a cache block changes state, and they are what the other caches watch in order to decide what to do. The basic idea is that whenever you change the state of one of your cache blocks in a way that might be interesting to other caches in the system, you put extra signals on the bus to announce the fact.

Note: this is an example of a so-called MSI cache.

Here are two tables describing the MSI protocol in the book. One describes the state changes (and the bus transactions generated) that result from local processor operations; the other describes the state changes that result from bus transactions observed by snooping.

State Changes Based on CPU Actions
Old State   CPU Operation   New State   Bus Operation
---------   -------------   ---------   ----------------
Invalid     Read            Shared      Read
Invalid     Write           Modified    Read, Invalidate
Shared      Read            Shared      none
Shared      Write           Modified    Invalidate
Shared      Flush           Invalid     none
Modified    Read            Modified    none
Modified    Write           Modified    none
Modified    Flush           Invalid     Write
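The CPU-action table is just a lookup from (old state, CPU operation) to (new state, bus operation), so it can be sketched directly as one. This is an illustrative encoding, not code from the text; the state and operation names follow the table.

```python
# Sketch of the CPU-side MSI transitions from the table above.
# Key: (old state, CPU operation); value: (new state, bus operation).
# A bus operation of None means no bus traffic is generated.
CPU_TRANSITIONS = {
    ("Invalid",  "Read"):  ("Shared",   "Read"),
    ("Invalid",  "Write"): ("Modified", "Read, Invalidate"),
    ("Shared",   "Read"):  ("Shared",   None),
    ("Shared",   "Write"): ("Modified", "Invalidate"),
    ("Shared",   "Flush"): ("Invalid",  None),
    ("Modified", "Read"):  ("Modified", None),
    ("Modified", "Write"): ("Modified", None),
    ("Modified", "Flush"): ("Invalid",  "Write"),
}

def cpu_access(state, op):
    """Apply a local CPU operation to a block in `state`;
    return the new state and the bus operation to broadcast."""
    return CPU_TRANSITIONS[(state, op)]
```

For example, a write to an invalid block moves it to Modified and broadcasts a read followed by an invalidate, so every other cache holding the block drops it.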

State Changes Based on Bus Activity
Old State   Bus Transaction   New State   Resulting Bus Transaction
---------   ---------------   ---------   -------------------------
Shared      Read              Shared      none
Shared      Invalidate        Invalid     none
Modified    Read              Invalid     Write
Modified    Invalidate        Invalid     Write
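The snooping side can be sketched the same way (again, an illustrative encoding, not the book's code). Combinations absent from the table, such as a bus transaction for a block we hold in the Invalid state, require no action, which the sketch models with a default of "no change."

```python
# Sketch of the snoop-side MSI transitions from the table above.
# Key: (old state, observed bus transaction);
# value: (new state, resulting bus transaction).
SNOOP_TRANSITIONS = {
    ("Shared",   "Read"):       ("Shared",  None),
    ("Shared",   "Invalidate"): ("Invalid", None),
    ("Modified", "Read"):       ("Invalid", "Write"),
    ("Modified", "Invalidate"): ("Invalid", "Write"),
}

def snoop(state, bus_op):
    """React to a bus transaction observed for a block we hold.
    Combinations not in the table leave the state unchanged."""
    return SNOOP_TRANSITIONS.get((state, bus_op), (state, None))
```

Note the key cases: when another cache reads a block we hold Modified, we must write it back (the resulting Write transaction) so the reader sees the current data, and our copy becomes Invalid.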

Other Schemes

MESI and MOESI both try to reduce bus traffic by introducing extra states into the cache....
Last modified: Mon Nov 22 11:13:32 MST 2004