|Title of host publication||The International Symposium on High-Performance Computer Architecture|
|Subtitle of host publication||Orlando, Florida|
|Number of pages||12|
|State||Published - Feb 2014|
Traditional directory coherence protocols are designed for the strictest consistency model, sequential consistency (SC). When they are used for chip multiprocessors (CMPs) that support relaxed memory consistency models, such protocols turn out to be unnecessarily strict. Usually this comes at the cost of scalability (in terms of per core storage), which poses a problem with increasing number of cores in today’s CMPs, most of which no longer are sequentially consistent. Because of the wide adoption of Total Store Order (TSO) and its variants in x86 and SPARC processors, and existing parallel programs written for these architectures, we propose TSO-CC, a cache coherence protocol for the TSO memory consistency model. TSO-CC does not track sharers, and instead relies on self-invalidation and detection of potential acquires using timestamps to satisfy the TSO memory consistency model lazily. Our results show that TSO-CC achieves average performance comparable to a MESI directory
protocol, while TSO-CC’s storage overhead per cache line scales logarithmically with increasing core count.