A Taxonomy of Multiprocessor Memory-Ordering Models The memory subsystem of a multiprocessor may implement one of a variety of sets of semantics regarding the behavior of memory when its data may be accessed concurrently. Such semantics determine the system's memory-ordering model, so called because it specifies the orders in which processors may perceive the memory-access operations. One of the simplest memory-ordering models is sequential consistency, defined by Lamport in 1979. It is easy to describe and corresponds closely to many programmers' intuitive understanding of how shared memory operates. In addition, sequential consistency is a strong model. This means that, for any particular multithreaded program, the number of allowed behaviors is relatively small. In general, it is easier to reason about programs when run with stronger models, which makes them more desirable. Unfortunately, it is not possible to implement sequential consistency in a way that performs well; this fact becomes only more true as multiprocessor size increases and as the speed of processors (relative to that of memory) increases. Developers of multiprocessors use a variety of techniques to improve their performance, and many of these techniques result in memory-ordering models that are not sequentially consistent. Such models include those provided by the SPARC architecture (TSO, PSO, and RMO); that provided by Intel's IA32 (Pentium-based) multiprocessors; and that provided by the Itanium architecture jointly developed by Intel and Hewlett-Packard. These models are all weaker than sequential consistency. There has been concern of late that models weaker than sequential consistency are too difficult to program. In part, this is because these models allow more behaviors and are thus harder to reason about. Another reason is that, while some of these models differ only in very subtle ways, many of them have not been specified with precision sufficient for programmers to utilize them properly. Instead, they are often specified only by describing their most recent implementations. As a result, programmers develop either software that turns out to be incorrect for reasons difficult to understand or software that is written conservatively and that thus does not realize the performance advantages that the weaker memory-ordering model was developed to support. This presentation considers a progression of memory-ordering models, from ones with very simple definitions to ones that are quite complex. While the time limitations do not allow a precise formal specification of each model, the presentation is based on such specifications, all of which were developed within a common framework. In addition, the presentation will identify, where appropriate, some common (and well-known) techniques used in implementing multiprocessor memory systems. While the specification of memory-ordering models is (or should be) implementation-independent, implementation issues do inform model design, and an understanding of these issues can aid in the understanding why models are specified as they are and how certain models differ. Time permitting, the presentation will include a number of different multithreaded executions that illustrate the differences between the different models.