Reading List
CMP Cache Design
1. An adaptive, non-uniform cache structure for wire-delay dominated
on-chip caches. Kim, Burger, and Keckler. ASPLOS-2002.
2. A NUCA substrate for flexible CMP Cache sharing. Huh, Kim, Shafi,
Zhang, Burger, and Keckler. ICS-2005.
3. Managing wire delay in large chip multiprocessor caches. Beckmann
and Wood. MICRO-2004.
4. Nahalal: Cache organization for chip multiprocessors. Guz, Keidar,
Kolodny, and Weiser. CAL-2007.
5. Cooperative caching for chip multiprocessors. Chang and Sohi.
ISCA-2006.
6. An adaptive shared/private NUCA cache partitioning scheme for chip
multiprocessors. Dybdahl and Stenstrom. HPCA-2007.
7. A NUCA model for embedded systems cache design. Foglia, Mangano, and
Prete.
8. ASR: Adaptive selective replication for CMP caches. Beckmann, Marty,
and Wood. MICRO-2006.
9. Utility-based cache partitioning: A low-overhead, high-performance,
runtime mechanism to partition shared caches. MICRO-2006.
10. Gaining insights into multicore cache partitioning: Bridging the gap
between simulation and real systems. HPCA-2008.
11. Molecular caches: A caching structure for dynamic creation of
application-specific heterogeneous cache regions. MICRO-2006.
12. Adaptive set pinning: Managing shared caches in chip
multiprocessors. ASPLOS-2008.
13. Exploring the design space of future CMPs. PACT-2001.
14. Exploring the cache design space for large scale CMPs. dasCMP-2005.
15. Organizing the last line of defence before hitting the memory wall
for CMPs. HPCA-2004.
16. Communist, utilitarian, and capitalist cache policies on CMPs:
Caches as a shared resource. PACT-2006.
17. Victim replication: Maximizing capacity while hiding wire delay in
tiled chip multiprocessors. Zhang and Asanovic. ISCA-2005.
18. Optimizing replication, communication and capacity allocation in
CMPs. Chishti, Powell, and Vijaykumar. ISCA-2005.
19. Distance associativity for high-performance energy-efficient
non-uniform cache architectures. Chishti, Powell, and Vijaykumar.
MICRO-2003.
20. Dynamic partitioning of shared cache memory. Suh, Rudolph, and
Devadas. Jnl. of Supercomputing-2004.
21. Predicting inter-thread contention on a chip multi-processor
architecture. HPCA-2005.
22. Architectural support for operating system-driven CMP cache
management. PACT-2006.
23. Just say no: Benefits of early cache miss determination. HPCA-2003.
24. Datacenter-on-chip architectures: Tera-scale opportunities and
challenges. Iyer et al. Intel Tech. Journal-2007.
25. The V-Way cache: Demand-based associativity via global replacement.
Qureshi, Thompson, and Patt. ISCA-2005.
26. A case for MLP-aware cache replacement. Qureshi, Lynch, Mutlu, and
Patt. ISCA-2006.
Interconnects
1. Interconnections in multi-core architectures: Understanding
mechanisms, overheads and scaling. Kumar, Zyuban, and Tullsen.
ISCA-2005.
2. Interconnect design considerations for large NUCA caches.
Muralimanohar and Balasubramonian. ISCA-2007.
3. Interconnect-aware coherence protocols for chip multiprocessors.
Cheng, Muralimanohar, Ramani, Balasubramonian, and Carter. ISCA-2006.
4. Leveraging wire properties at the microarchitectural level.
Balasubramonian, Muralimanohar, Ramani, Cheng, and Carter. MICRO-2006.
5. Microarchitectural wire management for performance and power in
partitioned architectures. Balasubramonian, Muralimanohar, Ramani, and
Venkatachalapathy. HPCA-2005.
Misc.
1. The Landscape of Parallel Computing Research: A View from Berkeley.
Asanovic et al. 2006.
2. Recognition, Mining and Synthesis Moves Computers to the Era of
Tera. Dubey. Intel Tech. Magazine-2005.
3. Maximizing CMP throughput with mediocre cores. Davis, Laudon, and
Olukotun. PACT-2006.
4. Dataflow predication. Smith et al. MICRO-2006.
5. Molecular caches: A caching structure for dynamic creation of
application-specific heterogeneous cache regions. Varadarajan et al.
MICRO-2006.
6. Computation spreading: Employing hardware migration to specialize CMP
cores on-the-fly. Chakraborty, Wells, and Sohi. ASPLOS-2006.
Kshitij Sudan
Last modified: Feb. 8, 2008.