Spring 2004: CS 7960-4 Special Topics: High-Performance Architectures
General Information:
- Venue: MEB 1208
- Time: Tuesday, Thursday 2:00-3:20pm
- Instructor: Rajeev Balasubramonian, email: rajeev, MEB 3190J, office hours by appointment
- Prerequisite: CS6810 or equivalent
- Class mailing list: cs7960-004@cs.utah.edu. Visit the mailman system to sign up or modify your subscription.
Overview:
CS7960 is a graduate-level course open to graduate and advanced
undergraduate students interested in exploring important issues in
the design of modern microprocessors. This course picks up where
CS6810 leaves off, with an in-depth treatment of hot research topics
in the field. Each week, we will cover 2-3 seminal papers that advanced
the state of the art in the past decade. Students are expected to
read each paper before the instructor presents it in class.
Students taking the course for credit must complete a project.
Grading
The following is a tentative guideline and may undergo changes.
50% of the course grade will depend on the project. This will involve
the use of architectural simulators to evaluate and analyze novel
research ideas. Students doing a good job on this will be encouraged
to submit their papers to one of the ISCA workshops (May deadlines), and
accepted papers will earn an all-expenses-paid trip to Munich, Germany. :-)
A final exam will account for 20% of your grade. 5% will be based on class
participation. The remaining 25% will be based on paper critiques you
submit at the start of each lecture (a standard template questionnaire
will be provided). The questionnaire will test your ability to think
critically about the problem at hand.
List of Topics
The following is a tentative list of topics. This may undergo
changes depending on student interests.
- Overview of computer microarchitecture
- Limits of ILP
- Technology trends
- Clustered microarchitectures
- Branch prediction and instruction fetch
- Memory hierarchy
- Register file design
- Power issues
- Simultaneous Multithreading
- Chip Multiprocessors
- Processor case studies
- Reliability, VLIW, DRAM, value prediction, etc.
Class Schedule
Introduction
- Tu 13th Jan:
Logistics. Computer microarchitecture overview from
"The Microarchitecture of Superscalar Processors" ,
J.E. Smith and G.S. Sohi, Proceedings of the IEEE, December 1995.
No prior reading required.
Slides.
ILP and Technology Trends
- Th 15th Jan:
"Limits of Instruction-Level Parallelism" ,
David W. Wall, WRL Research Report 93/6, November 1993.
Slides.
Questionnaire.
- Tu 20th Jan:
"Complexity-Effective Superscalar Processors" ,
S. Palacharla, N.P. Jouppi, J.E. Smith, Proceedings of ISCA-24, June 1997.
Slides.
Questionnaire.
- Th 22nd Jan:
"Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures" ,
V. Agarwal, M.S. Hrishikesh, S.W. Keckler, D. Burger, Proceedings of ISCA-27, June 2000.
Slides.
Questionnaire.
Clustered Architectures
- Tu 27th Jan:
"Dynamic Code Partitioning for Clustered Architectures" ,
R. Canal, J-M. Parcerisa, A. Gonzalez, International Journal of Parallel Programming, vol. 29(1), February 2001.
Slides.
Questionnaire.
Supplementary reading:
"Dynamically Managing the Communication-Parallelism Trade-Off in Future Clustered Processors" ,
R. Balasubramonian, S. Dwarkadas, D.H. Albonesi, Proceedings of ISCA-30, June 2003.
Pipelining
- Th 29th Jan:
"The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays" ,
M.S. Hrishikesh, N.P. Jouppi, K.I. Farkas, D. Burger, S.W. Keckler, P. Shivakumar, Proceedings of ISCA-29, May 2002.
Slides.
Questionnaire.
Branch Prediction and Instruction Fetch
- Tu 3rd Feb:
"Combining Branch Predictors" ,
Scott McFarling, WRL Technical Note TN-36, June 1993.
Slides.
Questionnaire.
- Th 5th Feb:
"The Impact of Delay on the Design of Branch Predictors" ,
D.A. Jimenez, S.W. Keckler, C. Lin, Proceedings of MICRO-33, December 2000.
Slides.
Questionnaire.
"Design Tradeoffs for the Alpha EV8 Conditional Branch Predictor" ,
A. Seznec, S. Felix, V. Krishnan, Y. Sazeides, Proceedings of ISCA-29, May 2002.
- Tu 10th Feb:
"Trace Cache: A Low Latency Approach to High-Bandwidth Instruction Fetching" ,
E. Rotenberg, S. Bennett, J.E. Smith, Proceedings of MICRO-29, December 1996.
Slides.
Questionnaire.
Memory Hierarchy
- Th 12th Feb:
"Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers" ,
N.P. Jouppi, Proceedings of ISCA-17, May 1990.
Retrospective on the paper, written in 1998.
Slides.
Questionnaire.
- Tu 17th Feb:
"Memory Dependence Prediction Using Store Sets" ,
G. Chrysos and J. Emer, Proceedings of ISCA-25, June 1998.
Slides.
Questionnaire.
- Th 19th Feb:
"Effective Hardware-Based Prefetching for High-Performance Microprocessors" ,
T.F. Chen and J.L. Baer, IEEE Transactions on Computers, May 1995.
Slides.
Questionnaire.
- Tu 24th Feb:
"Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors" ,
O. Mutlu, J. Stark, C. Wilkerson, Y.N. Patt, Proceedings of HPCA-9, February 2003.
Slides.
Questionnaire.
Register Files
- Th 26th Feb:
"Delaying Physical Register Allocation Through Virtual-Physical Registers" ,
T. Monreal, A. Gonzalez, M. Valero, J. Gonzalez, V. Vinals, Proceedings of MICRO-32, November 1999.
Slides.
Questionnaire.
- Tu 2nd Mar:
Student project discussion
Power
- Th 4th Mar:
"Pipeline Gating: Speculation Control for Energy Reduction" ,
S. Manne, A. Klauser, D. Grunwald, Proceedings of ISCA-25, June 1998.
Slides.
Questionnaire.
- Tu 9th Mar:
"Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power" ,
S. Kaxiras, Z. Hu, M. Martonosi, Proceedings of ISCA-28, July 2001.
Slides.
Questionnaire.
- Th 11th Mar:
"Reducing Power with Dynamic Critical Path Information" ,
J.S. Seng, E.S. Tune, D.M. Tullsen, Proceedings of MICRO-34, December 2001.
Slides.
Questionnaire.
- Tu 16th Mar:
Spring break
- Th 18th Mar:
Spring break
SMT-CMP
- Tu 23rd Mar:
"Simultaneous Multithreading: Maximizing On-Chip Parallelism" ,
D.M. Tullsen, S.J. Eggers, H.M. Levy, Proceedings of ISCA-22, June 1995.
Slides.
Questionnaire.
- Th 25th Mar:
"Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor" ,
D.M. Tullsen, S.J. Eggers, J.S. Emer, H.M. Levy, J.L. Lo, R.L. Stamm, Proceedings of ISCA-23, May 1996.
Slides.
Questionnaire.
- Tu 30th Mar:
"Initial Observations of the Simultaneous Multithreading Pentium 4 Processor" ,
N. Tuck and D.M. Tullsen, Proceedings of PACT-12, September 2003.
Slides.
Questionnaire.
- Th 1st Apr:
"The Case for a Single-Chip Multiprocessor" ,
K. Olukotun, B.A. Nayfeh, L. Hammond, K. Wilson, K-Y. Chang, Proceedings of ASPLOS-VII, October 1996.
Slides.
Questionnaire.
- Tu 6th Apr:
"The Potential for Using Thread-Level Data Speculation to Facilitate Automatic Parallelization" ,
J.G. Steffan and T.C. Mowry, Proceedings of HPCA-4, February 1998.
Slides.
Questionnaire.
Processor Case Studies
- Th 8th Apr:
"The Microarchitecture of the Pentium 4 Processor" ,
G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, P. Roussel, Intel Technology Journal, Q1, 2001.
Slides.
Questionnaire.
Miscellaneous Topics
- Tu 13th Apr:
"Exceeding the Dataflow Limit via Value Prediction" ,
M.H. Lipasti and J.P. Shen, Proceedings of MICRO-29, December 1996.
Slides.
Questionnaire.
- Th 15th Apr:
"Wire Delay is not a Problem for SMT (in the near future)" ,
Z. Chishti and T.N. Vijaykumar, Proceedings of ISCA-31, June 2004.
Slides.
- Finals: Take-home exam, to be handed out on Apr 10th.