Parallel Programming (CS 4961)

Fall 2010

Schedule: Tues./Thur., 9:10-10:30 AM
Location: WEB L112
Instructor: Mary Hall
Office Hours: MEB 3466; Tuesdays, 10:45-11:15 AM; Wednesdays, 11:00-11:30 AM or by appointment
Mailing list:
TA: Sriram Aananthakrishnan
Office Hours: MEB 3115; Mondays and Wednesdays, 3:30-4:30 PM; lab hours on Tuesdays when announced


 Background and Description

This course is a comprehensive exploration of parallel programming paradigms, examining core concepts, focusing on a subset of widely used contemporary parallel programming models, and providing context with a small set of parallel algorithms. In the last few years, this area has been the subject of significant interest due to a number of factors. Most significantly, the advent of multi-core microprocessors has made parallel computing available to the masses. At the high end, major vendors of large-scale parallel systems, including IBM, Cray and Sun, have recently introduced new parallel programming languages designed for applications that exploit tens of thousands of processors. Embedded devices can also be thought of as small multiprocessors. The convergence of these distinct markets offers an opportunity to finally provide application programmers with a productive way to express parallel computation.

The course will be structured as lectures, homeworks, programming assignments and a final project. Students will complete four programming projects, expressing algorithms in selected parallel programming models and measuring their performance. For the final project, teams of 2-3 students will implement codes that combine multiple programming models.

Prerequisites: CS 4400 (may be taken concurrently)


 Grading

35% Programming projects (P1, P2, P3, P4)
25% Written homeworks
25% Quiz and Final
15% Final project

Note: Late homeworks are not accepted. Late projects will incur a 20% penalty per day.


 Textbook

Principles of Parallel Programming by Calvin Lin and Lawrence Snyder (ISBN-10: 0-321-48790-7).

 Schedule (tentative)

The following schedule is subject to change with a week's notice, particularly as opportunities for guest lectures and conflicts arise. The readings listed should be completed ahead of class so that you can follow the lecture and respond to questions. Homeworks and programming assignments are due before class so that we can discuss them.

Date | Topics | Read | Assign | Notes
24 Aug | Introduction (ppt) (pdf): Importance of parallel programming | Chapter 1, pgs. 2-8, 25-26 | - | -
26 Aug | Introduction to parallel algorithms and correctness (ppt) (pdf): Concerns for parallelism correctness | Chapter 1, pgs. 8-24 | HW01 | Deps
31 Aug | Parallel Computing Platforms and Models of Execution (ppt) (pdf): A diversity of parallel architectures, taxonomy, and 6 examples | Chapter 2, pgs. 30-59 | - | -

Parallel Programming Model Concepts
02 Sep | CTA, Data and Task Parallelism (ppt) (pdf): Review Homework #1, Definition of Data and Task Parallelism | Part 2 and Ch. 4, pgs. 87-96 | HW02 | -
07 Sep | Data Parallelism in OpenMP (ppt) (pdf): Introduction to OpenMP and Parallel Loops | Chapter 6, pgs. 145, 193-199 | - | -
09 Sep | OpenMP, cont. and Introduction to Data-Parallel Algorithms (ppt) (pdf): Data decomposition, scheduling and load balance, review HW02 | Chapter 5, pgs. 112-132 | - | -
14 Sep | Introduction to SIMD (ppt) (pdf): Brief description of Sun T2 for P01 | - | P01 | Sun UltraSPARC T2
16 Sep | SIMD, cont. and Data Parallel Algorithms (ppt) (pdf): Example: Intel SSE-3 Multimedia Extension, related SIMT for GPUs | - | - | -
21 Sep | Red/blue and introduction to data locality (ppt) (pdf): Discuss programming assignment, work through red/blue, define reuse/locality | - | - | -
23 Sep | Writing parallel code and data locality (ppt) (pdf): Complete programming assignment | - | - | -
28 Sep | Data Locality, cont. (ppt) (pdf): Code restructuring techniques: permutation and tiling | - | P02 | -
30 Sep | Data Locality, cont. (ppt) (pdf): Code restructuring techniques: unroll-and-jam and scalar replacement | - | - | -
05 Oct | Task Parallelism (ppt) (pdf): OpenMP 3.0, Task-Parallel Algorithms | Chapter 5, pgs. 132-142 | - | -

Reasoning about Performance
07 Oct | Performance Concepts (ppt): Speedup, parallelism overhead, locality | Chapter 3, pgs. 61-85 | - | -

Parallel Programming for GPUs
19 Oct | Introduction to GPUs and CUDA (ppt): Architecture and execution model | - | - | -
21 Oct | Programming in CUDA (pdf): SIMT execution and divergent branches (Malik) | - | - | -
26 Oct | Midterm Quiz I | - | - | -
28 Oct | CUDA, cont. (ppt) (pdf): Data placement, memory latency and bandwidth optimizations | - | - | -

Message Passing and Distributed Memory
02 Nov | Introduction to Message Passing (ppt): What is MPI? Complexities of a distributed address space | Chapter 7, pgs. 202-228 | HW04 (Project proposal) | -
04 Nov | MPI Communication Operations (ppt) (pdf): Blocking, Buffered, Non-blocking, One-sided | Chapter 2, pgs. 54-55 | - | -

Parallel Algorithms
09 Nov | Key Concepts in Optimizing Sparse Algorithms (ppt) (pdf): Sparse matrix and sparse graph representations | - | - | -
11 Nov | Parallel Graph Algorithms (ppt) (pdf): Shortest Path, Minimum Cost Spanning Tree | - | - | -
18 Nov | A Brief Survey of Functional Parallelism: Guest Lecture, Matt Might | - | - | -
23 Nov | Map Reduce (ppt) (pdf): Examine contemporary functional programming model | - | - | -
30 Nov | Future Directions for Parallel Computing (ppt): Chapel and Transactional Memory | Chapters 9, 10 | - | -
02 Dec | Course Retrospective and Review (ppt): Parallel Computing Research, intersection of HPC and commodity, future of the field | Chapter 11 | - | HW05 (project update)

Project Presentations
07 Dec | Project Presentations, Dry Run | - | - | -
09 Dec | Project Presentations Poster Session | - | - | -
14 Dec | Final exam | - | - | -


 Assignments

All written homeworks are due on the Wednesday 8 days from when they are assigned. See the schedule above for due dates. Use handin and submit PDF files. Programming assignments will sometimes be given more time if they require more depth. The final programming assignment will be a group project for teams of 1 to 3 students.

Written Homework

Programming Assignments

Final Quiz: Written take-home final, due 5:00 PM Thurs., Dec. 16; handin cs4961 final or deliver to MEB 3466

Final Project: The final project is a group project for teams of 1 to 3 students. You will combine two of the parallel programming models covered in class to develop a non-trivial application.

  • Project description
  • Poster presentation
  • Poster dry run in class, Tues., Dec. 7
  • Presentation in class, Thurs., Dec. 9
  • 2-4 page report summarizing the poster, project completion and software, due 11:59 PM, Tues., Dec. 14


 Policies

    Laptops in class: Laptops should not be used in class unless you have an ADA exemption (see ADA policy below).

    Cheating: Any assignment or exam that is handed in must be your own work. It must be in your own words, and based on your own understanding of the solution. However, talking with one another to understand the material better is strongly encouraged. When taking an exam, you must work independently. Any collaboration during an exam will be considered cheating. Any student who is caught cheating will be given an E in the course and referred to the University Student Behavior Committee.

    ADA: The University of Utah conforms to all standards of the Americans with Disabilities Act (ADA). If you wish to qualify for exemptions under this act, notify the Center for Disabled Students Services, 160 Union.

    College guidelines: Document concerning adding, dropping, etc. here.