Parallel Programming (CS 4961)
Fall 2010
This course is a comprehensive exploration of parallel programming paradigms. It examines core concepts, focuses on a subset of widely used contemporary parallel programming models, and provides context with a small set of parallel algorithms. In the last few years, this area has been the subject of significant interest due to a number of factors. Most significantly, the advent of multi-core microprocessors has made parallel computing available to the masses. At the high end, major vendors of large-scale parallel systems, including IBM, Cray and Sun, have recently introduced new parallel programming languages designed for applications that exploit tens of thousands of processors. Embedded devices can also be thought of as small multiprocessors. The convergence of these distinct markets offers an opportunity to finally provide application programmers with a productive way to express parallel computation.
The course will be structured as lectures, homeworks, programming assignments and a final project. Students will complete four programming projects in which they express algorithms using selected parallel programming models and measure their performance. For the final project, teams of 2-3 students will implement codes that combine multiple programming models.
Prerequisites: CS 4400, or concurrent enrollment
35% | Programming projects (P1, P2, P3, P4)
25% | Written homeworks
25% | Quiz and Final
15% | Final project
Textbook: Principles of Parallel Programming by Calvin Lin and Lawrence Snyder (ISBN-10: 0-321-48790-7).
Date | Topics | Read | Assign | Notes
24 Aug | Introduction (ppt) (pdf): Importance of parallel programming | Chapter 1, pgs. 2-8, 25-26 | - | -
26 Aug | Introduction to parallel algorithms and correctness (ppt) (pdf): Concerns for parallelism correctness | Chapter 1, pgs. 8-24 | HW01 | Deps
31 Aug | Parallel Computing Platforms and Models of Execution (ppt) (pdf): A diversity of parallel architectures, taxonomy, and 6 examples | Chapter 2, pgs. 30-59 | - | -
02 Sep | CTA, Data and Task Parallelism (ppt) (pdf): Review Homework #1, Definition of Data and Task Parallelism | Part 2 and Ch. 4, pgs. 87-96 | HW02 | -
07 Sep | Data Parallelism in OpenMP (ppt) (pdf): Introduction to OpenMP and Parallel Loops | Chapter 6, pgs. 145, 193-199 | - | -
09 Sep | OpenMP, cont. and Introduction to Data-Parallel Algorithms (ppt) (pdf): Data decomposition, scheduling and load balance, review HW02 | Chapter 5, pgs. 112-132 | - | -
14 Sep | Introduction to SIMD (ppt) (pdf): Brief description of Sun T2 for P01 | - | P01 | Sun UltraSPARC T2
16 Sep | SIMD, cont. and Data Parallel Algorithms (ppt) (pdf): Example: Intel SSE-3 Multimedia Extension, related SIMT for GPUs | - | - | -
21 Sep | Red/blue and introduction to data locality (ppt) (pdf): Discuss programming assignment, work through red/blue, define reuse/locality | - | - | -
23 Sep | Writing parallel code and data locality (ppt) (pdf): Complete programming assignment | - | - | -
28 Sep | Data Locality, cont. (ppt) (pdf): Code restructuring techniques: permutation and tiling | - | P02 | -
30 Sep | Data Locality, cont. (ppt) (pdf): Code restructuring techniques: unroll-and-jam and scalar replacement | - | - | -
05 Oct | Task Parallelism (ppt) (pdf): OpenMP 3.0, Task-Parallel Algorithms | Chapter 5, pgs. 132-142 | - | -
07 Oct | Performance Concepts (ppt) (pdf): Speedup, parallelism overhead, locality | Chapter 3, pgs. 61-85 | - | -
19 Oct | Introduction to GPUs and CUDA (ppt) (pdf): Architecture and execution model | - | - | -
21 Oct | Programming in CUDA (pdf): SIMT execution and divergent branches (Malik) | - | - | -
26 Oct | Midterm Quiz I | - | - | -
28 Oct | CUDA, cont. (ppt) (pdf): Data placement, memory latency and bandwidth optimizations | - | - | -
02 Nov | Introduction to Message Passing (ppt) (pdf): What is MPI? Complexities of a distributed address space | Chapter 7, pgs. 202-228 | HW04 (Project proposal) | -
04 Nov | MPI Communication Operations (ppt) (pdf): Blocking, Buffered, Non-blocking, One-sided | Chapter 2, pgs. 54-55 | - | -
09 Nov | Key Concepts in Optimizing Sparse Algorithms (ppt) (pdf): Sparse matrix and sparse graph representations | - | - | -
11 Nov | Parallel Graph Algorithms (ppt) (pdf): Shortest Path, Minimum Cost Spanning Tree | - | - | -
16 Nov | CLASS CANCELLED | - | - | -
18 Nov | A Brief Survey of Functional Parallelism: Guest Lecture, Matt Might | - | - | -
23 Nov | Map Reduce (ppt) (pdf): Examine contemporary functional programming model | - | - | -
30 Nov | Future Directions for Parallel Computing (ppt) (pdf): Chapel and Transactional Memory | Chapters 9, 10 | - | -
02 Dec | Course Retrospective and Review (ppt) (pdf): Parallel Computing Research, intersection of HPC and commodity, future of the field | Chapter 11 | - | HW05 (project update)
07 Dec | Project Presentations, Dry Run | - | - | -
09 Dec | Project Presentations, Poster Session | - | - | -
14 Dec | Final exam | - | - | -