Project #1: Contention Hotspot

Project Overview

The first programming project will teach you how to identify and fix a hotspot under concurrent workloads in an in-memory hash table. The primary goal of this assignment is to become familiar with the low-level implementation details of high-performance hash tables and to learn how to use profiling tools like PERF. All the code in this programming assignment must be written in C. If you have not used C before, here's a short tutorial on the language. Even if you are familiar with C, go over this guide for additional information on writing code in the system.

This is a single-person project: each student must complete it individually (i.e., no groups).

Release date: Thu, September 1
Due date: Thu, September 15

Implementation Details

You can refer to this PERF examples documentation or this PERF tutorial on how to use PERF to profile the system. You can also refer to the thread-local storage concept taught in class to reduce the contention in the system.
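As a reminder of the thread-local storage concept, here is a minimal sketch in C11. It is not taken from the project source: the names local_ops, total_ops, total_lock, and run_tls_counter are hypothetical and exist only for illustration. Each thread updates its own private copy of a variable without any lock and touches shared state only once, at the end.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 64

/* Hypothetical names throughout; none of this comes from iceberg_table.c. */

/* Every thread gets its own private copy of this variable. */
static _Thread_local uint64_t local_ops = 0;

/* Shared total, updated only once per thread. */
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t total_ops = 0;

static void *tls_worker(void *p) {
    uint64_t iters = *(uint64_t *)p;
    for (uint64_t i = 0; i < iters; i++) {
        local_ops++;              /* no lock needed: private to this thread */
    }
    pthread_mutex_lock(&total_lock);
    total_ops += local_ops;       /* one short critical section per thread */
    pthread_mutex_unlock(&total_lock);
    return NULL;
}

/* Spawns nthreads workers and returns the combined operation count. */
uint64_t run_tls_counter(int nthreads, uint64_t iters) {
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pthread_t tids[MAX_THREADS];
    total_ops = 0;
    for (int t = 0; t < nthreads; t++) {
        int rc = pthread_create(&tids[t], NULL, tls_worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; t++) {
        pthread_join(tids[t], NULL);
    }
    return total_ops;
}
```

Calling run_tls_counter(4, 10000) returns 40000. The lock is taken four times in total rather than 40000 times, which is the reason thread-local state reduces contention.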

In this assignment, you will need to modify the following files:

include/iceberg_table.h
src/iceberg_table.c

You will not need to make any changes to any other file in the system. You can locally modify the main benchmark already included in the system to verify the correctness/performance of your implementation. But you will not submit those files.

You will also need to write a report on how you identified the hotspot with PERF profiling and how the profiling changes after your fix.

There are three steps to identify and fix the hotspot for the concurrent read workload in the system:

Analyze the PERF Profiling Results
Reduce the Contention with Per-Thread Data Structures
Test, Profile again, and Evaluate the Performance

Step #1 - Analyze the PERF Profiling Results

The first step is to build the main benchmark and run PERF against the ./main benchmark:

make main
perf record ./main

The benchmark takes the following arguments:

size: the base-2 logarithm of the number of items in the hash table. E.g., to insert 2^24 items, use 24.
nthreads: the number of threads.

To properly observe the contention bottleneck, you will need a machine with at least 8 cores. This particular hotspot generally shows up only at high thread counts (8 or more), which means that you are unlikely to observe its effects on your laptop.

The main benchmark performs three types of operations: inserts, queries, and deletes. You should run the benchmark with an increasing number of threads (1, 2, 4, 8) and observe which operation is not scaling. This will help you identify the operation that suffers from contention.

The PERF command will generate a result file, perf.data, in the directory from which you ran it. You then need to use PERF again to analyze the profiling result.
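One way to judge whether an operation is scaling is to compare its throughput at each thread count against the single-thread run. The sketch below is only an illustration: the function names are hypothetical and you must supply the throughput values from your own benchmark output. An efficiency close to 1.0 means near-linear scaling, while an efficiency that drops sharply as threads are added points to contention.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers; the throughput values come from your own runs. */

/* Speedup of an n-thread run relative to the 1-thread run. */
double speedup(double tput_1, double tput_n) {
    return tput_n / tput_1;
}

/* Parallel efficiency: 1.0 means perfect linear scaling. */
double efficiency(double tput_1, double tput_n, int nthreads) {
    assert(nthreads > 0);
    return speedup(tput_1, tput_n) / (double)nthreads;
}

/* Prints one row per thread count for a single operation type.
 * tput[0] must be the single-thread throughput. */
void print_scaling(const char *op, const int *threads,
                   const double *tput, int n) {
    printf("%-8s %8s %14s %8s %10s\n",
           "op", "threads", "ops/sec", "speedup", "efficiency");
    for (int i = 0; i < n; i++) {
        printf("%-8s %8d %14.0f %8.2f %10.2f\n",
               op, threads[i], tput[i],
               speedup(tput[0], tput[i]),
               efficiency(tput[0], tput[i], threads[i]));
    }
}
```

For instance, if an operation only doubles its throughput when going from 1 to 8 threads, its efficiency is 2 / 8 = 0.25, which marks it as the candidate to profile.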

perf report

The percentages of the sampled on-CPU functions will show up in the window. You will see that three functions from main.cc are at the top (aside from the kernel scheduling function), each higher than 10%. You will then drill into the three hottest functions in the PERF results and examine the annotated code from PERF. The contention comes from some shared resources in iceberg_table.c that are protected by a lock. It is your job to identify which resources they are and which lock it is.
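To make the pattern concrete, here is a generic sketch of a lock-protected shared resource that becomes a hotspot as threads are added. It is not the code in iceberg_table.c: shared_counter_t and run_locked_counter are hypothetical names, and the actual resource and lock are for you to find. In an annotated PERF view, code like this typically shows most samples around the lock and unlock calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 64

/* Hypothetical shared state: one counter guarded by one mutex. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t count;
} shared_counter_t;

typedef struct {
    shared_counter_t *sc;
    uint64_t iters;
} worker_arg_t;

static void *locked_worker(void *p) {
    worker_arg_t *a = p;
    for (uint64_t i = 0; i < a->iters; i++) {
        /* Every thread serializes on the same lock for every operation. */
        pthread_mutex_lock(&a->sc->lock);
        a->sc->count++;
        pthread_mutex_unlock(&a->sc->lock);
    }
    return NULL;
}

/* Spawns nthreads workers that all update the same counter. */
uint64_t run_locked_counter(int nthreads, uint64_t iters) {
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    shared_counter_t sc = { .count = 0 };
    pthread_mutex_init(&sc.lock, NULL);
    worker_arg_t arg = { &sc, iters };
    pthread_t tids[MAX_THREADS];
    for (int t = 0; t < nthreads; t++) {
        int rc = pthread_create(&tids[t], NULL, locked_worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; t++) {
        pthread_join(tids[t], NULL);
    }
    pthread_mutex_destroy(&sc.lock);
    return sc.count;
}
```

The result is correct (run_locked_counter(4, 10000) returns 40000), but the threads spend their time waiting on each other, so adding threads does not add throughput.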

HINT: You will need to submit screenshots of your PERF analysis in your report (see Submission).

Step #2 - Reduce the Contention with Per-Thread Data Structures

You next need to fix the contention hotspot on the resources in iceberg_table.c by using another, highly concurrent version of the same resource. The highly concurrent version of the resource is already available as part of the source code given to you.

Having per-thread data structures does not mean that there cannot be concurrent operations on those data structures. It just reduces the level of contention. When there are concurrent operations on the per-thread data structures, you still need to protect them appropriately. You will not get any points for the programming part of the project if your solution does not guarantee correctness.
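The point about protecting per-thread structures can be illustrated with a sharded counter. This is a minimal sketch under stated assumptions, not the concurrent resource shipped with the project: sharded_counter_t, NSLOTS, and the 64-byte cache line size are all assumptions made for the example. Each thread mostly touches its own slot, but a reader sums every slot and two threads may map to the same slot, so the slots are still updated with atomic operations.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NSLOTS 8          /* assumed number of per-thread slots */
#define CACHE_LINE 64     /* assumed cache line size in bytes */
#define MAX_THREADS 64

/* One slot per thread, aligned so two slots never share a cache line. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_uint_fast64_t value;
} slot_t;

typedef struct {
    slot_t slots[NSLOTS];
} sharded_counter_t;

void sharded_init(sharded_counter_t *c) {
    for (int i = 0; i < NSLOTS; i++) {
        atomic_init(&c->slots[i].value, 0);
    }
}

/* tid is a hypothetical thread index. The add is atomic because threads
 * with tid >= NSLOTS share a slot with another thread. */
void sharded_add(sharded_counter_t *c, int tid, uint64_t delta) {
    atomic_fetch_add_explicit(&c->slots[tid % NSLOTS].value, delta,
                              memory_order_relaxed);
}

/* Any thread may read; the result is exact once all writers have finished. */
uint64_t sharded_read(sharded_counter_t *c) {
    uint64_t sum = 0;
    for (int i = 0; i < NSLOTS; i++) {
        sum += atomic_load_explicit(&c->slots[i].value, memory_order_relaxed);
    }
    return sum;
}

typedef struct {
    sharded_counter_t *c;
    int tid;
    uint64_t iters;
} sharded_arg_t;

static void *sharded_worker(void *p) {
    sharded_arg_t *a = p;
    for (uint64_t i = 0; i < a->iters; i++) {
        sharded_add(a->c, a->tid, 1);
    }
    return NULL;
}

/* Spawns nthreads workers and returns the summed count after joining. */
uint64_t run_sharded_counter(int nthreads, uint64_t iters) {
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    sharded_counter_t c;
    sharded_init(&c);
    pthread_t tids[MAX_THREADS];
    sharded_arg_t args[MAX_THREADS];
    for (int t = 0; t < nthreads; t++) {
        args[t] = (sharded_arg_t){ &c, t, iters };
        int rc = pthread_create(&tids[t], NULL, sharded_worker, &args[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; t++) {
        pthread_join(tids[t], NULL);
    }
    return sharded_read(&c);
}
```

Replacing the atomic add with a plain increment would appear to work whenever each thread has its own slot, yet it would lose updates as soon as two threads map to the same slot. That is the kind of correctness bug the paragraph above warns about.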

Step #3 - Test, Profile again, and Evaluate the Performance

You need to make sure that your implementation is correct before proceeding to evaluation. We have implemented some unit tests and basic benchmarks in our system as part of the main benchmark. You should also extend the tests by writing your own test cases or scaling up the number of CPUs in the benchmarks.

Then you need to repeat the profiling process from Step #1 to verify that your implementation has reduced the on-CPU percentages of the hot functions in iceberg_table.c. Finally, you should compare the performance (throughput) numbers of the benchmark before and after your fix, which are printed to the terminal after you execute the benchmark.

HINT: You also need to submit screenshots of your new PERF analysis in your report (see Submission).

Instructions

You can download the Project #1 source code (as a tar file) from Canvas. It is uploaded under files. You can extract the source code using the following command:

tar -xzvf project1.tar.gz

To debug any correctness issues, you can compile the main benchmark with the D=1 flag to turn off optimizations.

make clean
make D=1 main

Please refer to the IcebergHT paper for the details of the iceberg hash table.

You will use the CADE cluster to finish this project.

CADE manages clusters that you can use to do your development and testing for all of the class projects. You are free to use other machines and environments, but all grading will be done on these machines. Please test your solutions on these machines.

Check with CADE if you need to set up an account.

CADE machines all share your home directory, so you needn't log in to the same machine each time to continue working.

After you have an account, choose a machine at random from the lab1- set of machines on the lab status page (that is, lab1-1.eng.utah.edu through lab1-40.eng.utah.edu).

ssh lab1-10.eng.utah.edu

CADE user accounts have tcsh set as their default shell. Each time you log in, run bash before anything else. All instructions, examples, and scripts from this class assume you are using bash as your shell. You will need to do this every time unless you reset your default shell (which I'd recommend). Savvy users may be able to provide slicker setups. This step is important: if you don't switch to bash, other things will mysteriously break as you try to work through the labs.

PERF and other essential software are installed on all CADE lab1 machines.

Submission

You need to submit a tar.gz file of your source code to Canvas.

You should also include a report.pdf in your submission that contains:

A screenshot of the PERF profiling results which shows the two hottest functions in the main benchmark before your fix.
A screenshot of the PERF profiling results which shows the bottleneck in any one of the above two functions with the annotated code before your fix.
A brief analysis of how you identified the hotspot and the resources under contention with the help of the above profiling results.
A screenshot of the PERF profiling results which shows the new percentages of the on-CPU functions after your fix.
A screenshot of the PERF profiling results which shows the new bottleneck in any one of the original two hottest functions with the annotated code after your fix.
A brief analysis of how your implementation helps reduce the contention hotspot in the system, supported by the evidence in the above two screenshots.

We will evaluate the correctness and the performance of your implementation off-line after the project due date.

Collaboration Policy

Every student has to work individually on this assignment.
Students are allowed to discuss high-level details about the project with others.
Students are not allowed to copy the contents of a white-board after a group meeting with other students.
Students are not allowed to copy the solutions from another colleague.