CS 6960: Database Kernels and Large Data Management, School of Computing, University of Utah

Instructor: Feifei Li

Overview

This is a graduate-level course on the design and implementation of (relational) database system kernels, as well as other large-scale data management techniques. It reviews the relational data model (including relational algebra) and the relational query language SQL. It examines in depth file organization, database storage, indexing and hashing, query evaluation and optimization, transaction processing, concurrency control and recovery, and (if the schedule allows) database integrity and security. Beyond relational database kernels, the course also investigates recent developments in other large-scale data management techniques, e.g., streaming algorithms, the MapReduce framework (in particular, the Hadoop system), and other I/O-efficient techniques (if time permits).

Students will participate in a semester-long project and build a mini database system by implementing several core modules of a relational database system. There may also be projects on other large-scale data management techniques, such as sketching or MapReduce-based projects, if time allows.

In summary, this course is about the principles of designing and implementing database kernels, as well as other relevant large data management techniques. Please note that this is NOT an introduction to database systems or a course on building database applications; that is, we will not cover how to build a database application (e.g., ER design, schema refinement, functional dependencies, and database application development). Those topics are covered in CS 5530/6530.

Announcements

        08/08/11: Course website is up. Course syllabus is ready for review.
        09/05/11: HW1 is due on 09/07/11.
        09/07/11: HW2 is due on 09/14/11.
        09/15/11: HW2 solution is now released.
        09/15/11: The Makeup lecture is 9:20-10:20am, 09/16/11, Friday.
        09/16/11: Project 1 is now released.
        09/18/11: IMPORTANT: It is very important to familiarize yourself with the MiniBase system that you will be implementing throughout the semester. General information about MiniBase can be found HERE. The most important document to read is "The Overview of Single-User MiniBase". Please make sure that you read the relevant documents following that link thoroughly. For Project 1 in particular, you should understand the HeapFile-related topics. You should also learn the Error protocol (including the details in Error Interface) of the MiniBase system, which will be useful for all projects throughout the course.
        09/26/11: HW3 and Project 2 have been released.
        09/28/11: Midterm will be on Wed, 10/05/11, in class.
        10/03/11: HW3 solution has been released.
        10/03/11: Please sign up for the class mailing list at https://sympa.eng.utah.edu/sympa/info/cs6960 (click on subscribe). You may use any email address you prefer; just make sure it is one that you check regularly.
        10/03/11: Electronic submission for Project 2 is enabled. You can submit using either of the following options: 1) log in with your CADE account at https://cgi.eng.utah.edu/webhandin; or 2) use the command line on the Linux machines (in the CADE lab): handin cs6960 projx /path/to/file (where x is the project number, e.g., proj2, proj3, proj4, proj5, proj6).
        10/10/11: Project 3 is now released.
        10/26/11: HW4 is now released.
        11/07/11: Project 4 is now released.
        11/16/11: The due date for Project 4 has been extended by a week.
        11/16/11: HW5 is now released.
        11/28/11: HW6 is now released.
        12/08/11: HW6 solution is now released.

Syllabus

The syllabus for the course is available in PDF format. Please pay special attention to the prerequisites for the course.

Textbook

Database Management Systems by R. Ramakrishnan and J. Gehrke, 3rd Edition. Details and additional material supporting this book can be found here (we are using the third edition).

Schedule

Lecture: MW 1:25pm to 2:45pm, MEB 2325.

Office Hour: MW 10:30am to 11:30am.

Exam: Midterm: Wed, Oct 5th; Final: Thursday, Dec 15, 1-3pm.

Drop: Last day to withdraw from the class: 10/21/2011.

A detailed weekly schedule will be posted in the course syllabus.

Slides

Slides will be posted before each lecture. Updates to slides may happen after the lecture.
Lecture 1: Course Administration and Introduction, Overview of Database Systems
Lecture 2: Relational Model
Lecture 3: Relational Algebra
Lecture 4: SQL-1 
Lecture 5: SQL-2  
Lecture 6: SQL-3 
Lecture 7: Disks and Files 
Lecture 8: File Organizations and Indexing
Lecture 9: Tree-Based Indexing
Lecture 10: Hash-Based Indexing
Lecture 11: External Sorting
Lecture 12: Single Table Query Evaluation
Lecture 13: Query Evaluation
Lecture 14: Query Optimization
Lecture 15: Concurrency Control
Lecture 16: More on Concurrency Control
Lecture 17: Crash Recovery

Written Assignments

1. Each assignment will be posted after it is announced in class.

2. Solutions will be posted once the assignment due date has passed.

WA1: PDF FILE HERE, Due: 09/07/11, Wednesday, In Class.
WA2: PDF FILE HERE, Due: 09/14/11, Wednesday, In Class.
WA3: PDF FILE HERE, Due: 10/03/11, Monday, In Class.
WA4: PDF FILE HERE, Due: 11/02/11, Wednesday, In Class.
WA5: PDF FILE HERE, Due: 11/28/11, Monday, In Class.
WA6: PDF FILE HERE, Due: 12/07/11, Wednesday, In Class.

Project

1. Each project description will be available once it is announced in class.

2. Sample solutions to the projects will not be posted; however, the output for the test traces will be available.

Project 1: Implementing HeapPage for the disk manager in the DBMS.
Project 2: Implementing the buffer manager in the DBMS.
Project 3: Implementing the disk-based B+ tree in the DBMS.
Project 4: Implementing the external merge sort in the DBMS.
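
For a rough sense of what the external merge sort project involves, the idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the MiniBase interface or the project's required design: the function name external_merge_sort, the memory_limit parameter (the number of values held in memory at once), and the one-integer-per-line run files are all hypothetical, and all runs are merged in a single pass rather than a bounded number at a time over several passes.

```python
import heapq
import os
import tempfile


def _write_run(sorted_values, run_dir):
    """Write one sorted run to a temporary file, one integer per line."""
    fd, path = tempfile.mkstemp(dir=run_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w") as f:
        for v in sorted_values:
            f.write(f"{v}\n")
    return path


def _read_run(path):
    """Lazily yield the integers stored in a run file."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_merge_sort(values, memory_limit, run_dir):
    """Return an iterator over `values` in sorted order.

    Hypothetical sketch: at most `memory_limit` values are buffered
    while creating runs, and run files are stored under `run_dir`.
    """
    if memory_limit < 1:
        raise ValueError("memory_limit must be at least 1")

    # Pass 0: cut the input into chunks that fit in memory,
    # sort each chunk, and write it out as a sorted run.
    run_paths = []
    buffer = []
    for v in values:
        buffer.append(v)
        if len(buffer) == memory_limit:
            run_paths.append(_write_run(sorted(buffer), run_dir))
            buffer = []
    if buffer:
        run_paths.append(_write_run(sorted(buffer), run_dir))

    # Merge pass: stream all runs through one k-way merge.
    # heapq.merge keeps only one pending value per run in memory.
    return heapq.merge(*(_read_run(p) for p in run_paths))
```

For example, calling list(external_merge_sort(data, 4, some_dir)) on a list of integers writes runs of at most four values each into some_dir and then merges them. A disk-based implementation would work with pages and a limited number of buffer frames instead of Python lists, which is where most of the real design work lies.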

Additional Resources

Database Research: ACM SIGMOD  VLDB  IEEE ICDE

Database Products: Oracle   Microsoft SQL Server  IBM DB2  PostgreSQL  MySQL

Contact

Feifei Li