Instructor: Feifei Li
Graduate-level course on the design and implementation of (relational) database
system kernels, as well as other large-scale data management techniques and systems. Review the relational data
model (including relational algebra) and relational query language: SQL. Examine in depth file organization,
database storage, indexing and hashing, query evaluation and optimization, transaction
processing, concurrency control and recovery, database integrity and security (if schedule allows).
In addition to the study of relational database kernels, this course also investigates the latest developments
in other large-scale data management techniques and systems, e.g., the MapReduce framework
(in particular, the Hadoop system), NoSQL systems, key-value stores (Cassandra, HBase, Google Bigtable), other I/O-efficient techniques (if time permits),
and in-memory computing platforms such as Spark.
Students will participate in a semester-long project and build a mini-database system by implementing
several core modules in a relational database system. There might also be projects on other large-scale
data management techniques, such as MapReduce-based projects, if time allows. In
summary, this course is about the principles of designing and implementing database kernels, as well as
other relevant large-scale data management techniques. Please note that this is NOT a course on building
database applications or an introduction to database systems, i.e., we will not cover in this course how to
build a database application (e.g., ER design, schema refinement, functional dependency, and database
application development). Such topics are covered in CS 5530.
08/22/16: Course website is up. Course syllabus is ready for review.
08/23/16: Quick review of Java: Java 1, Java 2
08/25/16: Exploring SQL through examples and animations: SQL ZOO
08/30/16: HW1 will be based on the sample database from the PostgreSQL tutorial: postgresql tutorial. In particular, please
go over the sample database schemas.
08/31/16: An open-source PostgreSQL client program with a GUI is pgAdmin.
09/08/16: The Java DOC for SimpleDB is available here: SimpleDB Java DOC.
Syllabus for the course in PDF format:
Database Management Systems by R. Ramakrishnan and J. Gehrke, 3rd Edition.
Details and additional material supporting this book can be found
here (we are using the third edition).
Complementary Reading: Database Systems: The Complete Book, 2nd Edition.
Complementary Reading: The Red Book: Readings in Database Systems, 5th Edition, www.redbook.io
Both books have been reserved at the library.
Lecture: TH 12:25pm to 1:45pm, WEB 1250.
Office Hour: T & H 2:00pm to 3:30pm, MEB 3464.
TA: Guineng Zheng and Jiyuan Li (emails at the bottom of the page)
Guineng, Office Hour: M, 1:00pm-2:30pm, W, 9:30am-11:00am. Office: MEB 3115
Jiyuan, Office Hour: Th, 9:30am-11:00am, F, 1:00pm-2:30pm. Office: MEB 3115
Exam: Final: Monday, December 12, 10:30am-12:30pm, in class.
Drop: Last day to withdraw from the class: Friday, 10/21/2016.
A detailed weekly schedule will be posted in the course syllabus.
Slides will be posted before each lecture. Updates to slides may happen after
the lecture.
Lecture 1: Course Administration and Introduction, Overview of Database Systems
Lecture 2: Relational Model
Lecture 3: Relational Algebra
Lecture 4: Tuple Relational Calculus
Lecture 4: SQL
Lecture 5: Storage Engine, Buffer, Files
Lecture 6: SimpleDB Overview
Lecture 7: Access Paths and Access Methods
Lecture 8: Hashing Index
Lecture 9: Tree Index
Lecture 10: External Sort
Lecture 11: Query Evaluation
Lecture 12: Query Optimization
Lecture 13: Transaction
Lecture 14: Concurrency Control
Lecture 15: Crash Recovery
Lecture 16: Data Warehouse, MapReduce, Spark
Lecture 17: Spark
Lecture 18: NoSQL
Lecture 19: Column Store
1. Assignments will be posted after they are announced in class.
2. Solutions will be posted once the assignment due date has passed.
3. To work on more exercises on your own, the solutions to all odd-numbered questions in the textbook are available here.
1. The project description will be available when it is announced in class.
2. A sample solution to the project will not be posted; however, the output for the test traces will be available.
Database Research: ACM SIGMOD VLDB IEEE ICDE
Database Products: Oracle, Microsoft SQL Server, IBM DB2, PostgreSQL, MySQL
Feifei Li
TA: Guineng Zheng
TA: Jiyuan Li