ECS 165B Spring 2011 - Database System Implementation

DavisDB Project

Project Overview

The main focus of the course is the DavisDB project. DavisDB is a simplified but complete single-user relational database management system written in C++. It involves a significant amount of programming, to be carried out in teams of two. (Singleton teams are allowed, but the project requirements are the same for both kinds of teams, and no special allowance is made in project grading.) The overall design of the project is highly structured, but there is considerable room within the specification for student design and ingenuity. The project is divided into four parts:
  1. Record File Management Component. In this part, you will implement a set of functions for managing unordered files of database records. This component will rely on a lower-level Paged File component that we will provide. The Paged File component provides an in-memory buffer pool and performs low-level file I/O at the granularity of pages.

  2. Indexing Component. In this part, you will implement support for building B+-tree indices on records stored in unordered files.

  3. System Management Component. In this part, you will implement a potpourri of database and system utilities, including data definition commands and system catalog management. This component will rely on the Record Management and Indexing components, as well as a command-line parser which we will provide.

  4. Query Engine Component. In this part, you will implement a query evaluation engine for a fragment of SQL involving select-project-join queries and updates. This component relies on the three earlier components, and will also use the provided command-line parser.
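
As a rough illustration of what the Record File Management component involves, the sketch below shows one common way to lay out fixed-length records inside a single page: an occupancy bitmap at the front of the page, followed by record slots, with each record identified by a (page number, slot number) pair. It is only a sketch under stated assumptions, not the DavisDB design. The names RecordId and RecordPage, the 4096-byte page size, and the bitmap layout are all hypothetical and are not taken from the provided Paged File component or the project specification.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// All names and constants here are hypothetical illustrations,
// not part of the DavisDB code distribution.
constexpr std::size_t kPageSize = 4096;  // assumed page size in bytes

// Identifies a record by the page it lives on and its slot within that page.
struct RecordId {
    int pageNo;
    int slotNo;
};

// One page of fixed-length records: an occupancy bitmap, then the slots.
class RecordPage {
public:
    // Each slot costs recordSize bytes plus one bitmap bit, so the slot
    // count n is the largest value with n * (8 * recordSize + 1) <= 8 * kPageSize.
    explicit RecordPage(std::size_t recordSize)
        : recordSize_(recordSize),
          numSlots_(static_cast<int>((kPageSize * 8) / (recordSize * 8 + 1))),
          bitmapBytes_((static_cast<std::size_t>(numSlots_) + 7) / 8),
          data_(kPageSize, 0) {}

    int numSlots() const { return numSlots_; }

    // Copies the record into the first free slot.
    // Returns the slot number, or -1 if the page is full.
    int insert(const char* record) {
        for (int s = 0; s < numSlots_; ++s) {
            if (!isUsed(s)) {
                std::memcpy(slotPtr(s), record, recordSize_);
                setUsed(s, true);
                return s;
            }
        }
        return -1;
    }

    // Copies the record in the given slot into out; false if the slot is empty.
    bool get(int slot, char* out) const {
        if (!inRange(slot) || !isUsed(slot)) return false;
        std::memcpy(out, slotPtr(slot), recordSize_);
        return true;
    }

    // Marks the slot free; false if it was not occupied.
    bool remove(int slot) {
        if (!inRange(slot) || !isUsed(slot)) return false;
        setUsed(slot, false);
        return true;
    }

private:
    bool inRange(int slot) const { return slot >= 0 && slot < numSlots_; }

    bool isUsed(int slot) const {
        return ((data_[slot / 8] >> (slot % 8)) & 1u) != 0;
    }

    void setUsed(int slot, bool used) {
        const unsigned char mask = static_cast<unsigned char>(1u << (slot % 8));
        if (used) {
            data_[slot / 8] = static_cast<unsigned char>(data_[slot / 8] | mask);
        } else {
            data_[slot / 8] = static_cast<unsigned char>(data_[slot / 8] & ~mask);
        }
    }

    unsigned char* slotPtr(int slot) {
        return data_.data() + bitmapBytes_ +
               static_cast<std::size_t>(slot) * recordSize_;
    }
    const unsigned char* slotPtr(int slot) const {
        return data_.data() + bitmapBytes_ +
               static_cast<std::size_t>(slot) * recordSize_;
    }

    std::size_t recordSize_;
    int numSlots_;
    std::size_t bitmapBytes_;
    std::vector<unsigned char> data_;  // stands in for a buffer-pool page
};
```

In a design along these lines, a record file would be a sequence of such pages, and a RecordId would name one record by combining the page number with the slot number returned by insert. Deleting a record only clears its bitmap bit, so the slot can be reused by a later insert without moving other records.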

Doxygen-generated documentation for the various classes and interfaces can be found here.

Computing Platform

Build platform. The standard build platform for DavisDB will be the 32-bit Linux workstations in the CSIF labs in the basement of Kemper Hall. You are free to develop your code on whatever platform you wish (e.g., Mac OS X laptop, Windows machine running Cygwin), but problems you encounter on such platforms will not be supported, and it is up to you to ensure that your implementation builds and runs properly on the lab machines.

Debugging. Use of gdb or an IDE with integrated debugger, such as Eclipse with CDT, is encouraged for this class. There are currently some issues with the version of Eclipse installed on the lab machines; see here for a temporary workaround that should get Eclipse working.

Subversion. We will also require all students to coordinate their team efforts, and submit their project components, via subversion. Subversion repositories have been set up for each group on the CSIF machines at /home/cs165b/CSIF-Proj/cs165b-X/trunk/DavisDB, where X is the number of your group. Access to the repositories is restricted (via Linux permissions) to the members of your group, the instructor, and the TA. To get started, from the CSIF machines, type svn co file:///home/cs165b/CSIF-Proj/cs165b-X/trunk/DavisDB (replacing X with the number of your group). For example, user green in group 0 would type this and see:

  [green@pc12 ~]$ svn co file:///home/cs165b/CSIF-Proj/cs165b-0/trunk/DavisDB
  A    DavisDB/RecordFileHandle.h
  A    DavisDB/PageFileHandle.h
  A    DavisDB/PageFileManager.cpp
  A    DavisDB/RecordManager.cpp
  ...
  A    DavisDB/Common.h
  Checked out revision 3.

As the output indicates, the result is that directory DavisDB is created underneath the current directory, with the code distribution files inside. The distribution can also be obtained from outside CSIF via ssh, using subversion's "svn+ssh" feature. For example, a user with CSIF account name green would type:

  laptop:~ tjgreen$ svn co svn+ssh://green@pc12.cs.ucdavis.edu/home/cs165b/CSIF-Proj/cs165b-0/trunk/DavisDB
  A    DavisDB/RecordFileHandle.h
  A    DavisDB/PageFileHandle.h
  ...
  laptop:~

The particular machine name pc12 above does not matter; it can be any of the CSIF machines. By default, the above command will prompt you for your CSIF password; since you will be executing many more subversion commands later, you will probably want to follow these directions to set up secure password-free authentication.

If you have already started coding the first project component, after checking out the code from the repository, you will want to copy your versions of any changed files into the created directory, commit your changes via svn commit, and work in that directory from now on. Some basic subversion commands:

Use svn help for a full list of commands.

When working in teams, please do not email or scp files back and forth - use the repository instead! (That's the whole point.) Subversion keeps a full record of every version of every file (even deleted files), so keep in mind that you can always undo changes made accidentally.

Whether you are working alone or in a team, please remember to svn commit your changes regularly. If nothing else, subversion will at least serve as a file backup in case your hard drive fails. Also, don't forget to svn add any new files you introduce (and remember to update CMakeLists.txt too).

Submitting your code. Included in the code distribution is a shell script, submit.sh, that you will use to submit your code. The basic usage is "submit.sh n" where n is the number of the project component you are submitting. It must be run from the directory containing your code. After submitting your code, the script will also run a test build in a temporary directory to make sure that it compiles from the command line. Type "submit.sh -?" to see further information:

    [green@pc12 ~] ./submit.sh -?
    Usage: submit.sh <hw#>
    where <hw#> is a number in the range [1,5]

    Submits your project component by tagging the current version of
    your subversion repository as the submitted version.  It may be
    executed multiple times for the same <hw#>.  The most
    recently submitted version is the one that will be used for
    grading (and its timestamp will determine any late penalties).
    This script must be run from your subversion DavisDB directory.

    After submitting, the script will also run a test build of your
    project, by checking out the submitted version into a temporary
    directory and executing "cmake ." then "make".

    Note that since the submission procedure uses svn, if you are
    running submit from outside CSIF, it may prompt several times for
    your CSIF password.

    DavisDB%

Late Policy

The late policy is as follows. Please read carefully!

Policy on Collaboration

No source code may be shared across student teams. We will run automated tools to detect instances of source code copying, and deal harshly with any violations of this policy. Sharing of test code is permitted (and encouraged!), with the proviso that such sharing must be done via the class mailing list, so that everyone can benefit equally. Teams are also allowed (and encouraged!) to help each other with debugging.

DavisDB I/O Efficiency and Code Beauty Contests

After the first four project components have been completed, we will hold two contests, the DavisDB I/O Efficiency Contest and the DavisDB Code Beauty Contest, and give extra credit to the winners of these contests. More details will be posted soon.

Acknowledgments

Major aspects of the DavisDB project are derived from the RedBase project developed by Jennifer Widom for use in CS 346 at Stanford, and used here with her permission.