Advanced Database Seminar:
Indexing Methods in Databases
Class Meetings:
Monday 3:00 - 6:00 in MUG 205
Instructor:
George
Kollios
Room: MCS 288
E-mail: gkollios@cs.bu.edu
URL: http://www.cs.bu.edu/fac/gkollios/web1.html
Office Hours: Thursday 2:00-5:00 or by appointment
R-Tree Implementations
Spatial, Temporal and High-Dimensional
Datasets
Suggested Topics for Projects
Course Description:
The objective of this course is to present and discuss indexing
methods for modern database systems. An indexing scheme is a method to
organize data in external memory (hard disks) and efficiently
answer a given type of queries. The design of a good index structure
depends on the sort of objects that need to be indexed and the set
of queries that we want to execute efficiently. Therefore, different type
of objects and queries require different index structures.
In this seminar, we will discuss research papers from the following topics:
-
Spatial Databases: Indexing 2- and 3-dimensional objects.
-
Temporal Databases: Here objects have also a temporal
component.
-
Spatiotemporal Databases: Indexing spatial objects
that change over time.
-
Image and Multimedia Databases (High Dimensional Indexing):
Indexing large collections of images and videos for efficient similarity
retrieval.
-
Web Databases: Indexing support for Web databases.
Also, we will discuss papers in the following topics:
Analysis and Index Benchmarking: Modeling and predicting
the performance of index methods under various workloads.
Theory of Indexability: Some recent results on theoretical
approaches to study the indexing problem.
At the end of the course we may also discuss some recent
advances in data mining.
Prerequisites:
Algorithms and data structures knowledge. Good programming
skills in C or C++. A database background is a plus.
Note that this course is geared toward PhD and Masters
students. Please see the instructor if you are an undergraduate student
or you are not sure about your level of preparation.
Textbooks (Not Required):
The following textbooks are not required but they can serve
as reference to provide some background for the problems discussed
in the papers. Most of these books should be available on reserve at the
library.
R. Ramakrishnan and J. Gehrke. Database
Management Systems. Second Edition. WCB/McGraw-Hill
1998.
A. Silberschatz, H. Korth and S. Sudarshan. Database
System Concepts. Third Edition. McGraw-Hill 1998.
P. O'Neil, E. O'Neil. Database:
Principles, Programming, and Performance. Second Edition.
Morgan Kaufmann 2000.
An excellent collection of classic papers in the database
field is the following:
Readings in
Database Systems. M. Stonebraker and J. M. Hellerstein, eds.
Morgan-Kaufmann, 1998.
Other books and manuscripts that discuss indexing methods
in more detail are:
H. Samet. The
Design and Analysis of Spatial Data Structures . Addison-Wesley, Reading,
MA, 1990.
C. Faloutsos. Searching
Multimedia Databases by Content. Kluwer Academic Publishers,
1996.
Y. Manolopoulos, Y. Theodoridis and V. Tsotras. Advanced
Database Indexing. Kluwer Academic Publishers, 1999.
Course Requirements:
All students will take turns presenting papers. There will
be two to four papers presented in each class and each student will give
one or two presentation during the course. A student preparing a paper
presentation is also responsible for making up two questions for the paper.
Students are required to carefully read the papers before class.
Also you have to write three short reviews (at least one page each review)
and propose at least an idea for an improvement over the ideas presented
in the paper.
There will be 2 in-class quizzes testing the main concepts
in the papers discussed in class.
Finally, this seminar requires a semester-long project
and a final report in the style of a conference paper. The project
can be implementation oriented (implementation and experimental comparison
of a collection of index methods), or can be the development of a novel
algorithm.
The overall grade will be based on the following policy:
-
Paper presentation and class participation: 40%
-
Two in-class quizzes: 20%
-
Project: 40%
Interesting Links:
VLDB Endowment
ACM SIGMOD
DBPL
Bibliography