Data mining is the process of automatically discovering useful information in large data sets or databases. This course introduces the main topics and algorithms in data mining and knowledge discovery, including association discovery, classification, clustering, outlier detection, and database support. Emphasis is placed on algorithmic and systems issues, as well as on applications of mining to real-world problems. Students will complete small written and programming assignments that help them understand and digest the material covered.

**Instructor**

Prof. George Kollios, gkollios@cs.bu.edu

Office: MCS 288

Office Hours: Monday 2:30 pm - 4:00 pm and Tuesday 10:25 am - 11:55 am, or by appointment.

Phone: 617-358-1835

http://www.cs.bu.edu/fac/gkollios

**Workload**

- Three programming projects
- Three problem sets
- Two exams: one midterm and one final

**Prerequisites**

Working knowledge of programming and data structures (CS 112 or equivalent). Familiarity with linear algebra, probability, and statistics.

**Textbook**

- Jiawei Han and Micheline Kamber: Data Mining: Concepts and Techniques. Second Edition. Morgan Kaufmann Publishers, March 2006.

**Lectures**

MW 4:00 pm - 5:30 pm in MCS B33

**Exams**

**Midterm: October 29, 2007, in class.**

**Final: December 17, 2007, 4:00 - 6:00 pm**