Time: Series of three lectures at 327 Yost Hall
Tuesday, Feb. 27, 4:00-5:00
Thursday, Mar. 1, 11:30-12:30
Friday, Mar. 2, 4:00-5:00
The topics of Data Mining and Knowledge Discovery in Databases have
gained a lot of prominence with automated methods of data collection
in this information age. Broadly speaking, data mining is the
extraction of useful information from large amounts of data, often
collected without any pre-defined purpose in mind. Most present-day
applications are commercial, although other areas exist. Some
examples include discerning customer preferences from transaction
data for better store layout and targeted advertising; clustering
software-metrics databases to develop automated techniques for
determining procedures that need to be upgraded together; deciding
what is of related interest to a person who has entered the query
"car" in a search engine; and scheduling classes to minimize
commuting students' discomfort. Algorithms used in data mining are
both data- and computer-intensive. Because the
underlying database is observational in nature, statistical techniques
play a natural role. The lecture series will focus on the basic needs
of data mining, the available statistical methodology, and areas
requiring further attention. Applications will be highlighted
throughout the series.
The basic outline for the three-part series is: