Statistics 326/426, Multivariate Analysis and Data Mining
Professor Jiayang Sun, Office: Yost 326, Phone: 368-0630,
e-mail: jsun at case edu
- Office Hrs: TTh 4-4:45pm (subject to change)
- Teaching Assistant:
Vinay Bhandaru, email: vxb18 at case edu
- Office Hrs: MW 11am-1pm @Yost 322, 216-368-1498
- Other TAs/Tutors - call stat dept 368-6941 to confirm
- Lab Administrator:
Office Hrs: 12-1 MTWTh, and 1-3 W (subject to change)
at Yost 335, Phone 368-0417
- e-mail: help at stat case edu or
stats-help at case edu
- Class: 2:45-4:00PM on TTH at Yost 101
- Course Description:
- Stat 326/426
introduces classical and modern techniques for modeling, analyzing and mining
multivariate data. There will be case studies, labs and group discussions. Participation in group discussions is required. Most course information is accessible via my Web page. The general outline is:
- Graphical Methods for Multivariate Data
- One Multivariate Sample,
Multivariate ANOVA and Multivariate Regression
- Dimension Reduction Techniques:
principal components, correspondence analysis and projection pursuit
- Classification and Clustering:
multidimensional scaling, discriminant and cluster analysis,
and classification and regression trees (CART)
- Analysis of Covariance Structures/Latent Variable Models:
principal components (revisited), factor analysis and covariance
structure models (time permitting)
- Other Data Mining Techniques:
nearest neighbor, support vector machines, EM algorithms, boosting and bagging,
... (time permitting)
S-Plus (or R), some SAS, and the XGobi/GGobi packages will be used. Read a
news report on R from the New York Times.
- Recommended Text:
- Other References (most are on reserve in the Kelvin Smith Library):
- Everitt, B. and Dunn, G. (2001), Applied Multivariate Data Analysis, Second Edition,
New York: Oxford University Press, ISBN: 0195209370
- Everitt, Brian S. (2007), An R and S-Plus Companion to Multivariate Analysis,
Springer-Verlag, ISBN: 978-1-85233-882-4
- Härdle, W. and Simar, L. (2007), Applied Multivariate Statistical Analysis, Springer, ISBN: 978-3-540-72243-4
- Bishop, C.M. (2006), Pattern Recognition and Machine Learning, Springer.
- Hastie, T., Tibshirani, R. and Friedman, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer-Verlag.
- Fujikoshi, Y., Ulyanov, V.V. and Shimizu, R. (2010),
Multivariate Statistics: High-Dimensional and Large-Sample Approximations, Wiley.
- Breiman, Friedman, Olshen and Stone (1984),
Classification and Regression Trees,
Second Edition, The Wadsworth statistics/probability series, ISBN: 053498054
- Sun, J. (1998), Projection Pursuit, Encyclopedia of Statistical
Sciences (updated volumes), Vol. 2, pp 554-560, Wiley, (Edited by: Samuel Kotz, Campbell Read, David
Banks, and Norman Johnson.)
- Hair, Black, Babin, Anderson and Tatham (2010), Multivariate Data
Analysis, Seventh Edition, Prentice Hall, ISBN-10: 0-13-813263-1
- Some S/Splus/R, SAS and E-books.
- Final Examination: 12:30-3:30PM on May 3, 2012
- Grading Policy:
- For Stat 326, the assessment is based on homework assignments (50%) and
a final examination (50%).
For Stat 426, the assessment is based on
homework assignments (45%),
an oral presentation (10%)
and a final examination (45%).
A graduate student's presentation can be based on his/her ongoing research project to which some multivariate data analysis may be useful, or on an interesting topic/article approved by the instructor.
No late homework!
The minimum score to pass the course is 60 out of 100.