Friday, November 2, at 327 Yost
Refreshments: 3:30-4:00 p.m.
Talk: 4:00 - 5:00 p.m.
Statistical analysis of data typically consists of two stages: (1) the
building of a model for the data, and (2) formal, mathematical-probabilistic
inferences conditional on the model. In stage (1), we make model
inferences: specifications of systematic and haphazard variation in the
data. Model building is complex because it requires combining information
from two sources: (1) external --- information from sources external to the
data such as subject matter theory and other sets of data, and (2) data ---
information that arises from studying the data. A vast array of methods of
analysis contribute in practice to the model building process: chi^2-tests,
normal quantile plots, regression residual analysis, etc. Often, the model
building phase of a data analysis is the salient part of the analysis and
the mathematical-probabilistic phase is routine. But the situation is
reversed for the theory of statistics. A vast amount of theory exists for
stage (2); in fact, there are not one but two distinct paradigms, Bayesian
and frequentist. Each governs the whole course that theory takes, and
results in one are not directly translatable to the other. By contrast,
there is almost no theory that governs model specification inference. We
propose a theory for the model building phase of data analysis. The theory
deals with the combination of external and data information in carrying out
model inferences. One key aspect invokes Savage's principle of stable
estimation. The theory provides a mechanism for assessing model building
methods; for example, the theory explains why visualization methods are
such powerful tools for model building.