Small Area Inference for Binary Variables
in the National Health Interview Survey
JOSEPH SEDRANSK
Statistics, Case Western Reserve University
The National Health Interview Survey is designed to produce precise
estimates of finite population parameters for the entire United States
but not for small geographical areas or subpopulations. Our
investigation concerns estimates of proportions such as the
probability of at least one visit to a doctor within the past twelve
months. To include all sources of variation in the model, we carry
out a Bayesian hierarchical analysis for the desired finite population
quantities. First, for each cluster (county), a separate logistic
regression relates the individual's probability of a doctor visit to
his or her characteristics. Second, a multivariate linear regression
links cluster regression parameters to covariates measured at the
cluster level.
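
As a rough illustration of the two-stage structure described above, the
following Python sketch simulates data from a model of that shape. It is
not the speaker's implementation: the dimensions, the coefficient matrix
Gamma, the noise scale, and the function name simulate_two_stage are all
hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def simulate_two_stage(n_clusters=5, n_per_cluster=200, seed=0):
    """Simulate binary outcomes from a two-stage hierarchical logistic model.

    All numeric values below are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)

    # Stage 2: cluster-level covariates Z (intercept plus one covariate)
    # drive the cluster regression coefficients through a linear model.
    Z = np.column_stack([np.ones(n_clusters), rng.normal(size=n_clusters)])
    Gamma = np.array([[0.5, -0.3],
                      [0.2, 0.4]])  # hypothetical (cluster covariates x coefficients)
    beta = Z @ Gamma + 0.1 * rng.normal(size=(n_clusters, 2))

    # Stage 1: within each cluster, a logistic regression on individual
    # characteristics X (intercept plus one covariate) gives P(y = 1).
    X = np.stack([
        np.column_stack([np.ones(n_per_cluster), rng.normal(size=n_per_cluster)])
        for _ in range(n_clusters)
    ])  # shape: (clusters, individuals, coefficients)
    logits = np.einsum("cnp,cp->cn", X, beta)
    prob = 1.0 / (1.0 + np.exp(-logits))
    y = rng.binomial(1, prob)  # binary indicator, e.g. "at least one doctor visit"
    return X, Z, beta, prob, y


if __name__ == "__main__":
    X, Z, beta, prob, y = simulate_two_stage()
    print("observed proportion per cluster:", y.mean(axis=1))
```

The sketch only generates data; the posterior computations discussed in
the talk are not attempted here.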
We describe the numerical methods needed to obtain the desired posterior
moments. Then we compare estimates produced using the exact numerical
method with those from approximations. Finally, we compare the hierarchical Bayes
estimates with empirical Bayes estimates and with standard methods, that
is, synthetic estimates and estimates obtained from a conventional
randomization-based approach. We use a cross-validation exercise to assess
the quality of model fit. We also summarize the results of a separate
study of the binary indicator of partial work limitation. Because we know
the value of this variable for each respondent to the 1990 Census long
form, we can compare estimates corresponding to alternative methods and
models with very accurate estimates of the true values.
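
For readers unfamiliar with the standard methods mentioned above, the
following Python sketch contrasts a direct (survey-weighted) estimate of a
proportion with a simple synthetic estimate. It reflects the usual
textbook forms of these estimators, not necessarily the exact versions
used in the talk; the function names, argument names, and example numbers
are hypothetical.

```python
import numpy as np


def direct_estimate(y, weights):
    """Survey-weighted proportion computed from the small area's own sample."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w * y) / np.sum(w))


def synthetic_estimate(national_rates, area_shares):
    """Apply national rates by demographic group to the area's group composition.

    national_rates: proportion with the attribute in each group, estimated
                    from the full (large) sample.
    area_shares:    the small area's population counts or shares by group;
                    normalized here so they sum to one.
    """
    rates = np.asarray(national_rates, dtype=float)
    shares = np.asarray(area_shares, dtype=float)
    shares = shares / shares.sum()
    return float(np.dot(rates, shares))


if __name__ == "__main__":
    # Hypothetical numbers: four respondents in one county, two demographic groups.
    print(direct_estimate([1, 0, 1, 1], [1.0, 1.0, 1.0, 1.0]))
    print(synthetic_estimate([0.8, 0.6], [0.5, 0.5]))
```

The direct estimate uses only the area's own respondents and is therefore
noisy when the area sample is small, while the synthetic estimate borrows
national rates and ignores area-specific departures from them.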
Refreshments: 3:30 - 4:00 p.m. Friday, at 327 Yost
Talk: 4:00 - 5:00 p.m. Friday, at 327 Yost
Questions? jiayang@sun.cwru.edu
Wed Aug 13 13:54:29 EDT 1997