Small Area Inference for Binary Variables

in the National Health Interview Survey

JOSEPH SEDRANSK

Statistics, Case Western Reserve University

The National Health Interview Survey is designed to produce precise estimates of finite population parameters for the entire United States but not for small geographical areas or subpopulations. Our investigation concerns estimates of proportions such as the probability of at least one visit to a doctor within the past twelve months. To include all sources of variation in the model, we carry out a Bayesian hierarchical analysis for the desired finite population quantities. First, for each cluster (county) a separate logistic regression relates the individual's probability of a doctor visit to his or her characteristics. Second, a multivariate linear regression links cluster regression parameters to covariates measured at the cluster level.
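
As a reading aid, the two-stage structure described above can be written out as follows. This is only a sketch: the symbols ($y_{ij}$, $x_{ij}$, $\beta_i$, $Z_i$, $\eta$, $\Gamma$, $N_i$) are notation assumed here, not taken from the talk, and the normal form for the second stage is an assumption consistent with "multivariate linear regression" rather than a statement of the speaker's exact specification.

```latex
% Assumed notation: i indexes clusters (counties), j indexes individuals.
\begin{align*}
  y_{ij} \mid p_{ij} &\sim \mathrm{Bernoulli}(p_{ij}),
      && \text{(binary response, e.g. at least one doctor visit)} \\
  \operatorname{logit}(p_{ij}) &= x_{ij}^{\top}\beta_i,
      && \text{(stage 1: separate logistic regression per cluster)} \\
  \beta_i \mid \eta, \Gamma &\sim N(Z_i\,\eta,\; \Gamma),
      && \text{(stage 2: cluster-level covariates } Z_i\text{)} \\
  P_i &= \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij}.
      && \text{(finite population proportion for cluster } i\text{)}
\end{align*}
```

Under this reading, inference targets the posterior distribution of $P_i$, with priors on $\eta$ and $\Gamma$ completing the hierarchy.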

We describe the numerical methods needed to obtain the desired posterior moments. Then we compare estimates produced using the exact numerical method with those obtained from approximations. Finally, we compare the hierarchical Bayes estimates with empirical Bayes estimates and with two standard alternatives: synthetic estimates and estimates obtained from a conventional randomization-based approach. We use a cross-validation exercise to assess the quality of model fit. We also summarize the results of a separate study of the binary indicator of partial work limitation. Because the value of this variable is known for each respondent to the 1990 Census long form, we can compare the estimates from alternative methods and models with very accurate estimates of the true values.
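
To make the two standard alternatives concrete, here is a minimal Python sketch, assuming a toy sample of (area, demographic class, response) records. The function names, class labels, and data are hypothetical, and survey weights are ignored for brevity, so the direct estimate below is only an unweighted stand-in for a design-based estimate.

```python
from collections import defaultdict

def direct_estimate(sample, area):
    """Unweighted proportion of positive responses among one area's respondents."""
    ys = [y for (a, _, y) in sample if a == area]
    return sum(ys) / len(ys) if ys else None

def synthetic_estimate(sample, class_counts):
    """Pooled class-level rates, weighted by the area's population count in each class."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for _, c, y in sample:          # pool over all areas to get class rates
        totals[c] += 1
        hits[c] += y
    n_area = sum(class_counts.values())
    return sum(n * hits[c] / totals[c] for c, n in class_counts.items()) / n_area

# Hypothetical toy data: (area, class, 1 if at least one doctor visit else 0)
sample = [
    ("A", "young", 1), ("A", "old", 1),
    ("B", "young", 0), ("B", "old", 1),
    ("B", "young", 1), ("B", "old", 1),
]
# Hypothetical population counts by class for area A
area_a_counts = {"young": 300, "old": 100}

direct_a = direct_estimate(sample, "A")
synthetic_a = synthetic_estimate(sample, area_a_counts)
```

With only two respondents in area A, the direct estimate rests on very little data, while the synthetic estimate borrows the pooled class rates and assumes they hold within the area. A hierarchical model sits between these two extremes.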


Refreshments: 3:30 - 4:00 p.m. Friday, at 327 Yost
Talk: 4:00 - 5:00 p.m. Friday, at 327 Yost

Questions? jiayang@sun.cwru.edu
Wed Aug 13 13:54:29 EDT 1997