Improved Monte Carlo methods for estimating confidence intervals for eleven commonly used health disparity measures


Health disparities are commonplace and of broad interest to policy makers, but are also challenging to measure and communicate. The Health Disparity Calculator software (HDCalc, v1.2.4) offers Monte Carlo simulation (MCS)-based confidence interval (CI) estimation of eleven disparity measures. The MCS approach provides accurate CI estimation, except when data are scarce (e.g., rare cancers). To address sparse data challenges to CI estimation, we propose two solutions: 1) employing the gamma distribution in the MCS and 2) utilizing a zero-inflated Poisson estimate for Poisson sampling in simulation experiments. We evaluate each solution through simulation studies using female breast, female brain, lung, and cervical cancer data from the Surveillance, Epidemiology, and End Results (SEER) program. We compare the coverage probabilities (CPs) of eleven health disparity measures based on simulated datasets. The truncated normal distribution implemented in the MCS with the standard Poisson samples (the default setting of HDCalc) leads to less-than-optimal coverage probabilities (<95%). When both the gamma distribution and the estimated mean from the zero-inflated Poisson are used for the MCS, the coverage probabilities are close to the nominal level of 95%. Simulation studies also demonstrate that collapsing age categories for better CI estimation is not a pragmatic solution.
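The gamma-based Monte Carlo idea in the abstract can be illustrated with a minimal sketch. This is not HDCalc's implementation: the function name, the two-group rate ratio used as the example measure, and the moment-matched gamma parameterization (shape = count, scale = 1/population) are all assumptions made for illustration. Because the gamma distribution has positive support, simulated rates never go negative and no truncation is needed, unlike a normal approximation. The sketch requires positive counts and does not cover the zero-count case that the zero-inflated Poisson estimate addresses.

```python
import random


def gamma_mc_rate_ratio_ci(count_a, pop_a, count_b, pop_b,
                           n_draws=10000, level=0.95, seed=0):
    """Percentile CI for the rate ratio (A / B) from gamma-distributed rate draws.

    Hypothetical helper for illustration only; not part of HDCalc.
    """
    if min(count_a, count_b) <= 0 or min(pop_a, pop_b) <= 0:
        raise ValueError("this sketch needs positive counts and populations")
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_draws):
        # Assumed moment matching: mean = count/pop, variance = count/pop**2,
        # which gives shape = count and scale = 1/pop.
        rate_a = rng.gammavariate(count_a, 1.0 / pop_a)
        rate_b = rng.gammavariate(count_b, 1.0 / pop_b)
        ratios.append(rate_a / rate_b)
    ratios.sort()
    # Take the empirical percentiles of the simulated ratios as the interval.
    tail = (1.0 - level) / 2.0
    lo_idx = max(0, int(tail * n_draws))
    hi_idx = min(n_draws - 1, int((1.0 - tail) * n_draws) - 1)
    return ratios[lo_idx], ratios[hi_idx]


# Made-up counts: 30 vs. 60 cases per 100,000 people (point estimate 0.5).
lower, upper = gamma_mc_rate_ratio_ci(30, 100_000, 60, 100_000)
print(f"95% CI for the rate ratio: ({lower:.2f}, {upper:.2f})")
```

The counts in the usage lines are invented for the example and are not SEER data.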

PLoS One
Sam Harper
Associate Professor of Epidemiology

My research interests include impact evaluation, reproducible research, and social epidemiology.