TY - JOUR

T1 - Parametric Mixture Models for Estimating the Proportion of True Null Hypotheses and Adaptive Control of FDR

AU - Tamhane, Ajit C

AU - Shi, Jiaxiao

N1 - https://www.jstor.org/stable/30250047

PY - 2009

Y1 - 2009

N2 - Estimation of the proportion or the number of true null hypotheses is an important problem in multiple testing, especially when the number of hypotheses is large. Wu, Guan and Zhao [Biometrics 62 (2006) 735-744] found that nonparametric approaches are too conservative. We study two parametric mixture models (normal and beta) for the distributions of the test statistics or their p-values to address this problem. The components of the mixture are the null and alternative distributions with mixing proportions $\pi_{o}$ and 1 - $\pi_{o}$, respectively, where $\pi_{o}$ is the unknown proportion to be estimated. The normal model assumes that the test statistics from the true null hypotheses are i.i.d. N(0, 1) while those from the alternative hypotheses are i.i.d. N(δ, 1) with δ ≠ 0. The beta model assumes that the p-values from the null hypotheses are i.i.d. U[0, 1] and those from the alternative hypotheses are i.i.d. Beta(a, b) with a < 1 < b. All parameters are assumed to be unknown. Three methods of estimation of $\pi_{o}$ are developed for each model. The methods are compared via simulation with each other and with Storey's [J. Roy. Statist. Soc. Ser. B 64 (2002) 297-304] nonparametric method in terms of the bias and mean square error of the estimators of $\pi_{o}$ and the achieved FDR. Robustness of the estimators to the model violations is also studied by generating data from other models. For the normal model, the parametric methods perform better compared to Storey's method with the EM method (Dempster, Laird and Rubin [J. Roy. Statist. Soc. Ser. B 39 (1977) 1-38]) performing best overall when the assumed model holds; however, it is not very robust to significant model violations. For the beta model, the parametric methods do not perform as well because of the difficulties of estimation of parameters, and Storey's nonparametric method turns out to be the winner in many cases. Therefore the beta model is not recommended for use in practice. An example is given to illustrate the methods.

AB - Estimation of the proportion or the number of true null hypotheses is an important problem in multiple testing, especially when the number of hypotheses is large. Wu, Guan and Zhao [Biometrics 62 (2006) 735-744] found that nonparametric approaches are too conservative. We study two parametric mixture models (normal and beta) for the distributions of the test statistics or their p-values to address this problem. The components of the mixture are the null and alternative distributions with mixing proportions $\pi_{o}$ and 1 - $\pi_{o}$, respectively, where $\pi_{o}$ is the unknown proportion to be estimated. The normal model assumes that the test statistics from the true null hypotheses are i.i.d. N(0, 1) while those from the alternative hypotheses are i.i.d. N(δ, 1) with δ ≠ 0. The beta model assumes that the p-values from the null hypotheses are i.i.d. U[0, 1] and those from the alternative hypotheses are i.i.d. Beta(a, b) with a < 1 < b. All parameters are assumed to be unknown. Three methods of estimation of $\pi_{o}$ are developed for each model. The methods are compared via simulation with each other and with Storey's [J. Roy. Statist. Soc. Ser. B 64 (2002) 297-304] nonparametric method in terms of the bias and mean square error of the estimators of $\pi_{o}$ and the achieved FDR. Robustness of the estimators to the model violations is also studied by generating data from other models. For the normal model, the parametric methods perform better compared to Storey's method with the EM method (Dempster, Laird and Rubin [J. Roy. Statist. Soc. Ser. B 39 (1977) 1-38]) performing best overall when the assumed model holds; however, it is not very robust to significant model violations. For the beta model, the parametric methods do not perform as well because of the difficulties of estimation of parameters, and Storey's nonparametric method turns out to be the winner in many cases. Therefore the beta model is not recommended for use in practice. An example is given to illustrate the methods.

M3 - Article

VL - 57

SP - 304

EP - 325

JO - Lecture Notes-Monograph Series

JF - Lecture Notes-Monograph Series

SN - 0749-2170

ER -