- Title
Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis.
- Authors
Tang, Zheng-Zheng; Chen, Guanhua
- Abstract
There is heightened interest in using high-throughput sequencing technologies to quantify abundances of microbial taxa and linking the abundance to human diseases and traits. Proper modeling of multivariate taxon counts is essential to the power of detecting this association. Existing models are limited in handling excessive zero observations in taxon counts and in flexibly accommodating complex correlation structures and dispersion patterns among taxa. In this article, we develop a new probability distribution, zero-inflated generalized Dirichlet multinomial (ZIGDM), that overcomes these limitations in modeling multivariate taxon counts. Based on this distribution, we propose a ZIGDM regression model to link microbial abundances to covariates (e.g. disease status) and develop a fast expectation-maximization algorithm to efficiently estimate parameters in the model. The derived tests enable us to reveal rich patterns of variation in microbial compositions including differential mean and dispersion. The advantages of the proposed methods are demonstrated through simulation studies and an analysis of a gut microbiome dataset.
- Subjects
REGRESSION analysis; DATA analysis; GUT microbiome; EXPECTATION-maximization algorithms; MICROSIMULATION modeling (Statistics)
- Publication
Biostatistics, 2019, Vol 20, Issue 4, p698
- ISSN
1465-4644
- Publication type
journal article
- DOI
10.1093/biostatistics/kxy025
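
As a rough illustration of the kind of distribution the abstract describes, the sketch below draws one vector of taxon counts from a zero-inflated generalized Dirichlet multinomial. The construction is an assumption, not taken from this record: taxon proportions come from a stick-breaking scheme with Beta-distributed fractions, each fraction is set to exactly zero with some probability, and counts are then multinomial given the proportions. The function name `sample_zigdm` and the parameter names `a`, `b`, and `pi` are hypothetical, and the article's own parameterization may differ.

```python
import numpy as np


def sample_zigdm(n_reads, a, b, pi, rng):
    """Draw one count vector of length K+1 (assumed ZIGDM construction).

    a, b : Beta shape parameters for the K stick-breaking fractions
    pi   : probability that each fraction is forced to exactly zero
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Stick-breaking fractions: Beta draws, zeroed with probability pi[j].
    z = rng.beta(a, b)
    z[rng.random(a.shape[0]) < pi] = 0.0

    # Stick length remaining before each break: 1, (1-z1), (1-z1)(1-z2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - z)))

    # The first K proportions are z_j times the remaining stick;
    # the last taxon receives whatever is left over.
    p = np.append(z * remaining[:-1], remaining[-1])

    # Counts are multinomial given the (possibly zero-containing) proportions.
    return rng.multinomial(n_reads, p)


rng = np.random.default_rng(0)
counts = sample_zigdm(1000, a=[2, 2, 2], b=[3, 3, 3], pi=[0.3, 0.3, 0.3], rng=rng)
```

A zeroed fraction yields a structural zero count for that taxon, which is how this sketch produces more zeros than a plain Dirichlet multinomial would. Separate Beta shapes per fraction are what allow dispersion to vary across taxa.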