Abstract
Many environments within the human body host a collection of micro-organisms called microbiota. Recent findings have linked the composition of the microbiota to the development of several human diseases, including cancer. Motivated by a recent colorectal cancer (CRC) study, we investigate the effect of clinical factors and diet-related covariates on microbiota composition; for the patients enrolled in this study, microbiota abundance counts are collected from three districts: tumor, fecal and salivary samples. Building upon the Dirichlet-multinomial regression framework, we develop a high-dimensional Bayesian hierarchical model that exploits subject-specific regression coefficients to simultaneously borrow strength across districts and include complex interactions between diet and clinical factors if supported by the data. The proposed method identifies relevant associations through model selection priors and thresholding mechanisms. Posterior inference is performed via a Markov chain Monte Carlo algorithm. We use simulation studies to assess the performance of our method and find that our approach outperforms competing methods that do not account for complex interactions. Finally, a thorough analysis of the CRC data illustrates the benefits of the proposed approach.
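As a rough illustration of the Dirichlet-multinomial regression framework that the model builds upon, the following sketch evaluates the log-probability of one sample's taxon counts when each concentration parameter is log-linear in the covariates. It is a minimal sketch under stated assumptions, not the authors' implementation: the log link, the function name `dm_log_pmf` and all argument names are hypothetical, and the subject-specific coefficients, model selection priors and thresholding of the proposed method are not shown.

```python
import math


def dm_log_pmf(counts, covariates, intercepts, coefs):
    """Log-pmf of a Dirichlet-multinomial with log-linear concentrations.

    Assumed (hypothetical) parameterization:
        gamma_j = exp(intercepts[j] + sum_p covariates[p] * coefs[j][p])
    counts:     taxon counts for one sample, length J
    covariates: covariate values for the same sample, length P
    intercepts: taxon-specific intercepts, length J
    coefs:      J x P nested list of regression coefficients
    """
    # Concentration parameter of each taxon, driven by the covariates.
    gammas = [
        math.exp(a + sum(x * b for x, b in zip(covariates, row)))
        for a, row in zip(intercepts, coefs)
    ]
    n = sum(counts)
    g_plus = sum(gammas)

    # Multinomial coefficient: log n! - sum_j log y_j!
    log_p = math.lgamma(n + 1) - sum(math.lgamma(y + 1) for y in counts)
    # Normalizing term of the Dirichlet mixture.
    log_p += math.lgamma(g_plus) - math.lgamma(n + g_plus)
    # Taxon-specific terms.
    log_p += sum(
        math.lgamma(y + g) - math.lgamma(g) for y, g in zip(counts, gammas)
    )
    return log_p


# Illustrative call with made-up numbers: 2 taxa, 2 covariates.
example = dm_log_pmf([1, 2], [0.5, -1.0], [0.1, -0.2], [[0.3, 0.0], [0.0, -0.4]])
```

A coefficient equal to zero means the corresponding covariate does not affect that taxon's concentration, which is why selection priors on the coefficients can be read as selecting covariate-taxon associations.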
Funding Statement
The first and third authors were supported by the “Dipartimenti Eccellenti 2018–2022” ministerial funds (Italy).
Acknowledgments
The authors would like to thank Jun Chen for making available the R code implementing Chen and Li’s (2013) method. They also thank the anonymous referees, an Associate Editor and the Editor for their constructive comments, which improved the quality of this paper.
Citation
Matteo Pedone, Amedeo Amedei, Francesco C. Stingo. "Subject-specific Dirichlet-multinomial regression for multi-district microbiota data analysis." Ann. Appl. Stat. 17 (1) 539–559, March 2023. https://doi.org/10.1214/22-AOAS1641