ABSTRACT
With the advance of next-generation sequencing technologies, researchers now routinely obtain collections of microbial sequences with complex phylogenetic relationships. It is often of interest to analyze the association between environmental factors and characteristics of the microbial collection. Though methods have been developed to test for association between microbial composition and environmental factors, as well as between coevolving traits, a flexible model that provides a comprehensive picture of the relationship between microbial community characteristics and environmental variables would be highly beneficial. We developed a Bayesian approach for association analysis that incorporates the phylogenetic structure to account for the dependence between observations. To overcome the computational difficulty related to the phylogenetic tree, we developed a variational algorithm to evaluate the posterior distribution. Because the posterior distribution can be readily obtained for parameters of interest and any derived variables, the association can be examined comprehensively. With two application examples, we demonstrate that the Bayesian approach can uncover nuanced details of the microbial assemblage with regard to the environmental factor. The proposed Bayesian approach and variational algorithm can be extended to other problems involving dependence over tree-like structures.
Acknowledgments
We thank Drs. Julia Soulakova, Shunpu Zhang, George Graf, Jing Han, Annie Lumen, and the anonymous reviewers for their valuable comments. The opinions expressed in this paper are those of the authors and do not necessarily reflect the position of the US Food and Drug Administration.
Disclosure statement
No potential conflict of interest was reported by the author(s).