Statistical power of phylo-HMM for evolutionarily conserved element detection

BMC Bioinformatics. 2007 Oct 5:8:374. doi: 10.1186/1471-2105-8-374.

Abstract

Background: An important goal of comparative genomics is the identification of functional elements through conservation analysis. Phylo-HMM was recently introduced to detect conserved elements based on multiple genome alignments, but the method has not been rigorously evaluated.

Results: We report here a simulation study to investigate the power of phylo-HMM. We show that the power of the phylo-HMM approach depends on many factors, the most important being the number of species-specific genomes used and the evolutionary distances between pairs of species. This finding is consistent with results reported by other groups for simpler comparative genomics models. The conservation ratio of conserved elements and their expected length are also major factors. In contrast, the topology and the nucleotide substitution model have relatively minor influence.

Conclusion: Our results provide general guidelines on how to select the number of genomes and their evolutionary distances in comparative genomics studies, as well as the level of power we can expect under different parameter settings.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Base Sequence
  • Chromosome Mapping / methods*
  • Conserved Sequence / genetics*
  • Data Interpretation, Statistical
  • Evolution, Molecular*
  • Markov Chains
  • Molecular Sequence Data
  • Phylogeny
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*