TaxI: a software tool for DNA barcoding using distance methods
Abstract
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analysis. Its algorithms therefore offer no appropriate solution for the rapid but precise analyses that DNA barcoding requires, and they cannot process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. The program calculates sequence divergences between a query sequence (the taxon to be barcoded) and each sequence in a user-defined dataset of reference sequences. Because the analysis is based on separate pairwise alignments, the software can also handle sequences characterized by multiple insertions and deletions. Such sequences are difficult to align in large sets (i.e. thousands of sequences) with multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach under different models of sequence evolution with two datasets: fish larvae and juveniles from Lake Constance, and juvenile land snails. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding.
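The query-against-reference procedure described in the abstract can be illustrated with a minimal Python sketch. It is not TaxI's implementation: the abstract does not state which alignment algorithm, scoring scheme or substitution models the program uses. The sketch assumes a Needleman-Wunsch global alignment with arbitrary illustrative scores, an uncorrected p-distance, and the Kimura two-parameter correction as one example of a model of sequence evolution. All function names (`global_align`, `p_distance`, `k2p_distance`, `rank_references`) are hypothetical.

```python
import math

def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.
    The scores are illustrative assumptions, not TaxI's parameters."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def _site_counts(aln_a, aln_b):
    """Count compared sites, transitions and transversions.
    Columns containing a gap or any symbol other than A, C, G, T are skipped."""
    purines = {"A", "G"}
    sites = transitions = transversions = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x != y:
            if (x in purines) == (y in purines):
                transitions += 1
            else:
                transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites in alignment")
    return sites, transitions, transversions

def p_distance(aln_a, aln_b):
    """Uncorrected proportion of differing sites."""
    sites, ts, tv = _site_counts(aln_a, aln_b)
    return (ts + tv) / sites

def k2p_distance(aln_a, aln_b):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    sites, ts, tv = _site_counts(aln_a, aln_b)
    p, q = ts / sites, tv / sites
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)

def rank_references(query, references, distance=p_distance):
    """Align the query to each reference separately and sort by divergence."""
    results = []
    for name, ref in references.items():
        aln_q, aln_r = global_align(query, ref)
        results.append((name, distance(aln_q, aln_r)))
    return sorted(results, key=lambda item: item[1])
```

Calling `rank_references(query, {"species_A": seq_a, "species_B": seq_b})` returns the reference names ordered from least to most divergent. Because each reference is aligned to the query independently, the work grows linearly with the number of reference sequences and no multiple alignment of the whole dataset is needed. Passing `distance=k2p_distance` swaps in the corrected distance without changing the alignment step.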