Abstract
We present the open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors (https://github.com/benjjneb/dada2). DADA2 infers sample sequences exactly and resolves sequence variants that differ by as little as 1 nucleotide. In several mock communities, DADA2 identified more real variants and output fewer spurious sequences than other methods. We applied DADA2 to vaginal samples from a cohort of pregnant women, revealing a diversity of previously undetected Lactobacillus crispatus variants.
Publication types
- Research Support, N.I.H., Extramural
- Research Support, Non-U.S. Gov't
- Research Support, U.S. Gov't, Non-P.H.S.
MeSH terms
- Animals
- Cohort Studies
- Computational Biology / methods*
- DNA, Bacterial / genetics
- False Positive Reactions
- Feces / microbiology
- Female
- High-Throughput Nucleotide Sequencing / methods*
- Humans
- Lactobacillus / classification
- Lactobacillus / genetics
- Lactobacillus / isolation & purification*
- Mice
- Microbiota / genetics*
- Pregnancy
- RNA, Ribosomal, 16S / genetics
- Reproducibility of Results
- Sequence Analysis, DNA / methods*
- Software*
- Vagina / microbiology
Substances
- DNA, Bacterial
- RNA, Ribosomal, 16S