How to track and assess genotyping errors in population genetics studies
Corresponding Author
A. BONIN
Laboratoire d’Ecologie Alpine, CNRS-UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble Cedex 09, France
A. Bonin. Fax: +33 0 4 76 51 42 79; E-mail: [email protected]
E. BELLEMAIN
Laboratoire d’Ecologie Alpine, CNRS-UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble Cedex 09, France
Department of Ecology and Natural Resource Management, Agricultural University of Norway, Box 5003, NO-1432 Ås, Norway
P. BRONKEN EIDESEN
National Centre for Biosystematics, Natural History Museums and Botanical Garden, University of Oslo, PO Box 1172 Blindern, NO-0318 Oslo, Norway
F. POMPANON
Laboratoire d’Ecologie Alpine, CNRS-UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble Cedex 09, France
C. BROCHMANN
National Centre for Biosystematics, Natural History Museums and Botanical Garden, University of Oslo, PO Box 1172 Blindern, NO-0318 Oslo, Norway
P. TABERLET
Laboratoire d’Ecologie Alpine, CNRS-UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble Cedex 09, France
Abstract
Genotyping errors occur when the genotype determined after molecular analysis does not correspond to the real genotype of the individual under consideration. Virtually every genetic data set includes some erroneous genotypes, but genotyping errors remain a taboo subject in population genetics, even though they might greatly bias the final conclusions, especially for studies based on individual identification. Here, we consider four case studies representing a large variety of population genetics investigations differing in their sampling strategies (noninvasive or traditional), in the type of organism studied (plant or animal) and the molecular markers used [microsatellites or amplified fragment length polymorphisms (AFLPs)]. In these data sets, the estimated genotyping error rate ranges from 0.8% for microsatellite loci from bear tissues to 2.6% for AFLP loci from dwarf birch leaves. Main sources of errors were allelic dropouts for microsatellites and differences in peak intensities for AFLPs, but in both cases human factors were non-negligible error generators. Therefore, tracking genotyping errors and identifying their causes are necessary to clean up the data sets and validate the final results according to the precision required. In addition, we propose the outline of a protocol designed to limit and quantify genotyping errors at each step of the genotyping process. In particular, we recommend (i) several efficient precautions to prevent contaminations and technical artefacts; (ii) systematic use of blind samples and automation; (iii) experience and rigor for laboratory work and scoring; and (iv) systematic reporting of the error rate in population genetics studies.
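As an illustration of recommendation (iv), the sketch below shows one way an error rate could be estimated from blind replicate samples: each re-typed single-locus genotype is compared with its first typing, and the rate is the fraction that disagree. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `per_locus_error_rate`, the dictionary layout, and the sample and locus labels are all hypothetical, and the example data are invented for demonstration only.

```python
from typing import Dict, Tuple

# Hypothetical encoding: two allele labels at one diploid locus.
Genotype = Tuple[str, str]
# Hypothetical layout: sample name -> locus name -> genotype.
Genotypes = Dict[str, Dict[str, Genotype]]


def per_locus_error_rate(reference: Genotypes, replicate: Genotypes) -> float:
    """Fraction of replicated single-locus genotypes that disagree."""
    compared = 0
    mismatched = 0
    for sample, loci in reference.items():
        for locus, genotype in loci.items():
            repeat = replicate.get(sample, {}).get(locus)
            if repeat is None:
                continue  # locus was not re-typed, so nothing to compare
            compared += 1
            # Allele order carries no meaning, so compare sorted pairs.
            if sorted(genotype) != sorted(repeat):
                mismatched += 1
    if compared == 0:
        raise ValueError("no replicated genotypes to compare")
    return mismatched / compared


# Invented demonstration data (not taken from the study).
reference = {
    "sample_1": {"locus_A": ("142", "146"), "locus_B": ("150", "150")},
    "sample_2": {"locus_A": ("142", "142"), "locus_B": ("150", "154")},
}
replicate = {
    "sample_1": {"locus_A": ("146", "142"), "locus_B": ("150", "150")},
    # Allele 154 is missing in the repeat, as an allelic dropout would look.
    "sample_2": {"locus_A": ("142", "142"), "locus_B": ("150", "150")},
}

rate = per_locus_error_rate(reference, replicate)
print(f"{rate:.2%} of replicated single-locus genotypes disagree")
```

A mismatch counted this way shows only that the two typings differ, not which one is wrong, so the value is an estimate of reproducibility rather than a direct measure of the true error rate.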