Volume 100, Issue 5 p. 930-938
Evolution and Phylogeny

Genetic structure and domestication of carrot (Daucus carota subsp. sativus) (Apiaceae)

Massimo Iorizzo

Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

Douglas A. Senalik

Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

Shelby L. Ellison

Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

Dariusz Grzebelus

Department of Genetics, Plant Breeding and Seed Science, University of Agriculture in Krakow, Al. 29 Listopada 54, 31-425 Krakow, Poland

Pablo F. Cavagnaro

CONICET and INTA EEA La Consulta, CC8 La Consulta (5567), Mendoza, Argentina

Charlotte Allender

Warwick Crop Centre, University of Warwick, Wellesbourne, Warwick, CV35 9EF, UK

Johanne Brunet

USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

Department of Entomology, 1630 Linden Drive, University of Wisconsin, Madison, Wisconsin 53706-1598 USA

David M. Spooner

Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

Allen Van Deynze

Seed Biotechnology Center, University of California, 1 Shields Ave, Davis, California 95616-8816 USA

Philipp W. Simon (Corresponding Author)

Department of Horticulture, University of Wisconsin-Madison, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

USDA-Agricultural Research Service, Vegetable Crops Research Unit, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706-1590 USA

Author for correspondence: (e-mail: [email protected])
First published: 01 May 2013

The authors thank the California Fresh Carrot Marketing Board and the Bejo, Nunhems, Rijk Zwaan, Takii, and Vilmorin seed companies for financial support. D.G. was supported by the Polish National Science Center, project no. N N302 062236 from 2009–2011.

Abstract

Premise of the study: Analyses of genetic structure and phylogenetic relationships illuminate the origin and domestication of modern crops. Although carrot (Daucus carota) is an important vegetable worldwide, its genetic structure and domestication are poorly understood. We provide the first such study using a large data set of molecular markers and accessions that are widely dispersed around the world.

Methods: Sequencing data from the carrot transcriptome were used to develop 4000 single nucleotide polymorphisms (SNPs). Eighty-four genotypes, including a geographically well-distributed subset of wild and cultivated carrots, were genotyped using the KASPar assay.

Key results: Analysis of allelic diversity of SNP data revealed no reduction of genetic diversity in cultivated vs. wild accessions. Structure and phylogenetic analysis indicated a clear separation between wild and cultivated accessions as well as between eastern and western cultivated carrot. Among the wild carrots, those from Central Asia were genetically most similar to cultivated accessions. Furthermore, we found that wild carrots from North America were most closely related to European wild accessions.

Conclusions: Comparing the genetic diversity of wild and cultivated accessions suggested the absence of a genetic bottleneck during carrot domestication. In conjunction with historical documents, our results suggest an origin of domesticated carrot in Central Asia. Wild carrots from North America were likely introduced as weeds with European colonization. These results provide answers to long-debated questions of carrot evolution and domestication and inform germplasm curators and breeders on genetic substructure of carrot genetic resources.

Human manipulation of wild progenitors during crop domestication has led to the foundation of modern agriculture (Zohary and Hopf, 2000; Glémin and Bataillon, 2009). A common suite of traits, including loss of seed shattering, dormancy, and branching, is often selected during the domestication process and is referred to as the “domestication syndrome” (Zohary and Hopf, 2000). After primary traits have been selected and fixed, the process of domestication often has directed more attention to quality traits such as color, shape, and flavor, and physiological traits contributing to uniformity (Doebley et al., 2006). Studies analyzing the genetic structure of wild and cultivated crops, in combination with archeological and historical evidence, have provided insight into the geographic and temporal details of domestication to reveal where, when, and how many times a crop was domesticated (Meyer et al., 2012).

To date, the majority of domestication studies have focused on staple food crops, with little attention toward root vegetables (Meyer et al., 2012). Cultivated carrot (Daucus carota L. subsp. sativus L.) is a common vegetable with a well-known and widely distributed weedy progenitor, wild carrot or Queen Anne's lace (D. carota subsp. carota). Carrot is the most widely grown crop of the family Apiaceae (Umbelliferae), cultivated on 1.2 M hectares globally (carrot and turnip as aggregate data) (FAO, 2011), of which 34000 hectares of carrots are produced in the United States with an estimated annual crop value of $758 M (USDA, National Agricultural Statistics Service, 2011). Wild carrot occurs widely across temperate regions of the globe.

The time frame and geographic region(s) of the first cultivation of carrots are unclear. Vavilov (1992, pp. 337–340) identified Asia Minor (eastern Turkey) and the inner Asiatic regions as the centers of origin of cultivated carrot and noted Central Asia (Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, Uzbekistan) as being “the basic center of Asiatic kinds of cultivated carrots” where “wild carrots … practically invited themselves to be cultivated”. As observed by the presence of carrot seed at prehistoric human habitations 4000 to 5000 yr ago (Newiler, 1931), it is speculated that wild carrot seed was used medicinally or as a spice (Andrews, 1949; Brothwell and Brothwell, 1969). Carrot was cultivated and used as a storage root similar to modern carrots in Afghanistan, Iran, Iraq, and perhaps Anatolia beginning in the 10th century (Mackevic, 1932; Zagorodskikh, 1939). On the basis of historical documents, the first domesticated carrot roots were purple and yellow and recorded in Central Asia, Asia Minor, then Western Europe and finally in England between the 11th and 15th centuries (Banga, 1963). Interestingly, orange carrots were not well documented until the 15th and 16th centuries in Europe (Banga, 1957a, 1957b; Stolarczyk and Janick, 2011), indicating that orange carotenoid accumulation may have resulted from a secondary domestication event.

To date, the origin(s) of carrot domestication has not been studied, and only a small number of studies have used molecular markers to examine carrot genetic diversity. Thus far, molecular data have not been able to uncover any clear population structure in carrot (Bradeen et al., 2002), although a distinction was detected between cultivated and wild carrot accessions using amplified fragment length polymorphism (AFLP) markers (Shim and Jørgensen, 2000). A wider characterization of cultivated carrot, using simple sequence repeat (SSR) markers and single nucleotide polymorphisms (SNPs) revealed a moderate separation between eastern and western cultivated carrots (Clotault et al., 2010; Baranski et al., 2012).

Here, we designed and screened 4000 single nucleotide polymorphisms (SNPs) in a geographically well-dispersed collection of wild and cultivated carrots to investigate their geographical substructure and domestication origins.

MATERIALS AND METHODS

Plant material and DNA preparation

Eighty-four Daucus accessions were used, including 30 wild and 45 cultivated D. carota accessions (subsp. carota and subsp. sativus, respectively), plus seven additional subspecies of D. carota and two accessions of D. capillifolius (Table 1; Appendix S1, see Supplemental Data with the online version of this article). Cultivated carrots included white, orange, purple, yellow, and red open-pollinated cultivars but no modern hybrid cultivars (Appendix S1). We coded wild and cultivated carrot by the country of origin or collection site into six and seven global geographic region categories, respectively (Table 1). We defined eastern carrot as accessions from the Middle East and Asia, and western carrot as accessions from Europe, South America, and North America. Open-pollinated cultivars from North America and wild carrot from South America were not included because of lack of accessions in available collections.

Table 1. Summary of plant material used in this study.

| Geographic region | Country | Category code a | Cultivated accessions | Wild accessions |
| South America | Argentina, Brazil | C1 and W1 | 3 | |
| North America | U.S.A. | W2 | | 6 |
| Europe | Czech Republic, France, Germany, Greece, Hungary, Italy, Netherlands, Portugal, Sweden, United Kingdom | C3 and W3 | 13 | 10 |
| North Africa | Algeria, Morocco, Tunisia | C4 and W4 | 2 | 5 |
| Middle East | Azerbaijan, Iran, Israel, Russia, Syrian Arab Republic, Turkey | C5 and W5 | 11 | 4 |
| Central Asia | Afghanistan, Uzbekistan | C6 and W6 | 4 | 5 |
| South Asia | India, Pakistan | C7 | 7 | |
| North Asia | China, Japan | C8 | 5 | |
| Europe | France, Italy, Portugal | SSP | | 7 |
| North Africa | Tunisia | SP | | 2 |

  • a Category codes correspond to online Appendix S1; Figs. 1, 2. C, Cultivated; W, Wild; SSP, Subspecies; SP, other species.

Seeds were planted and grown in pots in a greenhouse. Leaves from 1-mo-old plants were harvested and lyophilized. DNA from single plants was extracted as described by Murray and Thompson (1980) and quantified using PicoGreen (Invitrogen, Paisley, UK). Mature roots were examined to confirm expected type (Appendix S1).

SNP detection and evaluation

Parameters used for bioinformatic analysis are outlined in Appendix S2 (see online Supplemental Data). Sequence data from a cultivated × wild carrot (B493×QAL) population (pooled samples) and two cultivated carrot inbred lines (B6274 and B7262) were used in this study (Iorizzo et al., 2011). Illumina reads (18 to 21 M paired-end sequence pairs and 10 to 12 M shotgun reads from each genotype, each 61 nt long) were mapped to 59493 carrot EST consensus sequences (Iorizzo et al., 2011) using the program Mosaik 1.1.0021 (Hillier et al., 2008) (Appendix S2, Section A). SNPs were detected with the program gigaBayes version 0.4.1 (Smith et al., 2008) using default parameters. In total, 6282846 SNPs were detected and subsequently filtered with the following criteria. With a custom Perl program, SNPs were preliminarily filtered for minimum read coverage and presence of flanking sequences with few nearby SNPs (Appendix S2, Section B). Depending on the allelic variation within and between each genotype, SNPs were categorized as either “M” for intrasample monomorphic and intersample polymorphic or “P” for polymorphic in both intra- and intersample comparisons (Iorizzo et al., 2011). A draft low-coverage assembly of Roche 454 sequences from the inbred line B493 (Iorizzo et al., 2012) was used to exclude potential duplicated sequences and splice junctions. Sequences that matched more than one contig were eliminated, producing 12421 sequences (Appendix S2, Section C). Next, sequences that aligned to the same contig or scaffold were grouped, resulting in 7400 clusters (Appendix S2, Section C). SNPs from genes of interest (anthocyanins, carotenoids, transcription factors, etc.) were given higher priority and filtered with less stringent parameters (Appendix S2, Section B).

All 12421 sequences were submitted to KBioscience (http://www.kbioscience.co.uk) for design of one primer pair per cluster. In total, primer pairs for 4000 SNPs were designed and used for genotyping all samples. Considering the distribution of intra- and intersample polymorphism, the final set of SNPs included 2901 SNPs that were intra- and interpolymorphic in cultivated × wild and intramonomorphic in the two cultivated lines (PMM category, e.g., A/T or T/A in B493×QAL, A/A or T/T in both B6274 and B7262), 552 SNPs that were intramonomorphic and interpolymorphic (MMM category, e.g., AA in both B493×QAL and B6274, and TT in B7262), and 547 SNPs that were intramonomorphic in wild × cultivated and intra- and interpolymorphic in cultivated (MMP and MPM, e.g., AA in both B493×QAL and B6274 and A/T in B7262). Sequences of the 4000 contigs used to design primers are reported in Appendix S3 (see online Supplemental Data). The list of primers and sequences is reported in online Appendix S4. All genotyping was performed by KBioscience (http://www.KBioscience.co.uk). SNPs were genotyped using the KASPar chemistry, a competitive allele-specific PCR SNP genotyping system using FRET quencher cassette oligos (http://www.KBioscience.co.uk/reagents/KASP/KASP.html).
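The three-letter category codes can be read as one letter per sample, in the order B493×QAL, B6274, B7262, with P marking a sample in which both alleles were seen and M a sample showing a single allele. The following is a minimal Python sketch of that reading; the function name and the allele-string input format are illustrative assumptions and are not taken from the authors' Perl pipeline.

```python
def snp_category(calls):
    """Return a code such as 'PMM' from per-sample allele strings.

    `calls` lists the alleles observed in each sample, in the order
    (B493xQAL, B6274, B7262); "AT" means both A and T were observed.
    This input format is a hypothetical simplification.
    """
    return "".join("P" if len(set(c)) > 1 else "M" for c in calls)


# Examples mirroring the category descriptions in the text.
print(snp_category(["AT", "AA", "AA"]))  # PMM
print(snp_category(["AA", "AA", "TT"]))  # MMM
print(snp_category(["AA", "AA", "AT"]))  # MMP

# The three reported category counts add up to the 4000 designed SNPs.
reported_counts = {"PMM": 2901, "MMM": 552, "MMP and MPM": 547}
print(sum(reported_counts.values()))  # 4000
```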

Data analyses

Genetic diversity (He, i.e., expected heterozygosity), defined as the probability that two randomly chosen alleles from the population are different (Weir, 1996), was estimated using the PowerMarker program (Liu and Muse, 2005). Significant differences in He between wild and cultivated accessions were determined with an unpaired two-tailed Student t test. We used the program structure version 2.3.3 (Pritchard et al., 2000; Hubisz et al., 2009) to estimate population structure. Analyses with structure were restricted to the 75 wild and cultivated carrot accessions of D. carota subsp. carota and subsp. sativus. The analysis did not rely on prior population information (i.e., “USE-POPINFO” was turned off). Forty independent runs with a burn-in length of 50000 and a run length of 500000 were used for each number of genetic clusters from 1 to 8. The most likely number of clusters was determined based on both the maximum likelihood described by Pritchard et al. (2000) and the ΔK method of Evanno et al. (2005). We attributed each accession to a given cluster when the proportion of inferred ancestry (q) was higher than 0.7 (70%).
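As an illustration of the diversity statistic defined above, expected heterozygosity at one locus can be written as He = 1 − Σ p_i², where p_i is the frequency of allele i. The Python sketch below computes this uncorrected form and summarizes it across loci as mean ± SD. It is not PowerMarker's implementation, which may apply a sample-size correction; the function name and the toy genotypes are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2) for one locus (no sample-size correction).

    `genotypes` is a list of diploid calls such as "AT" or "AA";
    None marks a missing call and is skipped.
    """
    alleles = Counter()
    for call in genotypes:
        if call is not None:
            alleles.update(call)  # counts each of the two alleles
    total = sum(alleles.values())
    if total == 0:
        return None
    return 1.0 - sum((n / total) ** 2 for n in alleles.values())


# Hypothetical toy data: three SNP loci scored in four plants.
loci = {
    "snp1": ["AA", "AT", "TT", "AT"],  # both alleles at 0.5
    "snp2": ["GG", "GG", "GG", "GC"],  # G at 7/8, C at 1/8
    "snp3": ["CC", "CC", None, "CC"],  # monomorphic
}
he_values = [expected_heterozygosity(g) for g in loci.values()]
print(he_values)
print(mean(he_values), stdev(he_values))
```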

Analyses of molecular variance (AMOVA, φPT) were carried out with 999 data permutations using the program Genalex 6.4 (Peakall and Smouse, 2006). To reconstruct phylogenetic relationships, for each sample, we used the program MEGA 5.0 (Tamura et al., 2011) to concatenate and align all the SNPs (base calls). The analysis included all accessions used in this study. To establish the best-fit model of evolution, we then analyzed the concatenated SNPs using the program Modeltest (Posada and Crandall, 1998). The most appropriate model was the general time reversible model with among site variation approximated using the Gamma distribution and the proportion of Invariable sites (GTR+G+I).
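The concatenation step can be pictured as writing one character per SNP for each sample. The sketch below is a simplified Python stand-in for that step, not a description of MEGA: it assumes heterozygous calls are encoded with IUPAC ambiguity codes and missing calls with "N", which the text does not specify, and the sample and SNP names are hypothetical.

```python
# IUPAC codes for single bases and unordered pairs of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("AC"): "M", frozenset("GT"): "K",
    frozenset("AT"): "W", frozenset("CG"): "S",
}


def concatenate_snps(calls, snp_order):
    """Build one sequence per sample from its SNP base calls.

    `calls` maps sample -> {snp: genotype}; a genotype is a two-letter
    string such as "AG", or None when the call is missing.
    Missing calls are written as "N" (an assumption of this sketch).
    """
    seqs = {}
    for sample, by_snp in calls.items():
        chars = []
        for snp in snp_order:
            genotype = by_snp.get(snp)
            chars.append("N" if genotype is None else IUPAC[frozenset(genotype)])
        seqs[sample] = "".join(chars)
    return seqs


def to_fasta(seqs):
    """Format the concatenated sequences as FASTA text."""
    return "\n".join(f">{name}\n{seq}" for name, seq in seqs.items())


# Hypothetical calls for two samples at three SNPs.
calls = {
    "wild_1": {"s1": "AA", "s2": "AG", "s3": None},
    "cult_1": {"s1": "AT", "s2": "GG", "s3": "CC"},
}
alignment = concatenate_snps(calls, ["s1", "s2", "s3"])
print(to_fasta(alignment))
```

Because every sample has exactly one character per SNP, the resulting sequences are already the same length and in register, so no separate alignment step is needed in this sketch.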

Bootstrap support was evaluated by resampling the data 1000 times (Felsenstein, 1985). Seven wild D. carota subspecies (other than subsp. carota and subsp. sativus) were used to root the tree.

RESULTS

Marker screening and genetic diversity

Four thousand SNPs were evaluated in 84 individuals including 30 wild (W) and 45 open-pollinated cultivated (C) carrot from different regions of the world and nine Daucus carota subspecies (SSP) or other closely related Daucus species (SP) representing an outgroup (Table 1, Appendix S1). In total, 3636 SNPs (91%) produced distinct genotypic clusters on the KBioscience (Hoddesdon, UK) platform, which uses the KASPar assay. Using a cutoff value of 10% and 20% for missing data across genotypes and markers, respectively, we selected 3535 SNPs from the original 4000. Of these, 3326 were polymorphic and were used for further diversity analysis.
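A marker filter of this kind can be sketched in a few lines of Python. The example below keeps SNPs whose fraction of missing calls does not exceed a threshold and that carry more than one allele. How the 10% and 20% cutoffs map onto samples and markers is open to interpretation, so the threshold is left as a parameter; the function names and data are hypothetical.

```python
def allele_set(calls):
    """Set of alleles observed at a SNP, ignoring missing calls."""
    return {a for c in calls if c is not None for a in c}


def filter_snps(matrix, max_missing=0.10):
    """Keep SNPs with few missing calls that are also polymorphic.

    `matrix` maps SNP name -> list of genotype calls (None = missing).
    `max_missing` is the largest tolerated fraction of missing calls.
    """
    kept = {}
    for snp, calls in matrix.items():
        missing = sum(c is None for c in calls) / len(calls)
        if missing <= max_missing and len(allele_set(calls)) > 1:
            kept[snp] = calls
    return kept


# Hypothetical toy matrix: three SNPs scored in ten plants.
matrix = {
    "snp_ok": ["AA", "AT", "TT", "AA", "AT", "AA", "TT", "AA", "AT", None],
    "snp_missing": ["CC", None, None, "CG", None, "CC", "GG", "CC", "CG", "CC"],
    "snp_mono": ["GG"] * 10,
}
kept = filter_snps(matrix)
print(sorted(kept))  # only "snp_ok" passes both checks
```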

Overall genetic diversity (He) was 0.34 ± 0.15 (mean ± SD) for all wild and cultivated carrot accessions and 0.21 ± 0.21 for the outgroup. Interestingly, a global comparison of wild (D. carota subsp. carota) vs. cultivated (D. carota subsp. sativus) accessions yielded estimated He values of 0.32 ± 0.17 and 0.32 ± 0.17, respectively, revealing no reduction in genetic diversity since domestication (t = 1.064, df = 6650, P > 0.28).
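The reported degrees of freedom are consistent with an unpaired t test in which each group contributes one He value per polymorphic SNP. Assuming that reading (the per-group sample sizes are not stated explicitly), the arithmetic is:

```latex
\[
\mathrm{df} = n_{\text{wild}} + n_{\text{cultivated}} - 2
            = 3326 + 3326 - 2
            = 6650
\]
```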

Because initial SNP detection was based on comparisons among two cultivated accessions (B6274 and B7262) and a wild × cultivated sample (B493 × QAL) (Iorizzo et al., 2011), it is possible that ascertainment bias (Brumfield et al., 2003; Clark et al., 2005) was introduced in the development of the SNPs used in this study, causing an underestimation of minor alleles in the wild population. One method used to test the extent of ascertainment bias is to restrict the analysis to common alleles (minor allele frequency [MAF] >0.05) (De et al., 2008; Skoglund and Jakobsson, 2011). We selected a subset of 1581 SNPs that were polymorphic in the wild × cultivated hybrid, but were monomorphic in the other two cultivated lines, with MAF >0.05 in all wild and cultivated accessions. Re-estimating genetic diversity in wild and cultivated accessions revealed similar results to those obtained with the full SNP data set, with He values for wild and cultivated carrots of 0.38 ± 0.13 and 0.38 ± 0.12, respectively, confirming no significant reduction of genetic diversity (t = 0.558, df = 3160, P = 0.57) during domestication, and indicating that the SNP set used had minimal ascertainment bias.
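Restricting an analysis to common alleles amounts to computing the minor allele frequency of each SNP and keeping those above the cutoff. A minimal Python sketch for biallelic SNPs follows; the function names and the toy genotypes are hypothetical, and only the 0.05 cutoff comes from the text.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """Frequency of the rarer allele at a biallelic SNP.

    `calls` is a list of diploid genotypes such as "AT"; None = missing.
    Returns 0.0 when fewer than two alleles are observed.
    """
    alleles = Counter(a for c in calls if c is not None for a in c)
    if len(alleles) < 2:
        return 0.0
    return min(alleles.values()) / sum(alleles.values())


def common_snps(matrix, min_maf=0.05):
    """Keep SNPs whose minor allele frequency exceeds `min_maf`."""
    return {s: c for s, c in matrix.items()
            if minor_allele_frequency(c) > min_maf}


# Hypothetical toy data: two SNPs scored in 20 plants.
matrix_maf = {
    "rare": ["AA"] * 19 + ["AT"],                      # T in 1 of 40 alleles
    "common": ["AA"] * 10 + ["AT"] * 6 + ["TT"] * 4,   # T in 14 of 40 alleles
}
print({s: minor_allele_frequency(c) for s, c in matrix_maf.items()})
print(sorted(common_snps(matrix_maf)))
```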

Population structure

The log likelihood structure analysis (Pritchard et al., 2000) supported the presence of three genetically distinct clusters (CL) (K = 3) (Appendix S5, Section A1), but the ΔK method (Evanno et al., 2005) suggested two clusters (K = 2) as optimal (Appendix S5, Section A2) with the next largest peak at three clusters (K = 3). Similar studies using a large number of markers have found an inflation at K = 2 caused by the strong rejection of the null hypothesis at K = 1, which makes ΔK falsely maximal at K = 2 (Vigouroux et al., 2008). Cluster membership was assigned assuming the proportion of inferred ancestry q ≥ 0.7 (70%) for each accession. The presence of two clusters helped separate wild from cultivated accessions (Appendix S5, Sections B1, B2), with the exception of all wild carrot from Central Asia (W6), which clustered with cultivated carrot.
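For readers unfamiliar with the ΔK statistic, the sketch below shows one common way to compute it from replicate structure runs: the absolute second-order change in mean ln P(D) at K, divided by the standard deviation of ln P(D) at K. This follows the form used by widely available summary tools and is offered as an assumption about the calculation, not as the authors' exact procedure; the log-likelihood values are invented for illustration.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Evanno-style delta K from replicate log-likelihoods.

    `lnp` maps K -> list of ln P(D) values from independent runs.
    Delta K needs both neighbours (K-1 and K+1), so it is undefined
    for the smallest and largest K tested.
    """
    avg = {k: mean(v) for k, v in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 in avg and k + 1 in avg:
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            result[k] = second_diff / stdev(lnp[k])
    return result


# Invented ln P(D) values for K = 1..4, three runs each.
lnp = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-750.0, -752.0, -748.0],
    4: [-745.0, -740.0, -750.0],
}
dk = delta_k(lnp)
print(dk)
print(max(dk, key=dk.get))  # K with the largest delta K
```

Because ΔK cannot be evaluated at K = 1, a large jump in likelihood between K = 1 and K = 2 tends to produce a peak at K = 2, which is the inflation described above.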

With K = 3 (Fig. 1), CL1 mainly included western cultivated accessions from Europe and CL2 mainly included eastern cultivated accessions from the Middle East, Central Asia, South Asia, and North Asia. As was also the case for K = 2, all wild accessions from Central Asia (W6) grouped together with cultivated accessions, in particular with the eastern cultivated carrots (CL2). In addition to European accessions, CL1 included two accessions from Asia (C7-1 and C8-5) and a wild/cultivated hybrid collected in North America (W2-6) that recently became feral and was originally classified as a wild accession. Population structure of the wild accessions, which were primarily members of CL3, reflected the distinction found at K = 2, where all wild accessions except those from Central Asia and one accession from North America were distinct from cultivated accessions. With q <0.7 (70%), 11 accessions had high admixture (Fig. 1B).
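The assignment rule used here can be sketched as follows: an accession is placed in the cluster with its largest ancestry proportion when that proportion reaches the 0.7 threshold, and is otherwise treated as admixed. The Python function name, the label format, and the q vectors below are hypothetical.

```python
def assign_cluster(q, threshold=0.7):
    """Assign an accession to a cluster from its ancestry proportions.

    `q` lists the inferred ancestry proportions for clusters 1..K.
    Returns "CL<n>" when the largest proportion is at least
    `threshold`, otherwise "admixed".
    """
    best = max(range(len(q)), key=q.__getitem__)
    return f"CL{best + 1}" if q[best] >= threshold else "admixed"


# Hypothetical ancestry vectors at K = 3.
print(assign_cluster([0.85, 0.10, 0.05]))  # CL1
print(assign_cluster([0.10, 0.20, 0.70]))  # CL3
print(assign_cluster([0.50, 0.30, 0.20]))  # admixed
```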

Fig. 1. Percentage of membership (q) for (A) each of the clusters identified at K = 3 and (B) clusters as identified at q > 0.7 (70%). Asterisks in (A) indicate samples with a percentage of membership >70% that did not cluster with the original geographic location. Group codes: C, cultivated carrot; W, wild carrot; C1, South America cultivated; C3, European cultivated; C4, North Africa cultivated; C5, Middle East cultivated; C6, Central Asia cultivated; C7, South Asia cultivated; C8, North Asia cultivated; W2, North America wild; W3, European wild; W4, North Africa wild; W5, Middle East wild; W6, Central Asia wild.

To carry out analyses of molecular variance (AMOVA), samples were divided into three groups (wild, eastern cultivated, and western cultivated) based on the differentiation given by structure analysis. Wild accessions from Central Asia were included in the wild group. Grouping of the accessions by geographic distribution and cultivation status (wild or cultivated) largely matched their differentiation by structure. Most of the variation occurred within populations (82%, P = 0.001). Differentiation among the three groups, as measured by pairwise φPT, was also significant and moderate: 0.14 (P = 0.001) between eastern cultivated and western cultivated, and 0.19 (P = 0.001) both between wild and eastern cultivated and between wild and western cultivated.

Phylogenetic analysis

Maximum-likelihood analysis (Fig. 2) separated the seven wild D. carota subspecies (other than subsp. carota) as an outgroup (bootstrap = 85%) and resolved wild and cultivated carrot into two clades (bootstrap = 99%), with the cultivated clade splitting into western carrots (bootstrap = 99%) and eastern carrots (bootstrap < 75%). In addition to their geographic distribution, eastern and western carrots reflected the phenotypic separation of orange (western) vs. purple, yellow and red carrots (eastern). Within the western cultivated clade, all accessions from South America (C1) and North Africa (C4) were more closely related to each other than they were to European (C3) accessions. Wild accessions from Central Asia (W6) formed a well-supported (bootstrap = 99%) sister clade to all cultivated carrots. Wild accessions from Greece (W3-5, W3-6, W3-7) were more closely related to those from the Middle East (W5) than to those from western Europe (W3). Wild accessions from North America (W2) grouped with those from Europe (W3), supporting common ancestry. Daucus capillifolius was part of a clade with wild D. carota from North Africa, reflecting their common geographic origin, crossability, and 2n = 18 chromosome number (McCollum, 1975). Maximum likelihood and structure analyses are concordant regarding the separation of wild and cultivated accessions, the geographic partitioning within the cultivated accessions, and the placement of the two samples from eastern cultivated carrot (C7-1; C5-8) and the wild/cultivated hybrid (W2-6) from North America with European carrots.

Fig. 2. Maximum-likelihood tree of Daucus wild and cultivated species. Taxon codes correspond to online Appendix S1. Numbers on the branches indicate bootstrap support >75%. Black branches contain outgroups; green branches wild D. carota (subsp. carota) and orange branches cultivated D. carota (subsp. sativus). The colors of the dots correspond to colors in Fig. 1 indicating structure cluster membership at K = 3, where cluster 1 is noted as orange, cluster 2 yellow, cluster 3 green, and mixtures gray.

DISCUSSION

Origin of domesticated carrot

Our study provides the first molecular investigation of the location and number of domestication origins of carrot. The origin of cultivated carrot used as a storage root has generally been accepted to be either Central Asia (Vavilov, 1992) or Asia Minor (Banga, 1957b). These earlier conclusions were reached after examining the first domestication syndrome traits, which include minimal lateral root branching and a biennial growth habit that sustain nonwoody root growth, along with historical documents reporting use of carrot as a storage root. Our results strongly separate cultivated carrot from wild carrot and place wild carrots from Central Asia as the closest genetic relatives of domesticated carrot, supporting Vavilov's (1992) hypothesis.

We designed our study with relatively equal geographic subsets to explore genetic structure of wild and cultivated carrot. The strong support for a single origin of cultivated carrot in Central Asia deserves analysis of additional accessions and techniques before firm conclusions can be drawn on the number of origins. If this holds, a single origin for carrot parallels the results of other domestication studies suggesting single origins for crops like barley (Badr et al., 2000), cassava (Olsen and Schaal, 2001), maize (Matsuoka et al., 2002), rice (Huang et al., 2012), potato (Spooner et al., 2005), and emmer wheat (Özkan et al., 2002), differing from crops with multiple origins such as common bean (Sonnante et al., 1994), cotton (Wendel, 1995), and squash (Decker, 1988).

To date, only a small number of studies have used molecular markers to examine genetic diversity within and between wild and cultivated carrot, and this study represents the first genome-wide characterization of molecular diversity in carrot based on transcribed genes. Both structure and phylogenetic results separated wild vs. cultivated accessions and highlighted a geographic structuring of both wild and cultivated carrots. In contrast with our results, Bradeen et al. (2002) used 163 primarily dominant AFLP and ISSR markers from wild and cultivated carrot across 18 countries and found that molecular diversity within wild and cultivated accessions was genetically nonstructured. The observation of nonstructured genetic diversity, as suggested by Bradeen et al. (2002), could be due to the outcrossing nature of the carrot mating system, combined with a relatively small number of molecular markers. Molecular evidence for reduced differentiation among populations with an outcrossing mating system was provided by Savolainen et al. (2000), Muir et al. (2004), Glémin et al. (2006) and Rong et al. (2010). More recently, Baranski et al. (2012) used 30 SSR markers to study the genetic structure of cultivated carrot. They found a moderate separation between eastern and western cultivated carrot, suggesting that codominant markers could help to resolve genetic structure among accessions. Our data confirm the power of codominant markers and, in addition, demonstrate that a large number of markers can more clearly resolve genetic differentiation among and within wild and cultivated subpopulations.

Several domestication models have been proposed to simulate or predict possible complications when analyzing genome-wide molecular data to investigate domestication origins. One model developed by Allaby et al. (2008) found that an annual crop with an effective population size of 1500 individuals and no recombination would appear monophyletic 98% of the time after approximately 3000 yr (2N generations). However, given that carrot domestication occurred only ∼1000 yr ago and that carrot is an outcrossing species, in which the effective population size is usually larger than in self-pollinated species (Charlesworth, 2003; Glémin et al., 2006), it is unlikely that the genetic signatures of the domestication syndrome were lost. Ross-Ibarra and Gaut (2008) argued that Allaby's models assume unrealistically small effective population sizes and do not consider genetic mutations, and they found through simulation tests that multiple domestications do not appear monophyletic. Following these observations, with the phylogenetic analysis of samples included in our study, mutations were accounted for under a transition/transversion evolutionary model. In addition, several authors (Smith, 2001; Olsen and Gross, 2008) have suggested that understanding the dynamics of crop domestication always benefits from combining genetic and historical data. Along with historical documents of carrot domestication, these considerations support a single origin of cultivated carrot in Central Asia.
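The time scale quoted from the Allaby et al. (2008) model follows directly from the stated effective population size. As a worked illustration (the last line is our own assumption that a biennial crop completes at most one seed-to-seed generation every 2 yr, and is not a figure from the text):

```latex
\[
\begin{aligned}
2N_e &= 2 \times 1500 = 3000 \ \text{generations}
      \;\approx\; 3000 \ \text{yr for an annual crop},\\
\text{carrot:}\quad &\sim 1000 \ \text{yr since domestication}
      \;\Rightarrow\; \text{at most} \ \sim 1000/2 = 500 \ \text{generations if biennial}.
\end{aligned}
\]
```

Under these assumptions the time elapsed since carrot domestication is well short of the 2N generations considered in that model.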

Origin of orange pigment accumulation

Several hypotheses have been proposed to explain the origin of orange carrots: (1) Vilmorin (1859) concluded that orange carrots were selected in Europe directly from wild carrots; (2) Small (1978) and Thellung (1927) discussed the possibility that orange carrot had a Mediterranean origin, resulting from a hybridization event with D. carota subsp. maximus; (3) Banga (1957b) concluded that orange carrots were selected from yellow cultivated carrots; and (4) Heywood (1983) concluded that orange carrots were hybrids between European cultivated and wild carrots. It should be noted that none of these hypotheses was based on genetic analysis; rather, they were based on taxonomic interpretations, historical documents, and the geographical distribution of wild carrot and orange cultivated carrot. From the 10th through 15th centuries, phenotypic selections of domesticated carrot root color were, perhaps surprisingly to people today, yellow and purple. Orange root color, through the accumulation of high levels of carotenes, could be considered a secondary domestication event or a selection from cultivated carrot. Written documents describing orange carrots did not appear until 1721, with the description of the “Long Orange” and several “Horn” types (Banga, 1957b, 1963), although the orange carrot appeared in numerous Renaissance paintings as early as 1515 (Stolarczyk and Janick, 2011) and perhaps earlier. In this study, we used open-pollinated cultivated varieties of diverse colors (orange, yellow, purple, red, and white). Our results demonstrated that wild carrots from Europe and all wild subspecies (including D. carota subsp. maximus), all of which have white roots, grouped into two separate clades that are phylogenetically distinct from all cultivated carrot, contrary to the hypotheses of Vilmorin (1859), Thellung (1927), Small (1978), or Heywood (1983).
In fact, considering the recent selection for orange root color in carrot, it is unlikely that the genetic footprint of a hybridization event between cultivated carrot and European wild or a wild subspecies would have had time to disappear. In contrast, the fact that orange carrots used in this study form a sister clade with all other cultivated carrots (yellow, red, and purple) supports the idea that orange carrot was selected from cultivated carrot. Furthermore, genetic evidence suggests that two recessive genes, y and y2, play a major role in the accumulation of yellow and orange carotenoids in the root (Just et al., 2009). This observation, along with our study, provides support for Banga's (1957b) hypothesis that orange root color was selected out of yellow, domesticated carrots.

Carrot diversity, breeding system and gene flow

Artificial selection of wild progenitors during domestication often results in a genome-wide reduction of genetic diversity within cultivated crops (Tanksley and McCouch, 1997). Such a reduction of genetic diversity caused by domestication has been observed in soybean (Hyten et al., 2006), rice (Londo et al., 2006), barley (Morrell and Clegg, 2007), maize (Tenaillon et al., 2004), and wheat (Haudry et al., 2007). In contrast to these crops, we observed no apparent reduction in the genetic diversity of cultivated vs. wild carrot. Similar results were reported by St. Pierre and Bayer (1991) following an isozyme marker analysis of cultivated and wild carrot. Einkorn wheat (Kilian et al., 2007), chicory (Van Cutsem et al., 2003), and pepino (Blanca et al., 2007) represent other crops where no reduction of genetic diversity or bottleneck has been reported with domestication.

The strength of a genetic bottleneck during domestication depends on the duration of the domestication event, mating system of the organism, and breeding practices. The mating system of carrot is outcrossing with severe inbreeding depression observed in both wild and cultivated carrot. The facts that carrot is predominantly an outcrossing species and never clonally propagated (Rong et al., 2010) and that outcrossing species can maintain high levels of diversity within populations (Glémin et al., 2006) could help explain the maintenance of genetic diversity in carrot cultivars. Carrot breeders, like those of other outcrossing crops, used an open-pollinated breeding approach prior to the development of hybrid cultivars in the 1950s (Simon, 2000). While a physical separation of 4 to 5 km is recognized as necessary to maintain genetic purity of diverse carrot seed stocks today (Simon, 2000), it would seem likely that the need for this isolation was not recognized, or not able to be maintained, by early carrot breeders, such that diverse root colors were likely intercrossing routinely during carrot domestication to generate a wide range of phenotypic and genetic variation. Analyses of genetic diversity among domesticated crops have found open-pollinated varieties to maintain a higher level of genetic diversity than hybrid cultivars (Rauf et al., 2009). It is therefore quite probable that since open-pollinated seed production was used to propagate carrot during domestication, a high level of genetic diversity could have been maintained.

As part of the outcrossing mating system, bidirectional gene flow between wild and cultivated carrot could have also played a role in retaining genetic diversity during domestication. Wild carrot is found in most areas where cultivated carrot seed production has historically occurred, and consequently, gene flow from wild to cultivated carrot likely occurred throughout the history of this crop, as suggested by Simon (2000). Gene flow from wild to cultivated carrot is readily recognized as white off-types in orange-rooted cultivars, and commercial seed producers since the 1850s (e.g., Vilmorin, 1859) have noted clear evidence of gene flow from wild to cultivated carrot. Perhaps orange carrot storage root color became popular in the 16th and 17th centuries as a visual tool for carrot breeders to keep cultivated carrot relatively free from outcross contamination. In contrast, outcrosses of yellow- or purple-rooted carrots with wild carrot yield hybrids with reduced color intensity, but clear identification of outcrosses is difficult (Simon, 1996). Because of documented gene flow between wild and cultivated carrot, it is possible that a substantial number of alleles may be shared by these two groups. However, both our study and previous studies using molecular and phenotypic data (Small, 1978; Bradeen et al., 2002) clearly separate wild and cultivated accessions, suggesting that artificial selection against nonadapted phenotypes may have limited gene flow between wild and cultivated carrot.

These observations establish a possible scenario for why there was no apparent reduction of allelic diversity during carrot domestication. Potentially, after domestication, the introgression of alleles from wild carrots, combined with the use of an open-pollinated breeding system, maintained high levels of genetic diversity in domesticated carrot. With the discovery of cytoplasmic male sterility (CMS) (Welch and Grimball, 1947), carrot breeding has dramatically changed from an open-pollinated system to a hybrid-based system. The SNP resources developed in our study will be useful for assessing the genetic diversity in modern hybrid carrot cultivars to further investigate a potential reduction of genetic diversity that may accompany that recent change in breeding system.
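As an illustration of how such an assessment might be set up, the following minimal sketch compares mean expected heterozygosity between two groups of accessions genotyped at biallelic SNPs. The choice of statistic (2pq per locus), the genotype coding (0, 1, 2 copies of the alternate allele, with -1 for a missing call), and the toy genotypes are all assumptions made for illustration; they are not the data or the exact method of the present study.

```python
from typing import Sequence


def expected_heterozygosity(genotypes: Sequence[Sequence[int]]) -> float:
    """Mean expected heterozygosity (2pq) across biallelic SNPs.

    genotypes[i][j] is the number of alternate alleles (0, 1, or 2)
    carried by individual i at SNP j; -1 marks a missing call
    (a hypothetical coding chosen for this sketch).
    """
    n_snps = len(genotypes[0])
    total = 0.0
    counted = 0
    for j in range(n_snps):
        # Keep only non-missing calls at this SNP.
        calls = [ind[j] for ind in genotypes if ind[j] >= 0]
        if not calls:
            continue  # SNP with no data contributes nothing
        p = sum(calls) / (2 * len(calls))  # alternate-allele frequency
        total += 2 * p * (1 - p)
        counted += 1
    return total / counted if counted else float("nan")


# Toy genotypes (illustrative only, not data from the study).
wild_toy = [[0, 1, 2], [2, 1, -1], [1, 0, 2]]
hybrid_toy = [[0, 0, 2], [0, 0, 2], [1, 0, 2]]

print("wild:", expected_heterozygosity(wild_toy))
print("hybrid cultivars:", expected_heterozygosity(hybrid_toy))
```

A lower value in one group than in the other would be consistent with a reduction of diversity, although a formal comparison would also need to account for sample size and ascertainment of the SNPs.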

Origin of wild carrot in the New World

Wild carrot occurs widely across North America, where it is commonly known as Queen Anne's lace. However, the origin of its introduction has been unclear. One hypothesis is that wild carrot was introduced to the New World as a weed from Asia with the first human migration to North America, over 15000 yr ago. Two additional hypotheses are that carrot was introduced as a weed or as an escape from cultivation with the arrival of European settlers. The potential for the second scenario is supported by the evidence found by Magnussen and Hauser (2007) of adult plants derived from the hybridization between cultivated and wild carrots growing wild in close proximity to carrot fields.

The distant date of a possible introduction from Asia with the first human immigration could have resulted in a loss of the ancestral Asian genetic footprint, since there may have been enough time to accumulate mutations resulting in a distinctive North American genetic footprint. But this scenario would likely still show evidence of a relationship to its Asian progenitor. The proportion of alleles of inferred ancestry, as determined by structure analysis, demonstrated that wild accessions from North America were mainly in the wild cluster rather than the cultivated cluster (Fig. 1). In addition, phylogenetic analysis demonstrated that wild carrot from North America (W2) grouped in the same clade as wild carrot from Europe (W3) (Fig. 2). These findings support the hypothesis that North American wild carrot was introduced from Europe as a weed, but they do not support either a hypothesis of Asian origin or a hypothesis of escape from cultivation.
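The assignment of accessions to a cluster from inferred ancestry proportions can be sketched as follows. This is a minimal illustration only: the accession names, the proportions, and the 0.5 majority threshold are hypothetical and are not values from the structure analysis reported here.

```python
from typing import Dict


def majority_cluster(proportions: Dict[str, float], threshold: float = 0.5) -> str:
    """Return the cluster holding the largest ancestry proportion.

    An accession whose largest proportion does not exceed `threshold`
    (an assumed cutoff for this sketch) is labeled "admixed".
    """
    name, value = max(proportions.items(), key=lambda item: item[1])
    return name if value > threshold else "admixed"


# Hypothetical ancestry proportions for three made-up accessions.
accessions = {
    "NA_wild_1": {"wild": 0.92, "cultivated": 0.08},
    "NA_wild_2": {"wild": 0.50, "cultivated": 0.50},
    "cultivar_1": {"wild": 0.10, "cultivated": 0.90},
}

for accession, q in accessions.items():
    print(accession, majority_cluster(q))
```

Under this simple rule, an accession escaped from cultivation would be expected to fall in the cultivated cluster or to be labeled admixed, which is the pattern the North American wild accessions did not show.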

Wild carrot populations and related taxa

Our results strongly place (98% bootstrap) D. capillifolius with the wild carrot accessions from northern Africa, suggesting that D. capillifolius is more closely related to D. carota than previously classified. This result is supported by an analysis of available species within Daucus with multiple nuclear orthologs (Spooner et al., in press). Daucus capillifolius was described by Gilli (1958) from a population collected in northwestern Libya. Daucus capillifolius is distinguished by yellow corollas and long, narrow leaf segments unlike anything else known in Daucus, but the fruits, which have long narrow spines, and the strongly contracting (“birdsnest”) fruiting umbel are typical of D. carota. McCollum (1975) made hybrids between these two species and found them to be fully intercrossable, fertile, and morphologically intermediate, and suggested that D. capillifolius may best be treated as a subspecies of D. carota. Sáenz Laín (1981) and Spalik and Downie (2007) provided further support for treating D. capillifolius as a subspecies of D. carota.

The rapid advancement in high-throughput SNP genotyping technologies, along with next-generation sequencing (NGS) technologies, has provided essential genomic resources for accelerating the molecular understanding of biological properties. This rapid development has decreased the cost and improved the quality of large-scale genome surveys, and has made these technologies accessible for specialty crops such as carrot (Egan et al., 2012). The present study provides new insights into carrot domestication and establishes a new framework to explore the genome of carrot and its relatives at functional, structural, and evolutionary levels.