Volume 88, Issue 10 p. 1888-1902
Systematics

Granule-bound starch synthase (GBSSI) gene phylogeny of wild tomatoes (Solanum L. section Lycopersicon [Mill.] Wettst. subsection Lycopersicon)

Iris E. Peralta

Vegetable Crops Research Unit, USDA, Agricultural Research Service, Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706–1590 USA

Current address: Department of Biological Sciences, National University of Cuyo, Almirante Brown 500, C. C. 7, 5505 Chacras de Coria, Lujan, Mendoza, Argentina (e-mail: [email protected]).

David M. Spooner

Vegetable Crops Research Unit, USDA, Agricultural Research Service, Department of Horticulture, University of Wisconsin, 1575 Linden Drive, Madison, Wisconsin 53706–1590 USA

Author for reprint requests (e-mail: [email protected]).

First published: 01 October 2001

This paper represents a portion of a Ph.D. thesis submitted to the Plant Breeding and Plant Genetics Program of the University of Wisconsin, Madison. The authors especially thank Charles Rick for his advice and for choosing and providing the accessions from the C. M. Rick Tomato Genetic Resources Center; committee members Paul Berry, Robert Hanneman, Michael J. Havey, Kenneth J. Sytsma, and Phillip W. Simon for research and editorial advice; Harvey Ballard, Peter Crump, Brian Karas, Joe Kuhl, Sabina Lara Cabrera, Jason Lilly, Cindy Muralles, and Sarah Stevenson for technical or statistical advice; Gregory Anderson for providing accessions used in our study; and Sandra Knapp, Rodger Evans, and two unknown reviewers for editorial comments. The Fulbright-LASPAU Fellowship Program, the National Scientific and Technological Research Council of Argentina (CONICET), the National University of Cuyo (UNC), and the USDA funded research. Names are necessary to report data. However, the USDA neither guarantees nor warrants the standard of the product, and the use of the name by the USDA implies no approval of the product to the exclusion of others that may also be suitable.

Abstract

Eight wild tomato species are native to western South America and one to the Galapagos Islands. Different classifications of tomatoes have been based on morphological or biological criteria. Our primary goal was to examine the phylogenetic relationships of all nine wild tomato species and closely related outgroups, with a concentration on the most widespread and variable tomato species Solanum peruvianum, using DNA sequences of the structural gene granule-bound starch synthase (GBSSI, or waxy). Results show some concordance with previous morphology-based classifications and new relationships. The ingroup comprised a basal polytomy composed of the self-incompatible green-fruited species S. chilense and the central to southern Peruvian populations of S. peruvianum, S. habrochaites, and S. pennellii. A derived clade contains the northern Peruvian populations of S. peruvianum (also self-incompatible, green-fruited), S. chmielewskii, and S. neorickii (self-compatible, green-fruited), and the self-compatible and red- to orange- to yellow-fruited species S. cheesmaniae, S. lycopersicum, and S. pimpinellifolium. Outgroup relationships are largely concordant with prior chloroplast DNA restriction site phylogenies, support S. juglandifolium and S. ochranthum as the closest outgroup to tomatoes with S. lycopersicoides and S. sitiens as basal to these, and support allogamy, self-incompatibility, and green fruits as primitive in the tomato clade.

Wild tomatoes (Solanum L. subsect. Lycopersicon, an autonym of Solanum section Lycopersicon (Mill.) Wettst.) are entirely American in distribution, growing in the western South American Andes from central Ecuador through Peru to northern Chile and in the Galapagos Islands, where the endemic species S. cheesmaniae grows (Fig. 1; see Table 1 for taxonomic authorities). In addition, S. lycopersicum, the wild ancestor of cultivated tomatoes, is more widespread and perhaps more recently distributed into Mexico, Colombia, Bolivia, and other South American countries (Rick and Holle, 1990). Wild tomatoes grow in a variety of habitats, from near sea level to >3300 m in elevation (Rick, 1973; Taylor, 1986). These habitats include the arid Pacific coast to the mesic uplands in the high Andes. All species are diploid (2n = 24; Rick, 1979). Breeding systems vary from allogamous self-incompatible to facultative allogamous and self-compatible to autogamous and self-compatible (Rick, 1963, 1979, 1984, 1986; Table 1). The self-incompatibility system in tomatoes is gametophytic and controlled by a single, multiallelic S locus (Tanksley and Loaiza-Figueroa, 1985).

Linnaeus (1753) first recognized tomatoes in the genus Solanum, but Miller (1754) recognized tomatoes in the genus Lycopersicon Miller. Miller's recognition of Lycopersicon continues to be used by the majority of botanists and plant breeders today. However, a minority of authors (MacBride, 1962; Seithe, 1962; Heine, 1976; Fosberg, 1987; Child, 1990; Spooner, Anderson, and Jansen, 1993; Bohs and Olmstead, 1999; Knapp and Spooner, 1999; Olmstead et al., 1999) treat tomatoes in Solanum (Peralta and Spooner, 2000). Molecular data from chloroplast DNA restriction site and sequence data (Spooner, Anderson, and Jansen, 1993; Bohs and Olmstead, 1997, 1999; Olmstead and Palmer, 1997; Olmstead et al., 1999) firmly place tomatoes and potatoes as sister taxa, justifying this treatment on phylogenetic grounds. The continued maintenance of Lycopersicon can be justified only on convenience and the maintenance of nomenclatural stability. Börner (1912) also recognized the close affinity between tomatoes and potatoes and proposed a new genus Solanopsis to include them. D'Arcy (1987) and Lester (1991) also recognized the close relationship of the genera Solanum and Lycopersicon but maintained them as distinct for nomenclatural stability.

The two most recent and complete taxonomic treatments of tomatoes are those of Müller (1940) and Luckwill (1943; Fig. 2). Müller (1940) considered Lycopersicon as a distinct genus and divided it into two subgenera, Eulycopersicon C. H. Müll., with two species possessing glabrous and red- to orange-colored fruits, flat seeds, bractless inflorescences, and leaves without pseudostipules, and Eriopersicon C. H. Müll., with four species bearing pubescent green fruits, thick seeds, bracteate inflorescences, and leaves usually with pseudostipules. Three years later, Luckwill (1943) adopted the same supraspecific categories but recognized different infraspecific taxa and five species in the subgenus Eriopersicon (Fig. 2). These treatments have become outdated as the number of species and races collected from South America has increased (Rick, 1971, 1991; Holle, Rick, and Hunt, 1978, 1979; Taylor, 1986).

Rick (1960, 1979) proposed an infrageneric classification based mainly on crossing relationships. He recognized nine wild tomato species, classified into two complexes. The Esculentum complex includes seven species, mainly self-compatible, and easily crossed with the cultivated tomato. Within this group three species, L. esculentum, L. pimpinellifolium, and L. cheesmaniae, have mostly glabrous, pigmented fruits, while the others have pubescent, green fruits. The Peruvianum complex includes the self-incompatible species S. peruvianum and S. chilense, which have pubescent green fruits that seldom cross with L. esculentum.

In a recent taxonomic treatment, Child (1990; Fig. 2) placed the tomatoes in Solanum subgenus Potatoe (G. Don) D'Arcy, section Lycopersicon, subsection Lycopersicon, and segregated them into three series: Eriopersicon (C. H. Müll.) Child, Lycopersicon, and Neolycopersicon (Correll) Child. He also hypothesized that Solanum subsection Lycopersicoides Child and section Juglandifolium (Rydb.) Child are the closest relatives of section Lycopersicon. The different criteria used in classification, namely morphology and crossability, have therefore led to different numbers of species, subspecies, and varieties and to conflicting hypotheses of interspecific relationships (Fig. 2).

Molecular data provide numerous and independent characters useful for plant phylogenetic inference (Sytsma, 1990). Single or low-copy nuclear genes are of great interest in molecular systematics because they are potentially most informative for species-level comparisons. Nuclear gene introns, such as in the genes Phy, Pgi, adh-1, and GapA, evolve faster and have been considered useful for angiosperm phylogenetic studies (Soltis and Soltis, 1998). In the perennial relatives of soybean (Glycine) sequences from the histone H3-D intron provided useful characters for assessing relationships at the species level (Doyle, Kanazin, and Shoemaker, 1996).

The phylogenetic utility of the structural nuclear gene for granule-bound starch synthase (GBSSI or waxy gene) was first explored by Mason-Gamer and Kellogg (1996) in grasses. The intron–exon regions revealed an interesting structure with potential utility at different taxonomic levels. The GBSSI introns provided more phylogenetically informative characters, and higher levels of variation, than the ITS region in grasses (Mason-Gamer, Weil, and Kellogg, 1998). A similar sequence variation was found among Ipomoea species comparing ITS with three exons (9, 10, 11; Fig. 3) and two introns, comprising in total 651 nucleotide sites, of the 3′ end of GBSSI (Miller, Raush, and Manos, 1999). Nevertheless, ITS sequences showed more variation than GBSSI sequences within the genus Physalis, although only 700 base pairs (bp) of GBSSI were analyzed (Whitson and Manos, 1999).

Many studies documented GBSSI as a single-copy gene in cereals (Shure, Wessler, and Fedoroff, 1983; Klösgen et al., 1986; Rohde, Becker, and Salamini, 1988; Wang et al., 1990; Clark, Robertson, and Ainsworth, 1991) and in dicots (van der Leij et al., 1991; Dry et al., 1992; Mérida et al., 1999; Wang, Yeh, and Tsai, 1999). It has been shown that GBSSI is a low-copy number gene in allotetraploid cassava, but it was unclear whether cassava has a single gene with different alleles at a locus or more than one GBSSI locus (Salehuzzaman, Jacobsen, and Visser, 1993). In the Rosaceae, more than one copy of the GBSSI locus is present (Evans et al., 2000). The GBSSI gene was cloned in potato (van der Leij et al., 1991) and comprises 4663 bp, has 13 introns, and encodes a 58.2 kilodalton mature protein with 540 amino acids. Because potatoes and tomatoes are supported as sister groups by morphology and cpDNA data, the GBSSI gene sequence information of van der Leij et al. (1991) was used to design primers in tomatoes. Our study is the first phylogenetic application of GBSSI in the Solanaceae using the 5′ portion of this nuclear gene, from the first until the eighth exon, including seven introns and comprising 1311 nucleotides.

The main goal of our research was to examine the phylogenetic relationships of all nine wild tomato species and closely related outgroups, with a concentration on the most widespread and variable tomato species, Solanum peruvianum, using DNA sequences of the structural gene granule-bound starch synthase. A secondary goal was to compare these results to previous systematic treatments in wild tomatoes that use morphology (Müller, 1940; Luckwill, 1943; Child, 1990), crossing relationships (Rick, 1963, 1979, 1986), isozyme (Rick and Fobes, 1975a, b; Rick and Tanksley, 1981; Rick, 1983, 1986; Bretó, Asins, and Carbonell, 1993), or molecular data (Palmer and Zamir, 1982; McClean and Hanson, 1986; Miller and Tanksley, 1990).

MATERIALS AND METHODS

Plants

In the ingroup (Solanum section Lycopersicon subsection Lycopersicon) three accessions per species were analyzed for S. cheesmaniae, S. chmielewskii, S. chilense, S. habrochaites, S. lycopersicum var. cerasiforme, S. neorickii, S. pennellii; five accessions for S. pimpinellifolium; and 39 for S. peruvianum. Many more accessions of S. peruvianum were analyzed because this species is the most widespread and has the greatest degree of morphological, molecular, and crossability variation (Rick, 1963, 1979, 1986; Miller and Tanksley, 1990). One accession of a cultivated tomato, S. lycopersicum cv. ‘Mountain Spring,’ was also included. The outgroups were selected according to the cpDNA results of Spooner, Anderson, and Jansen (1993). Like the ingroup, they are all diploid at 2n = 24 (Table 2). Two accessions per species were included in the outgroup taxa S. juglandifolium, S. lycopersicoides, S. ochranthum, S. sitiens and one accession per species in the outgroup taxa S. bulbocastanum, S. etuberosum, S. jamesii, S. muricatum and S. palustre. In total, 79 accessions were used for the phylogenetic analysis (Table 1). All the tomato (Solanum sect. Lycopersicon subsection Lycopersicon) accessions and outgroups in sect. Lycopersicon subsection Lycopersicoides and sect. Juglandifolium were obtained from the Tomato Genetics Resource Center, Department of Vegetable Crops, University of California, Davis. All potato accessions (Solanum sect. Petota) and members of sect. Etuberosum were obtained from the United States Potato Introduction Station, National Research Support Program-6 (NRSP-6), Sturgeon Bay, Wisconsin, USA (Bamberg et al., 1996); one accession of sect. Basarthrum was obtained from Dr. Gregory Anderson, University of Connecticut, Storrs, Connecticut, USA. Dr. Charles Rick, curator of the Tomato Genetics Resource Center, provided advice for the choice of tomato accessions based on geographic distribution, morphology, genetic diversity, and breeding behavior. Vouchers for these accessions were deposited at DAV, MERL, PTIS, and WIS.
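The sampling described above can be tallied as a quick consistency check. The following is a minimal sketch in Python; the per-taxon counts are taken from the text, while the dictionary layout and variable names are illustrative assumptions.

```python
# Accession counts per taxon as stated in the text. Grouping them into two
# dictionaries is an illustrative choice, not part of the original study.
INGROUP = {
    "S. cheesmaniae": 3,
    "S. chmielewskii": 3,
    "S. chilense": 3,
    "S. habrochaites": 3,
    "S. lycopersicum var. cerasiforme": 3,
    "S. neorickii": 3,
    "S. pennellii": 3,
    "S. pimpinellifolium": 5,
    "S. peruvianum": 39,
    "S. lycopersicum cv. 'Mountain Spring'": 1,
}
OUTGROUP = {
    "S. juglandifolium": 2,
    "S. lycopersicoides": 2,
    "S. ochranthum": 2,
    "S. sitiens": 2,
    "S. bulbocastanum": 1,
    "S. etuberosum": 1,
    "S. jamesii": 1,
    "S. muricatum": 1,
    "S. palustre": 1,
}

ingroup_total = sum(INGROUP.values())
outgroup_total = sum(OUTGROUP.values())
total_accessions = ingroup_total + outgroup_total
print(ingroup_total, outgroup_total, total_accessions)
```

The ingroup and outgroup subtotals add up to the 79 accessions reported for the phylogenetic analysis.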

DNA isolation and purification

Total genomic DNA was isolated from young leaves of single plants using a mini-extraction procedure kindly provided by Dr. Dario Bernacchi from Dr. Stephen Tanksley's laboratory at Cornell University, Ithaca, New York, USA. Three to eight small leaflets were transferred to a 1.5-mL plastic tube, frozen with liquid nitrogen, and ground, and then 750 μL of extraction buffer was added. The DNA extraction buffer was prepared with 1 volume of buffer A (1275.4 g Sorbitol, 242.2 g Tris, 37.22 g EDTA Na2 salt, 20 L of ddH2O, pH = 8.26, and 3.802 g/L of Na-bisulfite [the pH was adjusted to 7.5 after the addition of Na-bisulfite]), 1 volume of buffer B (nuclei lysis buffer: 200 mL of 1 mol/L Tris, 200 mL of 0.25 mol/L EDTA, 400 mL of 5 mol/L NaCl, 20 g of CTAB, 200 mL of ddH2O, pH 7.5), and 0.4 volumes of 5% Sarcosyl (50 g/L of N-lauroylsarcosine). The sample was incubated in a 65°C water bath for 30–60 min. The tube was filled with chloroform : isoamyl alcohol (24 : 1), vortexed vigorously, and microcentrifuged at 10 000 rpm for 5 min. The aqueous phase was transferred into a new 1.5-mL tube, and 2/3–1 volume of cold isopropanol was added to precipitate the DNA. The sample was microcentrifuged at 10 000 rpm for 5 min, the supernatant was discarded, and the DNA was washed with 70% ethanol. The DNA was resuspended in 50 μL of TE (10 mL of 1 mol/L Tris, pH 8, 4 mL of 0.25 mol/L EDTA, pH 7, 986 mL of ddH2O). The crude total DNAs were cleaned using acetate salts (7.5 mol/L ammonium acetate and 2.5 mol/L sodium acetate). Purified DNAs were used as templates for PCR amplification using the DNeasy Plant Mini Kit (Qiagen, Valencia, California, USA).
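The mixing ratio of the extraction buffer (1 : 1 : 0.4 by volume) can be turned into per-sample volumes with simple proportions. The sketch below assumes the 750 μL added to each tube is the complete mixed buffer; the function name and component labels are hypothetical.

```python
def extraction_buffer_volumes(total_ul, parts=None):
    """Split a total buffer volume (in microliters) by volume ratio.

    The default ratio follows the text: 1 volume buffer A, 1 volume
    buffer B, 0.4 volumes of 5% Sarcosyl. Component names are illustrative.
    """
    if parts is None:
        parts = {"buffer_A": 1.0, "buffer_B": 1.0, "sarcosyl_5pct": 0.4}
    total_parts = sum(parts.values())
    return {name: total_ul * share / total_parts for name, share in parts.items()}


# Volumes needed for the 750 uL added to one sample tube.
volumes = extraction_buffer_volumes(750.0)
for name, ul in volumes.items():
    print(f"{name}: {ul:.1f} uL")
```

Under that assumption, each tube receives about 312.5 μL of buffer A, 312.5 μL of buffer B, and 125 μL of 5% Sarcosyl.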

Generation of GBSSI sequences

The oligonucleotide primers were developed based on the GBSSI sequences of S. tuberosum obtained from GenBank. Two oligonucleotides were designed as external primers (Fig. 3) for the polymerase chain reaction (PCR), in the first exon region at position 136 (5′-GATGGGCTCCAATCAAGAACTAAT-3′) and in the eighth exon at position 1639 (5′-GCCATTCACAATCCCAGTTATGC-3′) of the published sequence of GBSSI (van der Leij et al., 1991), using the software program Oligo (Rychlik, 1998). Within this region (1311 bp) there are seven introns (from second to eighth introns) of lengths 84, 84, 101, 83, 80, 90, and 83 bp, respectively (total of 602 bp; van der Leij et al., 1991). This region was amplified using PCR for all accessions. In all cases, a single PCR product was obtained (Fig. 3).
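Basic properties of the two external primers can be checked with a few lines of code. The sketch below uses the primer sequences given above; the helper functions are illustrative and are not part of the Oligo software used by the authors.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# External primer sequences as given in the text (5' to 3').
FORWARD = "GATGGGCTCCAATCAAGAACTAAT"
REVERSE = "GCCATTCACAATCCCAGTTATGC"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), "nt, GC =", round(gc_fraction(primer), 3))

# The reverse primer anneals to the sense strand; its reverse complement is
# the stretch one would expect to find in the published sense-strand sequence.
print(reverse_complement(REVERSE))
```

The reverse complement of the reverse primer is the string to search for when locating the primer site on a sense-strand reference sequence.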

Amplification reactions were conducted in a total volume of 50 μL with a Perkin Elmer programmable thermal controller. Each reaction comprised Promega (Madison, Wisconsin, USA) reaction buffer (1× final concentration), MgCl2 (2.5 mmol/L), dNTPs (200 μmol/L each), the two external primers (0.4 mmol/L each), bovine serum albumin (0.4 mmol/L), template DNA (50–100 ng/μL), and Promega Taq DNA polymerase (2.5 units/μL).

The amplification reaction was conducted for 4 min at 94°C, then 5 cycles of 30 sec at 94°C, 1 min at 50°C, and 1.5 min at 72°C, then 30 cycles of 30 sec at 94°C, 1 min at 55°C, and 1.5 min at 72°C, with a 10-min extension at 72°C, ending with 4°C hold. To determine whether amplification was successful, samples were run on a 1.0–1.2% agarose minigel stained with ethidium bromide and viewed with ultraviolet light. As a control, a lambda clone of the GBSSI gene from potato, kindly provided by Dr. Richard Visser (Wageningen, The Netherlands), was used. A DNA ladder (Gibco BRL low mass DNA ladder; Life Technologies, Rockville, Maryland, USA) was included to estimate the length and concentration of the amplified products.
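The cycling program described above can be written down as a small data structure, which makes the programmed hold time easy to compute. This is a sketch only: the stage labels are invented for readability, the final 4°C hold is left out, and ramping time between temperatures is ignored.

```python
# Each stage: (label, number of repeats, [(temperature in C, minutes), ...]).
# Temperatures and times follow the text; the labels are illustrative.
PCR_PROFILE = [
    ("initial denaturation", 1, [(94, 4.0)]),
    ("first cycles, 50 C annealing", 5, [(94, 0.5), (50, 1.0), (72, 1.5)]),
    ("main cycles, 55 C annealing", 30, [(94, 0.5), (55, 1.0), (72, 1.5)]),
    ("final extension", 1, [(72, 10.0)]),
]


def programmed_minutes(profile):
    """Sum the hold times of all steps, excluding ramp time between steps."""
    return sum(
        repeats * sum(minutes for _temp, minutes in steps)
        for _label, repeats, steps in profile
    )


total_cycles = sum(repeats for _label, repeats, steps in PCR_PROFILE if len(steps) == 3)
print(total_cycles, "cycles,", programmed_minutes(PCR_PROFILE), "min of programmed holds")
```

With these values the program comprises 35 three-step cycles and 119 min of programmed hold time before the 4°C hold.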

Successful reactions were purified for cycle sequencing using cleaning columns (Qiagen, Valencia, California, USA). Concentration of amplified cleaned product was checked with a fluorimeter and adjusted to 10–30 ng per 20 μL reaction. Cycle sequencing reactions were carried out using the ABI Prism Dye Terminator Cycle Sequencing Kit Reaction with AmpliTaq DNA Polymerase FS, and with Big Dye (Perkin-Elmer, Applied Biosystems, Foster City, California, USA) according to manufacturer's instructions.

Four internal primers (Fig. 3) were designed based on the potato GenBank sequence to match conserved exon regions, beginning at position 174 (GBSSI A: 5′-CAAGATGGCATCCAGAACTGAGA-3′), 622 (GBSSI B: 5′-CACTGCTATAAACGTGGGGTTGA-3′), 1555 (GBSSI CR: 5′-GGCATAGTATGGGCTCACTGTAA-3′), and 1126 (GBSSI DR: 5′-GGAATGAGAGCTGTGTGCCAATC-3′) of the published sequence of GBSSI (van der Leij et al., 1991). Sequences were obtained from both the forward (A and B) and reverse (CR and DR) primers.

Cycle sequencing was conducted for 25 cycles, each cycle consisting of 96°C for 30 sec, 46°C for 15 sec, and 60°C for 4 min. Sequencing products were precipitated in 50 μL of 95% ethanol and 2 μL of 3 mol/L sodium acetate, and then vacuum-dried. The samples were resuspended in formamide loading buffer (84% formamide, 4 mmol/L EDTA, 8 mg/mL Blue Dextran) and electrophoresed on a Perkin-Elmer Applied Biosystems 377XL automated DNA sequencer.

Some of the GBSSI sequences showed polymorphisms, particularly in the third intron, which was difficult to align. A similar protocol was used to amplify GBSSI in the problematic accessions, but in this case using Pfu polymerase (Promega, Madison, Wisconsin, USA), which produces fewer errors during PCR amplification than Taq polymerase. Cleaned PCR products were then cloned into Promega's pGEM-T Easy vector. The vector contains a 3′ terminal thymidine overhang, and T7 and SP6 RNA polymerase promoters flanking multiple cloning regions including the α-peptide coding region of the enzyme β-galactosidase and multiple restriction sites. Because Pfu polymerase produces blunt ends, an A-tailing reaction was performed using Taq DNA polymerase before the ligation. Ligation, transformation, and plating were done according to the protocols provided by the manufacturer. Insertional inactivation of the α-peptide allows recombinant clones to be directly identified by color screening on indicator plates. Plasmid preparations were done with alkaline lysis (Wizard plus plasmid purification systems; Promega, Madison, Wisconsin, USA). Purified plasmid DNA was digested with EcoRI, and cut and uncut products were checked on a 0.9% agarose gel for the presence of inserts. A lambda control of GBSSI was included to check the size of the band. Two molecular ladders were also included (Gibco BRL low and high mass DNA ladders; Life Technologies, Rockville, Maryland, USA), and concentrations of the inserts were estimated by visual comparisons of band intensity. Plasmid minipreps containing the GBSSI fragment were cycle sequenced as described above. Six to 12 clones were sequenced for each individual plant per accession (Table 2).

Sequence alignment

DNA sequences initially were edited using Sequencher software (Gene Codes Corporation, Ann Arbor, Michigan, USA). These initially aligned sequences were then imported into PAUP* version 4.0d64 (Swofford, 1998) and were further aligned visually.

Phylogenetic analysis

Parsimony analyses of the DNA sequences were conducted with PAUP* 4.0d64 (Swofford, 1998). Solanum muricatum was used to root the trees based on the results of Spooner, Anderson, and Jansen (1993) and Olmstead and Palmer (1997). The first analyses were performed using Fitch parsimony, which allows free reversibility (change of characters in any direction among the nucleotide states). All the characters were included and were equally weighted, and gaps were considered as missing data. An initial tree was generated by jackknife (36.8% deletion, 10 000 replicates, and fast stepwise addition) (Farris et al., 1996). This tree was used as a starting tree for further heuristic searches using tree-bisection-reconnection (TBR) swapping algorithm, with retention of all equally parsimonious trees (MULPARS) with stepwise addition and steepest descent. The amount of homoplasy was evaluated with the consistency index (CI; Kluge and Farris, 1969) and the retention index (RI; Farris, 1989). Consensus trees were generated from all most parsimonious trees. Bootstrap analysis (Felsenstein, 1985), using 1000 replicates, and jackknife, also with 1000 replicates (36.8% deletions; Farris et al., 1996), were conducted to estimate the internal relative support for each branch. Also, a decay analysis (Bremer, 1988; Donoghue et al., 1992) was calculated using an heuristic TBR search. Bootstrap and decay values are placed on the strict consensus tree (Fig. 5).
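The Fitch parsimony criterion and the consistency index mentioned above can be illustrated on a toy example. The sketch below is not the PAUP* implementation: it scores a fixed, rooted binary tree for a tiny invented alignment, treats gaps as missing data, and computes an ensemble CI as the minimum possible number of changes divided by the observed number of steps. The taxon names, sequences, and tree are hypothetical.

```python
NUCLEOTIDES = frozenset("ACGT")


def fitch(tree, states):
    """Fitch downpass for one character on a rooted binary tree.

    tree is a taxon name (leaf) or a pair (left, right); states maps taxon
    names to a nucleotide, '-' or '?'. Returns (state set, number of steps).
    """
    if isinstance(tree, str):
        state = states[tree]
        # Gaps and unknowns are treated as missing data: any state fits.
        return (set(NUCLEOTIDES) if state in "-?" else {state}), 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def columns(alignment):
    """Yield one {taxon: state} mapping per aligned position."""
    length = len(next(iter(alignment.values())))
    for i in range(length):
        yield {taxon: seq[i] for taxon, seq in alignment.items()}


def tree_length(tree, alignment):
    """Total number of Fitch steps over all characters."""
    return sum(fitch(tree, col)[1] for col in columns(alignment))


def consistency_index(tree, alignment):
    """Ensemble CI: minimum possible changes divided by observed steps."""
    minimum = sum(
        max(len({s for s in col.values() if s in NUCLEOTIDES}) - 1, 0)
        for col in columns(alignment)
    )
    observed = tree_length(tree, alignment)
    return minimum / observed if observed else 1.0


# Hypothetical four-taxon data set and tree, for illustration only.
toy_alignment = {"w": "AAC", "x": "AGC", "y": "GAC", "z": "GG-"}
toy_tree = (("w", "x"), ("y", "z"))
print(tree_length(toy_tree, toy_alignment), consistency_index(toy_tree, toy_alignment))
```

In this toy case the first character fits the tree with one step, the second requires two steps (one more than its minimum, i.e., homoplasy), and the third is constant, so the CI falls below 1.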

In the second analysis, a similar heuristic search was performed but the insertion and deletion characters (indels) were scored as multistate characters attached to the nucleotide site data (Baum, Sytsma, and Hoch, 1994). The CI and RI, as well as the bootstrap values and decay support, were calculated and compared with the previous trees.

RESULTS

GBSSI sequence variation within individuals

About 15% of the 79 accessions showed some sites with two, and sometimes three, overlapping nucleotide peaks in the same position in the GBSSI sequence. These polymorphic sites were always found within the intron regions, mainly at the beginning of the third intron. To ascertain the nature of these variants, the GBSSI DNA fragments were cloned and sequenced from seven of these individuals that displayed infraindividual polymorphism (S. cheesmaniae LA1450, S. chilense LA1930, S. peruvianum LA110, LA111, LA1336, LA1556, LA2152). Table 2 lists the number of GBSSI clones that were sequenced from these seven individuals, the number of distinct sequence types found, and their nucleotide variation and position. A putative microsatellite (AT)n was found in the third intron, being n = 1 in potato and n = 2 up to n = 9 in the other accessions.
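The repeat number n of an (AT)n motif can be read from a sequence by finding the longest uninterrupted run of the dinucleotide. The following is a minimal sketch; the function name and the example sequences are invented for illustration and are not GBSSI data.

```python
import re


def longest_at_repeat(seq):
    """Return the largest n such that (AT)n occurs uninterrupted in seq."""
    runs = re.findall(r"(?:AT)+", seq.upper())
    return max((len(run) // 2 for run in runs), default=0)


# Invented example sequences, not taken from the GBSSI alignment.
print(longest_at_repeat("GGCATCCTT"))       # single AT
print(longest_at_repeat("GGCATATATCCTT"))   # three consecutive AT units
```

Applied to the third-intron region of each cloned sequence, such a count would give the n values compared among accessions.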

For some accessions (e.g., S. chilense LA1930 and S. cheesmaniae LA1450) all clones were identical, yet these differed from the sequences of the PCR product generated using Taq polymerase. We assumed that the sequences obtained from the clones were the most accurate, because Pfu polymerase produces fewer errors. These sequences were included in the cladistic analysis. Sequences from five accessions of S. peruvianum differed by substitutions in the exon and intron regions and deletions/insertions only in the intron regions (Table 2). We initially included all sequence types in a cladistic analysis, but most were so similar (differing only by 1–3 base pairs) that they made no difference in the cladistic results. Others differed by up to 3 base pairs and were entered separately in the final analysis (indicated in italic boldface, Figs. 4, 5). In all cases they remained in the same part of the cladogram. In addition, cloned sequences of S. pennellii LA1376 have complex polymorphisms, suggesting that it is a possible hybrid; we do not include these results here but are investigating this further.

Cladistic results

The aligned data matrix consisted of 1384 characters. Additionally we scored 20 indels as binary or multistate characters. Indels with potential phylogenetic information were more frequent in the introns, and few were detected in the exons. The GBSSI intron sequences are at base pair positions 22–185, 267–380, 480–622, 713–830, 895–983, 1085–1176, 1287–1374 of the aligned sequences (data available in Peralta [2000] and at: http://ajbsupp.botany.org/v88/peralta.html). Most gap characters were present in the outgroups (18 indels). In the phylogenetic analysis of the data, considering gaps as characters and using 134 phylogenetically informative characters, 15 000 equally most parsimonious trees were saved (length = 347, CI = 0.88, RI = 0.95; Fig. 4). Identical strict consensus trees (Fig. 5) were generated considering gaps as missing data or as characters.
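The intron coordinates given above partition the 1384-character alignment into intron and exon blocks. The sketch below derives the block lengths and the complementary exon ranges from those coordinates (1-based, inclusive); the function and variable names are illustrative.

```python
ALIGNMENT_LENGTH = 1384

# Intron positions in the aligned matrix, as listed in the text.
INTRONS = [
    (22, 185), (267, 380), (480, 622), (713, 830),
    (895, 983), (1085, 1176), (1287, 1374),
]


def span_lengths(ranges):
    """Length of each 1-based inclusive (start, end) range."""
    return [end - start + 1 for start, end in ranges]


def complement_ranges(ranges, total_length):
    """Return the ranges of positions not covered by the sorted input ranges."""
    gaps, position = [], 1
    for start, end in ranges:
        if start > position:
            gaps.append((position, start - 1))
        position = end + 1
    if position <= total_length:
        gaps.append((position, total_length))
    return gaps


EXONS = complement_ranges(INTRONS, ALIGNMENT_LENGTH)
intron_total = sum(span_lengths(INTRONS))
exon_total = sum(span_lengths(EXONS))
print("intron characters:", intron_total)
print("exon characters:", exon_total)
print("exon ranges:", EXONS)
```

By these coordinates, 808 of the aligned characters fall in introns and 576 in the eight flanking and intervening exon blocks, which is the kind of partition needed to tabulate indels or informative sites separately for introns and exons.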

Bootstrap analysis using 1000 replicates gave excellent support for the outgroups (86–100%). Almost identical support values were found with the jackknife method. The ingroup had much less resolution and consisted of a basal polytomy, from which some branches were resolved, and a terminal clade. The basal polytomy was composed entirely of the self-incompatible green-fruited species S. chilense (map localities 29, 36, 38; Fig. 1), the central to southern Peruvian populations of S. peruvianum (24 populations, plus three additional sequence types, map localities 18, 19, 21, 22, 23, 24, 28, 29, 30, 32, 33, 34, 35, 36, 37), S. habrochaites (map localities 3, 11, 24), and S. pennellii (map localities 19, 28, 31). The latter two species are resolved within this basal polytomy, with bootstrap values of 86 and 96%, respectively. Three other branches of S. peruvianum were supported within this basal polytomy with bootstrap values of 62–99%. Two of these branches contain populations of S. peruvianum in the same generalized geographic area (map locality 21) or contiguous generalized areas (21, 22, and 23).

The derived clade (70% bootstrap value) contains 14 northern Peruvian populations of S. peruvianum (also self-incompatible, green-fruited, map localities 6, 9, 10, 11, 12, 13, 15), and one central Peruvian population of S. peruvianum (LA1609, map locality 22), S. chmielewskii and S. neorickii (self-compatible, green-fruited, map localities 25, 26, 27 and 7, 17, 20, respectively), and the self-compatible and red- to orange-fruited species S. cheesmaniae, S. lycopersicum (map localities 5, 21, 32), and S. pimpinellifolium (map localities 2, 7, 8, 23). The latter three species cluster together as a polytomy within this advanced clade, but include one northern accession of S. peruvianum (LA2163, green-fruited, self-incompatible). The derived clade includes two groups of S. peruvianum with bootstrap values higher than 50%, and a branch corresponding to the red- to orange- to yellow-fruited self-compatible species, with a bootstrap value of 24%, containing S. cheesmaniae, S. lycopersicum, S. pimpinellifolium, and one accession of S. peruvianum. For branches with <50% bootstrap support the decay index = 1, for branches between 50 and 70% the decay index = 2, and for branches between 70 and 95% the decay index = 3 (Fig. 5).

DISCUSSION

The GBSSI sequences provided good resolution at the outgroup level and helped clarify relationships among five of the nine sections recognized by Child (1990) within Solanum subgenus Potatoe. Those results are entirely concordant with the cpDNA phylogeny of Solanum (Spooner, Anderson, and Jansen, 1993) regarding interspecific relationships of Solanum section Lycopersicon subsection Lycopersicoides, sect. Juglandifolium, sect. Basarthrum, sect. Petota, and sect. Etuberosum. The only difference is the cladistic separation of the three tuber-bearing members of sect. Petota, S. tuberosum on one branch and S. bulbocastanum + S. jamesii on a different branch, but this is likely a problem of taxon-sampling density.

The GBSSI sequences provided less phylogenetic information at the ingroup than at the outgroup level, but still provided insights into taxonomic relationships among tomatoes. These results support S. juglandifolium and S. ochranthum as the closest outgroup to tomatoes, with S. lycopersicoides and S. sitiens as basal to these taxa, and support allogamy, self-incompatibility, and green fruits as primitive.

Mating systems have played an important role in the evolution of wild tomatoes. Self-incompatibility has been considered as an ancestral condition to self-compatibility in tomatoes and probably never reversed to self-incompatibility (Rick, 1982). In this evolutionary context, changes from self-incompatibility to self-compatibility are expected to arise frequently and independently in different lineages (Rick, 1982). Interestingly, S. habrochaites and S. pennellii have both self-incompatible and self-compatible populations. The self-incompatible populations have higher genetic variation and occupy the center of their species geographic distribution. Self-compatible populations occur toward the northern and southern edges of distribution and have less genetic variation and smaller flower parts (Rick, Fobes, and Tanksley, 1979; Rick and Tanksley, 1981; Rick, 1984). The change from self-incompatibility to self-compatibility has been reported in only one population of S. peruvianum LA2157 from Chotano, Peru (Rick, 1986). Self-compatibility and the red to orange to yellow fruit color in S. cheesmaniae, S. lycopersicum, and S. pimpinellifolium have been considered as derived synapomorphies for this monophyletic group (Palmer and Zamir, 1982). The two closely related species, S. chmielewskii and S. neorickii, are also self-compatible. The latter species is exclusively autogamous with low population genetic variation and small flowers with stigmas included in the anther tube. In contrast, the facultative allogamous S. chmielewskii has larger flower parts and higher levels of heterozygosity. It has been postulated that S. neorickii evolved from S. chmielewskii (Rick et al., 1976). All the populations of S. chilense, and the most closely related species outgroups, S. lycopersicoides, S. sitiens, S. ochranthum, and S. juglandifolium, are exclusively self-incompatible. In Solanum sect. Petota, sect. Etuberosum and sect. Basarthrum self-compatibility most probably arose independently.

Within the ingroup, the phylogenetic analyses of GBSSI sequences identified S. pennellii and S. habrochaites as well-supported branches within the basal polytomy. Solanum peruvianum is supported as paraphyletic, with the taxa separating into two geographical groups, one in northern Peru (“northern group”) and another in central and southern Peru (“southern group”). All the red- to orange- to yellow-fruited self-compatible autogamous species (S. cheesmaniae, S. lycopersicum, and S. pimpinellifolium) and the green-fruited facultative allogamous species (S. chmielewskii and S. neorickii) are derived and on the same clade as these northern populations of S. peruvianum.

The GBSSI sequences presented very low variation in the autogamous species, compared with the considerable diversity found in all allogamous species. These results are in concordance with isozyme data (Rick and Fobes, 1975a, b; Rick and Tanksley, 1981; Rick, 1983, 1986; Bretó, Asins, and Carbonell, 1993) and nRFLP data that showed self-incompatible species to harbor, on average, ten times more genetic variation within accessions than the self-compatible ones (Miller and Tanksley, 1990). In fact, more variation was found within a single accession of the self-incompatible species S. peruvianum than among all accessions tested of any one of the self-compatible species (Miller and Tanksley, 1990). The estimated amount of DNA polymorphism reflects breeding systems; consequently, less within-species variation was found in the selfing species than in the outcrossing ones (Wolfgang and Langley, 1998). Similarly, low levels of DNA polymorphism were found in molecular linkage maps based on a cross between two closely related autogamous species, S. lycopersicum and S. pimpinellifolium, compared with the higher level of polymorphisms found in maps based on interspecific crosses between S. lycopersicum and other allogamous species (Chen and Foolad, 1998). Results from different molecular markers demonstrated the difficulty of finding polymorphisms among self-compatible species. More variable markers, such as microsatellites used to identify tomato cultivars (Broun and Tanksley, 1996; Smulders et al., 1997), may be better suited to compare relationships among autogamous tomato species.

The low level of genetic variation found within self-compatible species may be explained by autogamy, which drives the loss of variation and the fixation of alleles (Rick, 1984). The low variation among the self-compatible species shown by cpDNA (Palmer and Zamir, 1982), allozymes (Bretó, Asins, and Carbonell, 1993), nRFLP (Miller and Tanksley, 1990), molecular linkage maps (Chen and Foolad, 1998), and GBSSI sequences may reflect recent lineage divergence (Moniz de Sá and Drouin, 1996).

The maternal phylogenetic results from cpDNA restriction sites of Palmer and Zamir (1982) showed that the red- to orange- to yellow-fruited self-compatible autogamous species (S. cheesmaniae, S. lycopersicum, and S. pimpinellifolium) formed a clearly derived monophyletic group. The nRFLP data showed the same result (Miller and Tanksley, 1990). The same group has been classified as series Lycopersicon in a morphological treatment by Child (1990; Fig. 2). The GBSSI data were also concordant in supporting the derived nature of the red-fruited autogamous species, and the monophyly of this group was confirmed (but with a low bootstrap value of 24%), except for the inclusion of a single accession, S. peruvianum LA2163 from Yamalúc (Cajamarca Province, Peru). This northern accession came from the same area as, and is genetically closely related to, the only self-compatible population of this species, S. peruvianum LA2157 from Tunel Chotano (Cajamarca Province, Peru). These northern Peruvian populations have been considered ancestral stocks (Rick, 1986), and S. peruvianum LA2163 may represent a progenitor population of this derived clade.

Critical results revealed by the GBSSI data are the paraphyly of S. peruvianum and the close relationship of the northern populations with the self-compatible taxa, also supported by nRFLP (Miller and Tanksley, 1990). The morphological and crossability differences between the northern and southern populations of S. peruvianum have long been recognized. The northern populations were separated as Lycopersicon [Solanum] peruvianum var. humifusum C. H. Müll. based on plants with decumbent stems, very short trichomes, and leaves with few leaflets (Müller, 1940). Rick (1963, 1986) demonstrated that these northern races were partially crossable among themselves but had reduced crossability with the southern populations that grow south of 8° S latitude (Rick, 1986). In addition, Miller and Tanksley (1990) showed that these two groups could be distinguished by nuclear restriction fragment length polymorphism data, although only one population of the northern group was analyzed, S. peruvianum var. humifusum LA2150 (not included in our GBSSI study). Only one central Peruvian population, S. peruvianum LA1609, was found within the northern S. peruvianum clade. Interestingly, this accession is geographically and morphologically related to two other coastal races of S. peruvianum, LA107 and LA1373 (not included in our GBSSI study), that have been used as crossing bridge populations between the northern and the southern populations of S. peruvianum (Rick, 1963, 1986). The inclusion of S. peruvianum LA1609 within the northern population clade suggests the possibility of naturally occurring gene exchange between the northern and southern groups.

The GBSSI gene sequence results also showed the close relationship of two self-compatible species, S. chmielewskii and S. neorickii, with the northern S. peruvianum group. Similarly, the nRFLP data supported a northern population, S. peruvianum var. humifusum LA2150, as the most closely related to S. chmielewskii and S. neorickii. In the maternal phylogenetic results from cpDNA restriction sites, S. chmielewskii also appeared closely related to a northern population, S. peruvianum LA1032 (not included in our GBSSI study), and Palmer and Zamir (1982) proposed the inclusion of S. chmielewskii as a subspecies of S. peruvianum. In the phylogenetic GBSSI gene sequence results, the allogamous self-incompatible S. chilense was included within the basal polytomy with the southern races of S. peruvianum. A similar relationship was found with cpDNA restriction sites, and Palmer and Zamir (1982) also proposed the inclusion of S. chilense as a subspecies of S. peruvianum.

Mitochondrial DNA (mtDNA) restriction fragment length studies were used to address relationships of nine Lycopersicon species and two closely related Solanum species (McClean and Hanson, 1986). The divergence estimates were based on the ratio between the number of shared fragments and all fragments evaluated. The mtDNA divergence is higher than that in cpDNA, indicating that the DNA of the two organelles is evolving at different rates (McClean and Hanson, 1986). Tomato mitochondrial genomes exhibit a small amount of divergence, and the variability was more limited within the self-compatible S. lycopersicum (0.04%) than between this species and the self-incompatible S. sitiens (2.7%). The inferred phylogeny from the GBSSI data confirms the position of this latter taxon as an outgroup species. Nevertheless, cpDNA, nRFLP, and GBSSI results were not in agreement with the species relationships shown by the mitochondrial results. These differences can be explained by the possibility of mitochondrial genome rearrangements that could result in estimates of divergence greater than the true values (McClean and Hanson, 1986). Incorrect assessment of the homology of the mtDNA bands is another possible explanation for the incongruence in species relationships among data sets.
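To make the shared-fragment ratio concrete, the following is a minimal sketch only. The exact estimator used by McClean and Hanson (1986) is not given here, so the formula below (shared fragments divided by all distinct fragments scored in the two profiles) is an assumed reading of the description, and the function name and fragment sizes are hypothetical.

```python
def shared_fragment_ratio(frags_a, frags_b):
    """Proportion of shared fragments among all distinct fragments scored.

    Assumed reading of 'shared fragments / all fragments evaluated';
    fragments are matched by exact size, which is a simplification.
    """
    a, b = set(frags_a), set(frags_b)
    total = len(a | b)  # every distinct fragment seen in either profile
    if total == 0:
        raise ValueError("no fragments scored")
    return len(a & b) / total  # fragments present in both profiles


# Hypothetical fragment sizes (kb), invented for illustration only.
profile_x = [1.2, 2.5, 3.1, 4.8, 6.0]
profile_y = [1.2, 2.5, 3.1, 5.5, 6.0]

ratio = shared_fragment_ratio(profile_x, profile_y)
print(f"shared-fragment ratio: {ratio:.3f}")
```

Under this reading, a lower ratio corresponds to greater divergence. The sketch also shows why band homology matters: two comigrating but non-homologous bands would be counted as shared and would bias the ratio.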

In agreement with crossing data, GBSSI gene sequence results distinguished the northern and southern S. peruvianum groups and also supported the close relationship of this species with S. chilense (Rick, 1963, 1979, 1986). Nevertheless, the molecular results of this study do not support the placement of S. pennellii and S. habrochaites within the Esculentum group (Rick, 1979), and S. chmielewskii and S. neorickii were more closely related to the northern S. peruvianum group than to the red- to orange- to yellow-fruited self-compatible species. The GBSSI results support a close relationship among green-fruited species, in agreement with morphological treatments of wild tomatoes (Müller, 1940; Luckwill, 1943).

Multiple GBSSI clones obtained from a single individual had identical sequences for S. chilense LA1930 and S. cheesmaniae LA1450 (Table 2). Nevertheless, multiple clones obtained from a single PCR performed on each of five different individuals, each representing one of five S. peruvianum accessions, differed by nucleotide substitution and deletion/insertion events (Table 2). In four cases (S. peruvianum LA110, LA111, LA1336, and LA2152) more than two GBSSI sequence variants were identified. Theoretically, in diploid wild tomato species, a single nuclear gene such as GBSSI has up to two different alleles in a heterozygote. True allelic variation at a single locus could explain the presence of two distinct GBSSI sequences. However, the presence of more than two variants could be caused by artifactual errors arising from nucleotide misincorporation during the PCR amplification of the GBSSI gene (Shimada and Tada, 1991). Another possible explanation for the observed intraindividual variation is the presence of more than one GBSSI locus (Salehuzzaman, Jacobsen, and Visser, 1993; Evans et al., 2000). In spite of the variation found among clones, all the variants used in the cladistic analysis fell within the same clade.
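The reasoning about clone variants can be illustrated with a short sketch. It simply tallies distinct sequence types among clones from one individual and flags cases with more types than a diploid, single-locus gene could carry as alleles. The function name, the `max_alleles` parameter, and the toy sequences are hypothetical and are not data from Table 2.

```python
from collections import Counter


def summarize_clones(clone_seqs, max_alleles=2):
    """Count distinct sequence types among clones from one individual.

    max_alleles=2 encodes the diploid, single-locus expectation:
    at most two alleles in a heterozygote.
    """
    counts = Counter(clone_seqs)
    return {
        "n_clones": len(clone_seqs),
        "n_types": len(counts),
        # (sequence, number of identical clones), most frequent first
        "identical_per_type": counts.most_common(),
        # True suggests PCR error or more than one locus, not alleles alone
        "exceeds_expectation": len(counts) > max_alleles,
    }


# Toy sequences, invented for illustration only.
uniform = summarize_clones(["ACGTA"] * 4)
mixed = summarize_clones(["ACGTA"] * 3 + ["ACGTT"] * 2 + ["ACCTA"])

print(uniform["n_types"], uniform["exceeds_expectation"])
print(mixed["n_types"], mixed["exceeds_expectation"])
```

A flag of this kind cannot by itself distinguish PCR misincorporation from a second locus. It only separates cases that allelic variation can explain from those that need one of the additional explanations discussed above.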

The GBSSI DNA data, like other molecular markers used in tomatoes to date, were not able to completely differentiate very closely related, and most probably recently diverged, species, especially the self-compatible ones. The GBSSI data provide useful information about intraspecific variation in S. peruvianum and about character evolution regarding breeding systems and fruit color. A future paper will use these same accessions to investigate the morphological support for species boundaries and relationships.

Table 1. Accessions of wild tomatoes (Solanum sect. Lycopersicon [Mill.] Wettst. subsection Lycopersicon) and outgroups examined for morphological and GBSSI gene sequence variation. Vouchers are deposited at BM, MERL, PTIS, and WIS. We used the Solanum equivalents of the Lycopersicon species as recognized by Rick, Laterrot, and Philouze (1990), with the addition of the varieties and forms of S. cheesmaniae, S. habrochaites, S. pennellii, and S. peruvianum as listed in the database of germplasm at the C. M. Rick Tomato Genetics Resource Center (http://tgrc.ucdavis.edu).
image
Table 2. Number of distinct GBSSI sequences generated from different clones obtained from a single PCR product of a single individual. Each individual represents one of the seven accessions listed below. The number of identical clones is indicated in parentheses. Numbers indicate the exon or intron position as illustrated in Fig. 3. A, C, G, T are nucleotides, and nucleotide substitutions are indicated with a slash. For example, all 12 clones sequenced for GBSSI for S. chilense LA1930 were identical. However, for S. peruvianum LA110, we obtained six distinct GBSSI sequence types in 10 clones, and four of those clones were identical, sharing a sequence with a G/A transition in intron 7, a T/A transversion in intron 8, and an ATA deletion in intron 3.
image

Fig. 1. Distribution of the wild tomato accessions used in this study. Map numbers correspond to generalized map localities in Table 1. A line drawn at ∼10° S indicates the northern (N) and southern (S) edges of distribution of “north” and “south” Solanum peruvianum populations that reflect the phylogenetic relationships shown in Fig. 5.

Fig. 2. Comparison of classifications of Solanum L. section Lycopersicon [Mill.] Wettst. subsection Lycopersicon. Species in boldface have red to orange to yellow fruits and those without boldface have green fruits. The numbers in parentheses represent numbers of infraspecific ranks (subspecies, varieties, and forms). The lines connect synonymous taxa.

Fig. 3. Schematic of the structural potato GBSSI (waxy) gene analyzed in this study. The promoter sequence (TACAAAT), translational start (ATG), stop codon (TAA), and polyadenylation sites (polyA) are indicated. The black box represents an untranslated exon, shaded boxes the 13 translated exons, and interconnecting lines the introns. The fragment obtained by the polymerase chain reaction comprises ∼1525 base pairs (from the first to the eighth exon) and is indicated with a line below the schematic. Arrowheads indicate the position and direction of primers 5′ and 3′ that were used to amplify the fragment and of forward primers A and B and reverse primers CR and DR that were used for cycle sequencing. The scale represents the total length of the complete genomic nucleotide sequence of potato GBSSI (modified from van der Leij et al., 1991).

Fig. 4. One representative phylogram of 15 000 equally parsimonious trees (indels treated as characters), length = 347, consistency index = 0.88, retention index = 0.95. Different sequence types are indicated in italic boldface. Numbers indicate branch lengths, and unnumbered branches have a length of 1.

Fig. 5. Strict consensus tree of 15 000 most parsimonious trees (indels treated as characters) based on GBSSI sequences. Percentage of 1000 bootstrap replicates is given above branches. Decay values between 1 and 3 are in parentheses; the other branches in the outgroup are at least 13 steps longer. Distinct GBSSI sequence types from different clones obtained from single individuals of seven accessions are indicated in italic boldface and a number (see Table 2). Map localities correspond to Fig. 1 and Table 1; G represents accessions from the Galapagos Islands, and S represents unmapped continental accessions. “North per” refers to accessions from the northern range of S. peruvianum, while “south per” refers to accessions from central and southern populations of S. peruvianum, as shown in Fig. 1.