Escherichia coli is a normal inhabitant of the intestines of most animals, including humans. Some
E. coli strains can cause a wide variety of intestinal and extra-intestinal diseases, such as diarrhea, urinary tract infections, septicemia, and neonatal meningitis (
18). Phylogenetic analyses have shown that
E. coli strains fall into four main phylogenetic groups (A, B1, B2, and D) (
10,
21) and that virulent extra-intestinal strains belong mainly to group B2 and, to a lesser extent, to group D (
4,
7,
12,
19), whereas most commensal strains belong to group A. These studies have also given us a better understanding of how pathogenic strains acquire virulence genes (
4). Currently, phylogenetic grouping can be done by multilocus enzyme electrophoresis (
10,
20) or ribotyping (
2-4,
8), but both of these reference techniques are complex and time-consuming and also require a collection of typed strains.
Creation of a subtractive library for two
E. coli strains belonging to different phylogenetic groups (
6) and characterization of an anonymous 14.9-kb fragment strongly associated with neonatal meningitis strains (
3) suggested that certain genes or DNA fragments might be specific phylogenetic group markers. Three candidate markers were studied further: (i)
chuA, a gene required for heme transport in enterohemorrhagic O157:H7
E. coli (
6,
16,
23,
24); (ii)
yjaA, a gene initially identified in the recent complete genome sequence of
E. coli K-12, the function of which is unknown (
5); and (iii) an anonymous DNA fragment designated TSPE4.C2 from our subtractive library (
6). Here we describe a rapid technique for determining the phylogenetic groups of
E. coli strains based on PCR detection of the
chuA and
yjaA genes and DNA fragment TSPE4.C2. The method was evaluated by testing 230 strains that had already been grouped by using reference methods.
Bacterial strains and growth conditions.
The 72 strains of the ECOR collection (
17) were kindly provided by R. Selander (Department of Biology, University of Rochester, Rochester, N.Y.). These reference strains, isolated from a variety of hosts and geographic locations, are representative of the range of genotypic variation in the species. Sixty-eight of these strains belong to the four main phylogenetic groups (A, B1, B2, and D), and four are unclassified (
10,
21). We also tested a set of 86
E. coli strains causing neonatal meningitis (ECNM strains) (
4), 34
E. coli strains responsible for neonatal septicemia without meningitis, 30
E. coli strains isolated from feces of healthy neonates, the J96 uropathogenic
E. coli strain (O4:K6) (kindly provided by J. Hacker, Institut für Molekulare Infektionsbiologie, Würzburg, Germany) (
22), and 10 verotoxin-producing
E. coli O157:H7 strains, including 1 strain obtained from A. D. O'Brien (Uniformed Services University of the Health Sciences, Bethesda, Md.) and 9 strains isolated from different locations in France (P. Mariani, Hôpital Robert-Debré, Paris, France). The phylogenetic group distribution of 69 of the 86 ECNM strains has been published previously (
4). The other 17 ECNM strains and the 75 remaining clinical isolates were classified by ribotyping as previously described (
4). The
E. coli laboratory K-12 strain MG1655, which belongs to phylogenetic group A, was also studied (
10). Bacteria were grown at 37°C in Luria-Bertani broth or on Luria-Bertani agar. When necessary, ampicillin (100 μg per ml) was used.
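As a quick consistency check, the strain sets listed above can be tallied against the 230 strains evaluated. The sketch below is illustrative only; it assumes that the four unclassified ECOR strains are not counted among the 230 (an inference from the counts given in the text), and the dictionary labels are ours.

```python
# Strain sets as listed in the text. Assumption: the four unclassified
# ECOR strains are excluded from the 230 strains used for evaluation.
strain_sets = {
    "ECOR strains with an assigned group": 68,
    "neonatal meningitis (ECNM)": 86,
    "neonatal septicemia without meningitis": 34,
    "feces of healthy neonates": 30,
    "uropathogenic J96": 1,
    "verotoxin-producing O157:H7": 10,
    "laboratory K-12 MG1655": 1,
}

total = sum(strain_sets.values())

# Clinical isolates other than the ECNM strains (classified by ribotyping).
other_clinical = sum(
    strain_sets[name]
    for name in (
        "neonatal septicemia without meningitis",
        "feces of healthy neonates",
        "uropathogenic J96",
        "verotoxin-producing O157:H7",
    )
)

print(total, other_clinical)
```

Under that assumption, the total matches the 230 strains analyzed and the second figure matches the 75 remaining clinical isolates.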
PCR amplification and Southern blotting.
As a first step, PCR was performed with a standard protocol. Each reaction was carried out by using a 20-μl mixture containing 2 μl of 10× buffer (supplied with
Taq polymerase), 20 pmol of each primer, each deoxynucleoside triphosphate at a concentration of 2 μM, 2.5 U of
Taq polymerase (ATGC Biotechnologie, Noisy-le-Grand, France), and 200 ng of genomic DNA. The PCR was performed in a Perkin-Elmer GeneAmp 9600 thermal cycler with MicroAmp tubes under the following conditions: denaturation for 5 min at 94°C; 30 cycles of 30 s at 94°C, 30 s at 55°C, and 30 s at 72°C; and a final extension step of 7 min at 72°C. The primer pairs used were ChuA.1 (5′-GACGAACCAACGGTCAGGAT-3′) and ChuA.2 (5′-TGCCGCCAGTACCAAAGACA-3′), YjaA.1 (5′-TGAAGTGTCAGGAGACGCTG-3′) and YjaA.2 (5′-ATGGAGAATGCGTTCCTCAAC-3′), and TspE4C2.1 (5′-GAGTAATGTCGGGGCATTCA-3′) and TspE4C2.2 (5′-CGCGCCAACAAAGTATTACG-3′), which generate 279-, 211-, and 152-bp fragments, respectively. In a simplified protocol, a two-step triplex PCR based on previously described methods (
1,
9) was assessed. The components of the reaction mixture were the same as those in the standard protocol, except that (i) DNA was directly provided by 3 μl of bacterial lysate or a piece of a colony, (ii) the six above-mentioned primers were mixed, and (iii) the PCR steps were as follows: denaturation for 4 min at 94°C, 30 cycles of 5 s at 94°C and 10 s at 59°C, and a final extension step of 5 min at 72°C.
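Because the three amplicons differ in size (279, 211, and 152 bp), the bands of a triplex reaction can be read as the presence or absence of each marker. The following is a minimal sketch of that reading, not part of the published protocol: the function name `call_markers` and the 15-bp sizing tolerance are assumptions made for illustration, whereas the primer sequences and fragment sizes are those given above.

```python
# Primer pairs and expected amplicon sizes (bp) as given in the text.
PRIMERS = {
    "chuA": ("GACGAACCAACGGTCAGGAT", "TGCCGCCAGTACCAAAGACA", 279),
    "yjaA": ("TGAAGTGTCAGGAGACGCTG", "ATGGAGAATGCGTTCCTCAAC", 211),
    "TSPE4.C2": ("GAGTAATGTCGGGGCATTCA", "CGCGCCAACAAAGTATTACG", 152),
}


def call_markers(band_sizes_bp, tolerance_bp=15):
    """Map estimated gel band sizes to marker presence or absence.

    tolerance_bp is a hypothetical sizing tolerance. It must stay well
    below the smallest gap between expected sizes (211 - 152 = 59 bp)
    so that one band cannot be attributed to two markers.
    """
    return {
        marker: any(abs(size - expected) <= tolerance_bp for size in band_sizes_bp)
        for marker, (_forward, _reverse, expected) in PRIMERS.items()
    }


# Example: a lane with bands estimated at about 280 and 150 bp.
profile = call_markers([280, 150])
print(profile)
```

In this example the lane would be read as positive for chuA and TSPE4.C2 and negative for yjaA.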
Southern blotting was performed by capillary transfer to positively charged nylon membranes. Hybridization was performed at 65°C in 1% sodium dodecyl sulfate–1 M NaCl–50 mM Tris HCl (pH 7.5)–1% blocking reagent (Boehringer, Mannheim, Germany). The membranes were washed in 2× SSC for 15 min at room temperature, then in 2× SSC–0.1% sodium dodecyl sulfate for 30 min at 65°C, and finally in 0.1× SSC for 5 min at room temperature (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate). Chemiluminescence detection was performed according to the manufacturer's instructions (DIG Luminescent Detection Kit for nucleic acids; Boehringer). The probes were produced by PCR according to the manufacturer's instructions (PCR DIG Probe Synthesis Kit; Boehringer) by using the primers and amplification procedure described above for the standard protocol.
PCR grouping results.
A total of 230 strains were analyzed. Their phylogenetic groups, as determined by reference methods, were as follows: 43 belonged to group A, 23 belonged to group B1, 51 belonged to group D, and 113 belonged to group B2. Table
1 shows the PCR results for the entire set of strains according to their phylogenetic groups. The
chuA gene was present in all strains belonging to groups B2 and D and was absent from all strains belonging to groups A and B1. This allowed us to separate groups B2 and D from groups A and B1. In the same way, the
yjaA gene allowed perfect discrimination between group B2 (100% of the strains were positive) and group D (100% of the strains were negative). Finally, clone TSPE4.C2 was present in all but two of the group B1 strains and absent from all group A strains. All the PCR results were confirmed by Southern hybridization (data not shown). The results of these three amplifications made it possible to establish a dichotomous decision tree (Fig.
1) for phylogenetic grouping. With this tree, 228 of the 230 strains tested (99%) were correctly grouped, while only 2 strains were wrongly classified (group B1 strains were identified as group A strains). Identical results were obtained with standard and simplified PCR protocols. Figure
2 shows the different profiles obtained by triplex PCR for the four phylogenetic groups.
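The dichotomous decision tree described above can be written as a short function. This is an illustrative sketch that follows the order of tests stated in the text (chuA first, then yjaA or TSPE4.C2); the function name `phylogroup` and the boolean argument names are ours.

```python
def phylogroup(chuA, yjaA, tspE4C2):
    """Assign a phylogenetic group from the three PCR results.

    Each argument is True if the corresponding fragment was amplified.
    """
    if chuA:
        # chuA-positive strains belong to group B2 or D;
        # yjaA then separates B2 (positive) from D (negative).
        return "B2" if yjaA else "D"
    # chuA-negative strains belong to group A or B1;
    # TSPE4.C2 then separates B1 (positive) from A (negative).
    return "B1" if tspE4C2 else "A"


# Agreement with the reference methods reported in the text.
correct, tested = 228, 230
accuracy_percent = round(100 * correct / tested, 1)

print(phylogroup(chuA=True, yjaA=False, tspE4C2=True))  # D
print(accuracy_percent)  # 99.1
```

Note that yjaA is consulted only for chuA-positive strains and TSPE4.C2 only for chuA-negative strains, so a group B1 strain that lacks TSPE4.C2 is necessarily assigned to group A, as observed for the two misclassified strains.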
In this work, we developed a PCR method to rapidly determine the phylogenetic groups of
E. coli strains and obtained an accuracy of more than 99% compared to the reference methods. Phylogenetic characterization of
E. coli strains on the basis of very few phenotypic or genotypic features initially appeared to be very difficult. Such genotypic traits (presence or absence of a gene, for example) must meet several criteria for use in phylogenetic characterization. First, the gene must have been acquired or deleted when the group that it characterizes emerged. Second, the same gene must have been “stabilized,” thereby ruling out its subsequent deletion or horizontal transfer among bacteria belonging to other phylogenetic groups. Finally, recombination events in the candidate gene must be very rare. In other words, the gene product must not be targeted by natural selection, which favors new genetic recombinations (
24). Previous attempts to identify specific phylogenetic group characteristics based on phenotypic (
21) or genotypic features (
11) were not sufficiently discriminative. For the first time, we describe the use of two genes and an anonymous DNA fragment in a simple phylogenetic grouping method. Too little information is available on
yjaA and the DNA fragment to speculate on their evolutionary history. In contrast, the study by Wyckoff et al. (
25) of heme transport genes, together with our results, suggests that
chuA was acquired by sister groups B2 and D (
15) soon after their emergence rather than being present in a common ancestor and subsequently being lost by groups B1 and A.
However, two strains (ECOR 70 and an ECNM strain) belonging to phylogenetic group B1 were classified in group A by our method. This discrepancy may be explained by an intermediate genetic background between these two groups in these strains and by the fact that the region studied with our method (
chuA,
yjaA, and TSPE4.C2 are located at 78.7 min [
25], 90.8 min [
5], and approximately 87 min [
6], respectively, in relation to the genome of
E. coli K-12) may be more closely related to group A regions than to the regions studied with the reference methods. Indeed, it has been demonstrated that groups A and B1 are sister groups (
15). Moreover, recent multiple chromosomal nucleotide sequence analysis has shown that ECOR 70 may be considered a “hybrid” strain, in which some housekeeping genes exhibit nucleotide sequences shared by group A ECOR strains and some other genes exhibit nucleotide sequences shared by group B1 ECOR strains (
15; E. Denamur, personal communication). Thus, the phylogenetic group of ECOR 70 remains to be settled. Besides being rapid, our PCR-based method requires no reference collection, so the assay can easily be used in any laboratory. Furthermore, in contrast to other methods, group allocation is unequivocal. Indeed, the four unclassified strains in the ECOR collection, ECOR 31, ECOR 37, ECOR 42, and ECOR 43, were classified by our method; the first three strains belong to group D, and the fourth belongs to group A. It is noteworthy that all the sequences of the housekeeping genes studied in the latter strain were characteristic of those found in group A strains (Denamur, personal communication).
In conclusion, our simple and rapid phylogenetic grouping technique could have several practical applications. The first is in bioclinical practice, given the established link between phylogenetic group and virulence (
4,
7,
12,
19). The second is as a biotechnological screening tool for eliminating potentially pathogenic strains when candidate strains for cloning are screened. Such screening tools have been developed, for example, to identify
E. coli K-12 strains by PCR (
13) or to detect
E. coli strains with no known virulence genes by a reverse dot blot procedure described by Kuhnert et al. (
14). Our method has the advantage of being able to identify nonpathogenic strains other than
E. coli K-12 and is suitable for large-scale strain screening. Thus, after all strains belonging to groups B2 and D, which are potentially pathogenic, are eliminated, the reverse dot blot technique (
14) could be applied to group A or B1 strains to eliminate rare strains harboring virulence factors.