INTRODUCTION
In recent years, the application of molecular tools such as IS6110-based restriction fragment length polymorphism (RFLP), spoligotyping, and mycobacterial interspersed repetitive-unit (MIRU)–variable-number tandem-repeat (VNTR) analysis has revealed that infection by Mycobacterium tuberculosis could be more complex than initially assumed. Several clonally complex phenomena have been observed, including mixed infections with more than one strain (29, 30, 35), simultaneous presence of clonal variants (microevolution phenomena) (1, 2), or even different distributions of strains/clonal variants infecting independent anatomical sites in the same patient (6, 15).
Most reports of clonal complexity describe anecdotal cases, and studies of the phenomenon are often restricted to contexts where overexposure to tuberculosis (TB) can be expected. This makes it difficult to estimate how frequently this kind of infection occurs outside high-incidence settings.
The main objective of our study was to systematically survey clonally complex infections (coinfections, microevolutions, and compartmentalizations) in unselected population-based samples so as to be able to calculate their proportion. We selected epidemiological contexts in which high infection pressure is not expected to facilitate clonally complex infections, namely, in settings with a moderate incidence of TB (Madrid and Almería, Spain; average incidence rates, 20.5 cases/100,000 inhabitants and 26.6 cases/100,000 inhabitants, respectively). Our secondary objectives were to perform a detailed genetic characterization of the strains/clonal variants involved in the clonally complex infections and to analyze the clinical and epidemiological backgrounds of the corresponding cases.
MATERIALS AND METHODS
Study samples.
Cases with microbiologically diagnosed respiratory TB were investigated for clonally complex infections (infections with two or more different strains/clonal variants infecting the same patient simultaneously). The population-based sample analyzed corresponded to all cases in which M. tuberculosis was isolated (initial diagnostic samples) from respiratory specimens between January 2003 and December 2009 at all hospitals in the province of Almería (Complejo Hospitalario Torrecárdenas, Hospital de Poniente, and Hospital La Inmaculada).
Cases in which M. tuberculosis was isolated from respiratory and extrapulmonary sites in the same episode (initial diagnostic samples) were investigated for compartmentalization (a clonally complex infection in which different strains/clonal variants are isolated from independent infected sites). The population-based sample corresponded to all cases fulfilling this requirement within the Almería sample mentioned above and cases from Hospital Gregorio Marañón (Madrid, area 1) during the same study period.
Microbiological methods.
Clinical specimens were processed by standard methods and inoculated on Lowenstein-Jensen slants and in mycobacterial growth indicator tube (MGIT) liquid medium (Becton Dickinson, Sparks, MD). The M. tuberculosis cultures were stored frozen at −70°C until analysis. Susceptibility testing for isoniazid, rifampin, streptomycin, and ethambutol was performed for the Almería isolates using a BacT/Alert 3D system (bioMérieux España SA, Madrid, Spain) and for the Madrid isolates using the mycobacterial growth indicator SIRE system (Becton Dickinson, Sparks, MD) according to standard methods.
Clonal complexity analysis.
Genotypic analysis to identify cases with clonally complex infections was based on MIRU-VNTR and IS6110-based RFLP (IS6110-RFLP) fingerprinting. Two isolates from the same patient were considered clonal variants when they shared highly similar genotypes (we allowed differences in one MIRU locus or in two loci if they shared identical or highly similar RFLP types [differences in one band]) and unrelated strains when their MIRU types differed in three or more loci and their RFLP types were markedly different (differences in more than six bands).
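As an illustration only, the pairwise criteria above can be written as a short Python sketch. The function name, the label strings, and the "unresolved" and "indistinguishable" categories are assumptions introduced here; the sketch also assumes that the RFLP condition applies only to pairs differing in two MIRU loci, which is one possible reading of the rule.

```python
def classify_pair(miru_loci_diff: int, rflp_band_diff: int) -> str:
    """Classify two isolates from one patient (hypothetical helper).

    miru_loci_diff: number of MIRU-VNTR loci with different alleles.
    rflp_band_diff: number of differing IS6110-RFLP bands.
    """
    if miru_loci_diff < 0 or rflp_band_diff < 0:
        raise ValueError("difference counts cannot be negative")
    # Assumption: identical by both methods means no complexity detected.
    if miru_loci_diff == 0 and rflp_band_diff == 0:
        return "indistinguishable"
    # One MIRU locus of difference is allowed for clonal variants.
    if miru_loci_diff == 1:
        return "clonal variants"
    # Two loci are allowed only with identical or highly similar RFLP
    # types (at most one band of difference).
    if miru_loci_diff == 2 and rflp_band_diff <= 1:
        return "clonal variants"
    # Unrelated strains: three or more loci AND more than six bands.
    if miru_loci_diff >= 3 and rflp_band_diff > 6:
        return "unrelated strains"
    # Combinations not covered by the stated rule are left undecided.
    return "unresolved"
```

For example, `classify_pair(2, 1)` falls in the clonal-variant category, whereas `classify_pair(3, 6)` is left unresolved because the RFLP difference does not exceed six bands.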
A case was considered to have a polyclonal infection (simultaneous presence of two or more clonal variants or coinfection with more than one strain) when we detected more than one amplification product for one or more MIRU-VNTR loci. In these cases, polyclonality was confirmed by repeating simplex PCR of the locus or loci involved. Additionally, single colonies were obtained after subculturing dilutions of the primary culture on 7H11 agar plates. Forty colonies were selected and inoculated in MGIT medium. After growth of the isolates, MIRU-VNTR was performed to confirm the presence of the allelic variants. After confirmation of polyclonality, independent colonies representing allelic variants were reinoculated on Lowenstein-Jensen slants, DNA was purified, and RFLP analysis was performed (33).
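The screening criterion described above (more than one amplification product at one or more loci) and the subsequent single-colony confirmation can be sketched as follows. This is a minimal illustration, not the laboratory's software; the function names, locus labels, and data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def polyclonal_loci(products: Dict[str, Set[int]]) -> List[str]:
    """Return the loci showing more than one amplification product.

    `products` maps a locus label to the set of repeat numbers observed
    for that locus in the primary culture (hypothetical layout).
    """
    return sorted(locus for locus, alleles in products.items() if len(alleles) > 1)


def is_polyclonal(products: Dict[str, Set[int]]) -> bool:
    """A case is flagged when at least one locus has multiple products."""
    return len(polyclonal_loci(products)) > 0


def colony_type_counts(colony_types: List[Tuple[int, ...]]) -> Counter:
    """Count how many single colonies carry each MIRU type.

    Each colony is represented by a tuple of repeat numbers, one per locus.
    """
    return Counter(colony_types)
```

In this sketch, a primary culture with `{"locusA": {2}, "locusB": {3, 4}}` would be flagged at `locusB`, and the colony counts would then show whether both allelic variants are recovered among the selected colonies.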
In some cases with special findings regarding the clonal complexity of the M. tuberculosis infection, we also analyzed additional isolates from independent clinical specimens from the same episode by MIRU-VNTR and RFLP (the number of isolates varied depending on availability).
Genotyping methods. (i) MIRU-VNTR typing.
Three MIRU-VNTR procedures were applied sequentially in our laboratory during the analysis period, and this determined which MIRU-VNTR version was used at each point. First, a simplex MIRU analysis with 15 loci (MIRU-15) (3, 11) was applied for the systematic survey of respiratory cases with mixed infection; it was then replaced by a simplex MIRU analysis with 24 loci (MIRU-24) (22), which was applied to analyze cases with respiratory-extrapulmonary TB. For those respiratory cases showing more than one MIRU-15 type, the extended MIRU format, MIRU-24, was also applied. Finally, multiplex MIRU-24 (31) was implemented, and the last 111 samples were analyzed with this format.
(a) Simplex PCR format. We analyzed the original 12 MIRU-VNTR loci following conditions published elsewhere (32) and modified the amplification profile of the remaining 12 loci as described by Oelemann et al. (22). Amplified products were run in an agarose gel (2% MS-8; Pronadisa, Madrid, Spain) at 45 V for 18.5 h.
(b) Multiplex PCR format. The final reaction mixture (50 μl) included 25 μl of PCR master mix (Qiagen multiplex PCR kit), 5 μl of Q solution (Qiagen multiplex PCR kit), and 0.25 μM each unlabeled and labeled oligonucleotide (3.9 μM for locus 4156). The primers used for PCR amplification were described by Supply et al. (31). Amplification profiles were as described elsewhere (22), except for the number of cycles (20 cycles). PCR products were analyzed by capillary electrophoresis (4) using an ABI Prism 3100 genetic analyzer (Applied Biosystems, NLLab Centraal B.V., Haarlem, The Netherlands).
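The final oligonucleotide concentrations given above translate into pipetting volumes through the usual dilution relation C1 × V1 = C2 × V2. The sketch below works through that arithmetic; the primer stock concentration (100 μM) is an assumption chosen only for illustration and is not stated in the text.

```python
def stock_volume_ul(final_conc_um: float, final_volume_ul: float,
                    stock_conc_um: float) -> float:
    """Volume of stock (in microliters) needed, from C1 * V1 = C2 * V2."""
    if stock_conc_um <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_um * final_volume_ul / stock_conc_um


# Values taken from the text.
REACTION_VOLUME_UL = 50.0
MASTER_MIX_UL = 25.0
Q_SOLUTION_UL = 5.0
STANDARD_PRIMER_UM = 0.25
LOCUS_4156_PRIMER_UM = 3.9

# Assumption: hypothetical primer stock concentration, not from the text.
HYPOTHETICAL_STOCK_UM = 100.0

standard_primer_ul = stock_volume_ul(
    STANDARD_PRIMER_UM, REACTION_VOLUME_UL, HYPOTHETICAL_STOCK_UM)
locus_4156_primer_ul = stock_volume_ul(
    LOCUS_4156_PRIMER_UM, REACTION_VOLUME_UL, HYPOTHETICAL_STOCK_UM)

# Volume left for primers, template DNA, and water once the master mix
# and Q solution have been added.
remaining_ul = REACTION_VOLUME_UL - MASTER_MIX_UL - Q_SOLUTION_UL
```

With a different stock concentration, only `HYPOTHETICAL_STOCK_UM` needs to change; the final concentrations and the 50-μl reaction volume come from the protocol as described.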
(ii) IS6110-based RFLP typing.
All the isolates from each of the patients showing differences in MIRU types were also analyzed by IS6110-RFLP following international standardization guidelines (33). RFLP types were used to establish identities/differences only when they had more than six IS6110 copies.
(iii) LM-PCR.
The coinfecting clonal variants showing subtle differences between their RFLP types were analyzed by ligation-mediated PCR (LM-PCR) to map the IS6110 locations.
The protocol used was based on that of Prod'hom et al. (23) with modifications. Briefly, the DNA was digested with the restriction enzyme XmaI (New England BioLabs, MA) and ligated with adapter primers Rxma24/rxma12 (22a; Rxma24 is 5′ AGC ACT CTC CAG CCT CTC AAC GAC 3′, and rxma12 is 5′ CCG GGT CGT TGA 3′) by incubation overnight with T4 DNA ligase (New England BioLabs, MA) at 16°C. The products were amplified by PCR with primers ISA1 and ISA3 (21) and the linker primer Rxma24 using AmpliTaq Gold (Applied Biosystems). The PCR consisted of 35 cycles at 95°C for 45 s, 65°C for 45 s, and 72°C for 8 min. The amplified products were separated by electrophoresis in a 1.8% agarose gel and purified using a GFX PCR DNA and Gel Band purification kit (GE Healthcare; Buckinghamshire, United Kingdom). The purified fragments were sequenced with the ISA1 and ISA3 primers in a 3130xl genetic analyzer (Applied Biosystems, Carlsbad, CA). The IS6110 insertion sites were mapped by searching for homology between the LM-PCR product sequences and the H37Rv reference genome sequence in the TB Database (25).
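The two adapter oligonucleotides quoted above can be checked for internal consistency with a short sketch: the 3′ portion of rxma12 should be the reverse complement of the 3′ end of Rxma24, and its 5′ end should match the overhang left by XmaI. The 5′-CCGG overhang of XmaI (cutting C^CCGGG) is general knowledge about the enzyme rather than something stated in the text, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def adapter_is_consistent(long_oligo: str, short_oligo: str, overhang: str) -> bool:
    """Check that the short oligo is overhang + revcomp(3' end of long oligo)."""
    if not short_oligo.startswith(overhang):
        return False
    duplex_part = short_oligo[len(overhang):]
    # The part of the short oligo after the overhang must pair with the
    # last len(duplex_part) bases of the long oligo.
    return duplex_part == reverse_complement(long_oligo[-len(duplex_part):])


# Sequences as given in the text (5' to 3', spaces removed).
RXMA24 = "AGCACTCTCCAGCCTCTCAACGAC"
RXMA12 = "CCGGGTCGTTGA"

# Assumption: XmaI leaves a 5'-CCGG overhang (not stated in the text).
XMAI_OVERHANG = "CCGG"

adapter_ok = adapter_is_consistent(RXMA24, RXMA12, XMAI_OVERHANG)
```

Under these assumptions, `adapter_ok` evaluates to `True`: the last eight bases of rxma12 pair with the 3′ end of Rxma24, leaving CCGG free to anneal to XmaI-digested genomic DNA.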
Population-based molecular epidemiology databases.
We used the genotypic databases from two population-based studies run in Madrid (area 1, 2003 to 2009) (5) and Almería (2004 to today) (20) in which all M. tuberculosis isolates are labeled as clustered or orphan.
Clinical/epidemiological data.
Clinical and epidemiological information was obtained from clinical records. For cases coinfected by two independent strains, we looked for risk factors for TB, overexposure to other TB cases, and existence of previous TB. For the cases with the simultaneous presence of clonal variants, we looked for risk factors for TB, diagnostic delay (between onset of symptoms and diagnosis), and existence of previous TB.
DISCUSSION
Clonal complexity is increasingly accepted to be a feature of M. tuberculosis infection. From an epidemiological point of view and to ensure precise tracking of recent transmission in molecular epidemiology programs, it is essential to identify those cases coinfected with more than one strain or clonal variant. Another important consideration is the possibility that strains with phenotypic differences (in virulence, infectivity, and susceptibility) can participate in clonally complex infections (28), and this could have an impact on diagnosis, clinical practice, and therapy (6, 17, 28, 29, 34).
The number of studies on clonal complexity in M. tuberculosis infection has increased in recent years (8, 10, 18). However, some of these studies examine anecdotal cases (2) and others analyze this phenomenon only in specific M. tuberculosis lineages (35) or specific phenotypes (34). Those which follow population-based designs to determine the proportion of clonally complex TB cases are performed mainly in epidemiological contexts with a markedly high incidence of TB (8, 10, 29, 30, 35) and/or where the possibility of overexposure is more likely (28).
We measured the frequency of clonally complex infections in epidemiological scenarios with a moderate incidence of TB. Another differential aspect of our study was the decision to independently analyze two versions of clonal complexity (coinfection with different strains and simultaneous presence of clonal variants) and to explore each of them in two kinds of patients, namely, those with respiratory TB and those with respiratory-extrapulmonary TB. This design allowed us to cover the spectrum of clonal complexity: coinfection with different strains, coexistence of clonal variants (likely due to microevolution phenomena), compartmentalized coinfections, and compartmentalized distribution of clonal variants. Our data demonstrate that clonal complexity in M. tuberculosis infection is not anecdotal, especially in cases with simultaneous infection at independent sites. The real figures for clonal complexity in this study could have been even higher had we included more than a single sputum specimen in our screening design (13, 18, 19, 28).
The application of MIRU-VNTR typing to investigate clonal complexity was key for exploration of these events in an extensive population-based sample (774 patients). Other standard strategies applied, such as analysis of multiple independent colonies (12, 14) or observation of low-intensity bands by RFLP (9), are not suitable for large samples. Besides, if RFLP had been selected as the screening strategy, we would not have identified most of the cases infected with clonal variants and even some of the cases infected by different strains according to MIRU data.
As for the nine cases with mixed infection by independent strains, we did not detect any cases with more than two strains. The presence of two independent strains in the same episode of a TB case could be considered the result of simultaneous coinfection, superinfection, or reactivation of an old infection coincidental with a recent infection (due either to a lack of containment of a previous infection in immunocompromised hosts or to the immune impairment associated with the new infection). It is not easy to associate a specific explanation with a particular patient. However, we tried to identify which cases had clinical and epidemiological features compatible with each of the options explained above. The hypothesis of coinfection/superinfection could be plausible for the five cases (all but one of whom were immigrants) who were either homeless or lived in shared substandard housing. As one case had already had TB, the possibility of reactivation/recent infection could also be considered. This case corresponded to an immigrant from Ghana who was coinfected with two different species, M. tuberculosis and M. caprae. Infections by M. caprae are extremely uncommon in Spain (26); therefore, the hypothesis of reactivation of M. caprae by recent M. tuberculosis infection seems plausible.
To find explanations for the cases of mixed infection by independent strains, we explored a novel use of the data obtained from the long-term molecular epidemiology surveys run in the study populations. We used these data to determine whether the strains involved in patients with mixed infections were involved in clusters (recent transmissions) or were orphan (reactivations of remote infections). An orphan strain coincided with a clustered strain in three cases, thus supporting the possibility of reactivation/recent infection. In another two cases, both strains were clustered, suggesting possible coinfection/superinfection.
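The reasoning in this paragraph can be summarized as a small lookup, sketched here for illustration only. The function name and the returned labels are hypothetical, and the combination of two orphan strains is not discussed in the text, so the sketch leaves it without an interpretation.

```python
def interpret_mixed_infection(status_a: str, status_b: str) -> str:
    """Suggest an explanation from the survey status of two coinfecting strains.

    Each status must be "clustered" (recent transmission) or "orphan"
    (reactivation of a remote infection).
    """
    allowed = {"clustered", "orphan"}
    if status_a not in allowed or status_b not in allowed:
        raise ValueError("status must be 'clustered' or 'orphan'")
    statuses = {status_a, status_b}
    # One orphan plus one clustered strain: old infection reactivated
    # together with a recently acquired one.
    if statuses == {"clustered", "orphan"}:
        return "reactivation/recent infection"
    # Both strains clustered: both likely acquired through recent transmission.
    if statuses == {"clustered"}:
        return "coinfection/superinfection"
    # Two orphan strains: not addressed in the text.
    return "no interpretation given"
```

The labels only indicate which hypothesis the cluster data support; as noted above, assigning a specific explanation to an individual patient remains difficult.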
As for cases with the simultaneous presence of clonal variants, most involved two variants, although four and five clonal variants were found in two cases. To identify these cases, it was necessary to consider MIRU and RFLP typing data together, because a MIRU type was split into two RFLP variants.
As with cases of mixed infection, we explored the clinical and epidemiological backgrounds of the 12 cases with clonal variants to evaluate three possible explanations: variants could have appeared during the active infection as a result of a diagnostic delay, variants could have appeared during the latent phase in cases who had previously had TB, and the patients were already infected by variants which had microevolved in previous hosts. In most cases, a diagnostic delay had occurred, and in one case it reached 31 months. Three cases had previously had TB; these included the cases with the highest number of variants (four and five variants; cases H and 11). Considering molecular and epidemiology data, in four cases none of the coinfecting variants was clustered and in six cases only one of the coinfecting variants was clustered, thus minimizing the probability of microevolution in hosts preceding the infection of the analyzed cases. The remaining two cases corresponded to the first year of the molecular survey; therefore, we lacked sufficient data to assign previous orphan/clustered status to the variants.
Identification of clonal variants may be considered a refined exercise with no clinical significance. However, compartmentalization is sometimes observed in cases with clonal variants. If these clonal variants are not equally distributed at the different infected sites, then these subtle genotypic differences could entail an adaptive advantage. In cases coinfected with two independent strains and with a compartmentalized infection, the strain infecting the extrapulmonary site has higher infectivity in in vitro and in vivo models (15). Similarly, a genotypic change in a clonal variant could have made it able to more efficiently infect the extrapulmonary site, as in case F, in which only one of the two variants from sputum was detected in blood. The alternative explanation could be that the genotypic modification leading to the emergence of a clonal variant is the consequence of an adaptation of the strain to the specific circumstances of the extrapulmonary site. Examples of this hypothesis could be seen in cases I and J, in which one variant not detected in sputum was found at an extrapulmonary site. The marked clonal complexity in case H, with the simultaneous presence of four clonal variants and a different clonal distribution in each of the three sites involved, could result from the combination of the two possibilities mentioned above.
We mapped the IS6110 sequences of the clonal variants which showed subtle differences between their RFLP patterns to evaluate whether potential functional differences could be considered. In one case, two variants differed in an IS6110 sequence that was either present in or absent from Rv1758. An IS6110 copy in the same sequence has also been found in strain H37Rv, and it has been reported that it might have a dual impact: on the expression of both cutinase, an enzyme involved in lipid metabolism (27), and phospholipase plcD (16), an enzyme considered a virulence factor in certain studies (24). In the second case with clonal variants that was analyzed, we detected an IS6110 copy either present or absent in a region coding for the protein PPE34, which has recently been found to facilitate a shift toward the Th2 immune response, aiding immune evasion by mycobacteria (7).
Our study shows that clonally complex M. tuberculosis infections must also be considered in populations not subject to high infective pressure. The systematic survey of clonally complex infections in a population-based sample allowed us to calculate that the proportion of cases in which they occur is higher than expected. The application of an extended scheme for genotypic characterization led to the detailed description of different modalities of clonal complexity in cases with either respiratory TB only or respiratory-extrapulmonary TB. The combination of genotyping data with clinical, epidemiological, and molecular data could help us to identify potential causes for these kinds of infections and provide more information on their clinicopathologic significance.