Introduction

Huntington disease (HD) is a late onset autosomal dominant neurodegenerative disorder characterised by motor impairment, cognitive decline and psychiatric disturbances associated with selective neuronal death in the striatum and cortex.1 The age of onset of disease symptoms varies considerably between patients and death usually occurs 15–20 years after clinical onset of the disorder.2

The prevalence of HD is approximately five in 100 000 worldwide,2 but in South Africa the prevalence of HD has been reported as approximately two per 100 000 in both the Mixed Ancestry and Caucasian population subgroups.3, 4 No prevalence data has been compiled since then, and these figures may underestimate the true prevalence today.

Groups from several countries, including India, Japan, Scotland, Spain, Greece and Sweden, have performed haplotype analysis around the HD mutation.5, 6, 7, 8, 9, 10 The group from Sweden identified multiple haplotypes within their population; however, 89% of that cohort carried a common core haplotype consisting of markers within the HD gene (one of which was the CCG repeat located immediately 3′ to the CAG repeat).6 The same (CCG)7 allele has been found to be associated with most of the HD mutations studied to date; that is, Scotland, Greece, India and Spain.5, 7, 8, 9 The only distinguishing population appears to be Japan, where the majority of the expanded (CAG) repeats were associated with a (CCG)10 allele.10

In 1980, a South African genealogical study proposed that the HD mutation had been brought to the country approximately 300 years ago, with the arrival of a gentleman of Dutch origin (WS van der Merwe).11 As a result of this comprehensive epidemiological and genealogical study, it was suggested that the progeny of Mr van der Merwe ‘spread’ the HD gene throughout the country. However, an important aspect of this hypothesis is that this mutation was responsible for most of the Afrikaans-speaking HD families in South Africa. Genealogically, within the Caucasian South Africans (of European origin) there are multiple genetic influences, the majority of which come from north-western Europe. In addition, there are several individuals with HD who descend from French immigrants who came to South Africa via Mauritius.12 Therefore, HD in the Caucasian population within South Africa originates from several European countries.

HD has been well documented in a second population group, specific to South Africa. The traditional Mixed Ancestry group in the Western Cape has admixture from the Khoi (Hottentot) and San (Bushman) population, with English, French Huguenot and Dutch input; a significant Indonesian (Malay) contribution; and, to a lesser extent, genetic contributions from indigenous Africans from Mozambique and Angola and persons from India and Ceylon.3

The aim of this study was to ascertain the extent of the previously proposed founder effect using a molecular genetic approach and to determine the origin of the mutation within these two South African population groups.

Materials and methods

Family triads selected for haplotyping

Initially, individuals representing 13 Mixed Ancestry and Caucasian families were selected for this study from the Human Genetics laboratory database. To the best of our knowledge, none of the families are related to the degree of first cousins. Aliquots of DNA samples were taken from the DNA bank and coded with a unique study number to protect patient confidentiality. All families of Mixed Ancestry were numbered MA1…n; Afrikaans-speaking Caucasian, AC1…n; and English-speaking Caucasian, EC1…n. Mutation status was confirmed by performing the molecular genetic test as described previously.13 DNA was isolated using a standard protocol.14 The study was approved by the UCT Human Ethics committee (reference 006/2004) and informed consent was obtained from all individuals.

Haplotype analysis

Six single-nucleotide polymorphisms (SNPs) were selected for haplotype construction on the basis of their informativeness in public databases with available genotyping data (Figure 1). At the time, only one informative SNP was found within the HD gene itself. In addition, HapMap showed that only SNP6 lies within the same haplotype block as the mutation. Controls were genotyped to ensure that all the SNPs were in Hardy–Weinberg equilibrium (Table 1). In addition, a single microsatellite marker, I1CAHD, located in intron 1 of the HD gene, was genotyped in all HD patients. Primer pairs were designed for each of the polymorphisms selected for the haplotype construction (Table 2).
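As an illustration of the Hardy–Weinberg check mentioned above, the following is a minimal sketch of a one-degree-of-freedom chi-square goodness-of-fit test for a biallelic SNP. It is not the procedure used in the study, which is not described in detail; the function name and the genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) of Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts for a biallelic SNP.
    Returns (chi-square statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 d.f.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(25, 50, 25)
print(chi2, p_value)
```

A large P-value gives no evidence of departure from equilibrium. With small control groups an exact test would usually be preferred over the chi-square approximation.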

Figure 1

Map representing locations of SNPs with respect to HD and other genes in the region.

Table 1 Minor allele frequencies and heterozygosity information for SNPs 1–6 in the control group
Table 2 Primer sequences and genotyping details for the construction of haplotypes

PCR was optimized for each amplicon and performed for SNPs 2, 3, 4, 5 and 6, as well as I1CAHD, in a total volume of 25 μl containing 200 ng of DNA; 10 pmol of each primer; 0.2 mM of each of the four dNTPs; 1 × PCR buffer; and 0.5 units of Taq polymerase. For SNP1, the PCR contained an additional 0.2 mM MgCl2, as well as 1% glycerol. Standard cycling conditions were used.

Restriction endonuclease digests were carried out according to the manufacturer's instructions. Restricted fragments were analyzed using 8% polyacrylamide gel electrophoresis (PAGE). SNP3 and SNP6 were genotyped using the two buffer PAGE system-based single-stranded polymorphism analysis as described previously.15 Analysis of SNP4 was performed using denaturing high-performance liquid chromatography, based on the WAVE technology procedure.16 Heteroduplexed PCR products were analyzed on the WAVE system (Transgenomic) using a melting temperature of 55.2°C on the basis of a previously described procedure.17 The PCR products for marker I1CAHD were genotyped by capillary-based electrophoresis on the ABI 3100 Genetic Analyzer (Applied Biosystems), followed by analysis using GeneScan version 3.7 software (Applied Biosystems).

Extended study using SNPHAP and FASTEHPLUS analyses

Additional samples from unrelated affected individuals were selected from the DNA bank and genotyped as above. Anonymous, unrelated control samples from each of the ethnic subgroups represented in the study cohort were genotyped to exclude the possibility of the common haplotype being present at a high frequency in the background population (Table 3). A haplotype assignment program, SNPHAP by Clayton (2002), was used to investigate the extent of the founder haplotype within the different ethnic subgroups and to infer haplotypes in the case of phase-unknown data. SNPHAP analysis was performed on the Mixed Ancestry and Caucasian (both Afrikaans and English) groups. In addition, the Caucasian group was stratified into Afrikaans-speaking Caucasians and English-speaking Caucasians according to language, which roughly indicates the country of origin. For each group, an input file was constructed containing genotyping data from one HD individual from each of the triad families, in addition to the unrelated HD individuals selected for the extended study. Furthermore, input files were created for each of the population groups mentioned above using unrelated controls. Therefore, genotyping data was divided into ethnic subgroups of individuals with disease and control individuals.
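Haplotype inference from phase-unknown genotypes, as performed here with SNPHAP, is commonly based on an expectation-maximisation (EM) procedure. The sketch below is a toy EM estimator for biallelic SNPs and is not SNPHAP's implementation. The genotype coding (0, 1 or 2 copies of an arbitrarily chosen allele per SNP), the function names and the example data are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites; het sites set below
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = base[:], base[:]
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    options = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in options for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in options:
            # E-step: posterior weight of each compatible pair
            weights = [freq[a] * freq[b] * (1 if a == b else 2)
                       for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: renormalise expected haplotype counts
        n_chrom = 2 * len(genotypes)
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq


# Hypothetical two-SNP genotypes; the third individual is a double
# heterozygote whose phase is ambiguous.
estimates = em_haplotype_frequencies([(0, 0), (2, 2), (1, 1)])
for hap, f in estimates.items():
    print(hap, round(f, 4))
```

In this toy data set the ambiguous individual is resolved in favour of the haplotypes already seen in the two homozygous individuals, which is the behaviour relied upon when assigning a most probable haplotype to phase-unknown samples.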

Table 3 Number of individuals represented in the extended haplotype study

The program FASTEHPLUS18 was adapted to find significant differences between the genotypes in the group of affected individuals and the group of controls in the Mixed Ancestry population. The genotype data were the same as those used for the SNPHAP program.
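The idea of a case-control comparison can be illustrated with a much simpler stand-in than FASTEHPLUS: a two-sided Fisher exact test on a 2 × 2 table of chromosomes that do or do not carry a given haplotype. This is a different and cruder technique than the one used in the study, and the counts below are hypothetical, not study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Rows: cases and controls. Columns: haplotype present and absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers among the cases
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical chromosome counts, for illustration only.
print(fisher_exact_two_sided(7, 3, 1, 19))
```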

Results

Analysis of 13 HD triad family pedigrees as a whole revealed the presence of multiple haplotypes (Table 4). Within the families of the Mixed Ancestry group, a founder haplotype (haplotype 1) was observed to be tracking in four of the six family triads – MA1, MA2, MA3 and MA4. Haplotype 1 was also observed in two of the Afrikaans-speaking Caucasian families – AC3 and AC4. SNP haplotype 2 was identified in MA6 and AC1.

Table 4 Mixed Ancestry and Caucasian family haplotypes found to be associated with HD
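The way in which a triad allows haplotypes to be phased can be sketched as follows. At each marker, the child's two alleles are assigned to the paternal and maternal chromosome whenever the parental genotypes allow only one assignment. This is a minimal illustration under assumed data structures (one pair of alleles per marker per person) and not the procedure used to construct Table 4; the genotypes shown are hypothetical.

```python
def phase_child(child, father, mother):
    """Phase a child's genotypes using both parents.

    Each argument is a list of (allele, allele) tuples, one per marker.
    Returns (paternal_haplotype, maternal_haplotype). A site is None when
    the phase is ambiguous or the genotypes are Mendelian-inconsistent.
    """
    paternal, maternal = [], []
    for (x, y), f, m in zip(child, father, mother):
        # Keep every ordering in which the first allele could come from
        # the father and the second from the mother.
        options = {(p, q) for p, q in ((x, y), (y, x)) if p in f and q in m}
        if len(options) == 1:
            p, q = options.pop()
            paternal.append(p)
            maternal.append(q)
        else:
            paternal.append(None)
            maternal.append(None)
    return paternal, maternal


# Hypothetical three-marker triad; the last marker is heterozygous in all
# three individuals and therefore cannot be phased.
child = [("A", "G"), ("C", "C"), ("T", "C")]
father = [("A", "A"), ("C", "T"), ("T", "C")]
mother = [("G", "G"), ("C", "C"), ("T", "C")]
print(phase_child(child, father, mother))
```

Once the child's chromosomes are phased, the haplotype inherited from the affected parent can be identified as the one travelling with the disease.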

Using the program SNPHAP, the proposed founder haplotype, haplotype 1, was shown to be the most probable haplotype in the extended HD group for the Mixed Ancestry population (Table 5). SNPHAP considered haplotype 1 in the control population but at a probability of less than 1 × 10⁻⁶, which further validates this as a haplotype associated with the disease and not simply a haplotype found at a high frequency in the control population (Table 5). SNPHAP indicated that for three of the additional families (subjects MA8, MA9 and MA10), haplotype 1 was the most likely assignment (Table 5). The same analysis was performed on an extended group of the Caucasian population; however, no common haplotypes were identified, even upon separation into different language groups (data not shown). The program FASTEHPLUS was used to make a direct comparison of the disease and control groups. The Mixed Ancestry population data was incorporated into an input file for the FASTEHPLUS program and a P-value of 0.0317 was calculated.

Table 5 SNPHAP data for the cohort of HD and controls from the Mixed Ancestry population groupa,b

The allele size range of the intronic microsatellite marker was shown to be from (CA)135 to (CA)173, which correlates well with the study by Moutou et al.19 The most common allele associated with the expanded HD chromosome is (CA)157 and is one of the most frequent alleles in both data sets. With regard to the triad studies, the (CA)157 allele was observed in families MA1, MA2, MA3 and MA4, as well as AC3 and AC4 – the same six families that share SNP haplotype 1.

Discussion

SNP genotyping analysis using the initial triad families showed multiple haplotypes within the Caucasian HD population group, but strongly suggests a common haplotype in the Mixed Ancestry group. The results indicate that there are at least six, but no more than seven, haplotypes across the 13 HD families that formed the study cohort (six Mixed Ancestry and seven Caucasian families).

Of the six Mixed Ancestry families, four (MA1–MA4) have been shown to share a common founder SNP haplotype referred to as haplotype 1. Interestingly, this is shared by two of the Afrikaans-speaking Caucasian families, that is, AC3 and AC4. The molecular finding that the HD gene was transmitted into the Mixed Ancestry population from the Afrikaans-speaking Caucasian group is supported by the knowledge of the general genetic contribution of the Dutch emigrants to this South African population group.4 The findings of this current study, using SNP markers, in conjunction with previous reports, thus confirm that the founder haplotype was indeed brought over from the Netherlands, and is seen in a significant fraction of the HD families in South Africa today. In addition, family MA6 has haplotype 2, which is shared by a Caucasian HD family (AC1), further reinforcing the extent of the genetic admixture of this population group. Although it is acknowledged that SNPs chosen within the same haplotype block would be ideal, retrospective analysis of the in-phase triad data shows that the entire length of the haplotype travelled with the disease in all the families analyzed. The inclusion of SNPs 1–5, which lie outside the haplotype block incorporating the mutation, thus indicates the larger extent of the haplotype shared between families, reinforcing the fact that they share a mutation from a relatively recent, rather than ancient, ancestor.

A highly polymorphic marker (I1CAHD) located close to the HD CAG repeat was subsequently added to the haplotype profile to ascertain whether a microsatellite marker could reaffirm the SNP haplotypes. In the published literature, this marker is shown to have a heterozygosity of 85% and is the closest highly polymorphic marker to the HD gene, with the exception of the CCG repeat. Molecular analysis of this marker reaffirmed the initial findings of a founder effect in the triad studies using the SNP haplotypes, in that all the (CA)157 alleles taken from unrelated individuals in Table 4 were linked with disease-associated chromosomes. This marker is located closest to the mutation and might represent an older allele, either carrying the expansion or predisposed to expansions, whereas the SNPs further away might be subject to recombination, thus resulting in multiple haplotypes.
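The heterozygosity quoted for a marker of this kind is conventionally the expected heterozygosity, that is, one minus the sum of the squared allele frequencies. A minimal sketch of that calculation follows; the allele labels are hypothetical and the published value of 85% is not recomputed here.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Expected heterozygosity, 1 - sum(p_i ** 2), from observed alleles."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical allele observations on four chromosomes.
print(expected_heterozygosity(["157", "157", "159", "161"]))  # 0.625
```

The more alleles a marker has, and the more evenly they are distributed, the closer this value comes to 1, which is why a microsatellite is more informative than any single biallelic SNP.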

Bioinformatics programs showed that haplotype 1 is highly unlikely to occur in the background population. The low frequency of this haplotype on normal chromosomes reinforces that haplotype 1 is indeed only common in the disease population. Furthermore, analyses revealed that haplotype 1 is likely to be present in three out of an additional six Mixed Ancestry HD families, suggesting that this haplotype is common in this population group. These results were reinforced by a second program that showed that the difference in haplotypes existing between the cases and controls in the Mixed Ancestry population is significant. It is important to note that the smaller than usual sample sizes reduce the power of this study. However, the use of two computational analyses helps to mitigate this limitation.

In conclusion, using SNP haplotype analysis and two computer software programs, we have shown that the well-documented proposed South African HD founder effect does exist. A common haplotype is present in the Mixed Ancestry population with HD, which is clearly of the same origin as the original mutation brought over by the Dutch settler who was proposed to be responsible for HD in the majority of Afrikaans-speaking Caucasian HD families.11 Although this is not surprising, owing to the well-acknowledged admixture between the two groups, these findings serve to reinforce the extent of admixture between the South African populations and provide conclusive molecular evidence of the previously suggested founder effect of HD in South Africa.