INTRODUCTION
The Andean highlands of South America have long been considered a natural laboratory for the study of human genetic adaptation (1), yet the genetics of Andean highland populations remain poorly understood. People likely entered the highlands shortly after their arrival on the continent (2, 3), and while some have argued that humans lived permanently in the Central Andean highlands by 12,000 years before the present (BP) (4), other research indicates that permanent occupation began between 9500 and 9000 BP (5, 6). Regardless of when the peopling of high-elevation environments began, selective pressure on the human genome was likely strong, not only because of challenging environmental factors but also because social processes such as intensification of subsistence resources and residential sedentism (7) promoted the development of agricultural economies, social inequality, and relatively high population densities across much of the highlands. European contact initiated an array of economic, social, and pathogenic changes (8). Although it is known that the peoples of the Andean highlands experienced population contraction after contact (8), its extent is debated from both archeological and ethnohistorical perspectives, especially concerning the size of the indigenous populations at initial contact (9).
To address hypotheses regarding the population history of Andean highlanders and their genetic adaptations, we collected a time series of ancient whole genomes from individuals in the Lake Titicaca region of Peru (fig. S1 and Table 1). The series represents three different cultural periods, which include individuals from (i) Rio Uncallane, a series of cave crevice tombs dating to ~1800 BP and used by fully sedentary agriculturists, (ii) Kaillachuro, a ~3800-year-old site marked by the transition from mobile foraging to agropastoralism and residential sedentism, and (iii) Soro Mik’aya Patjxa (SMP), an 8000- to 6500-year-old site inhabited by residentially mobile hunter-gatherers (10). We then compare the genomes of these ancient individuals to 25 new genomes and 39 new genome-wide single-nucleotide polymorphism (SNP) datasets generated from two modern indigenous populations: the Aymara of highland Bolivia and the Huilliche-Pehuenche of coastal lowland Chile. The Aymara are an agropastoral people who have occupied the Titicaca basin for at least 2000 years (11). The Huilliche-Pehuenche are traditionally hunter-gatherers from the southern coastal forest of Chile (12).
To explore the population history of the Andean highlands, we first assess the genetic affinities of the prehistoric individuals and compare them to modern South Americans and other ancient Native Americans. Second, we construct a demographic model that estimates the timing of the lowland-highland population split as well as the population collapse following European contact. Last, we explore evidence for genetic changes driven by selective pressures associated with the permanent occupation of the highlands, the intensification of tuber usage, and the impact of European-borne diseases.
DISCUSSION
South America is thought to have been populated relatively soon after the first human entry into the Americas, some 15,000 years ago (3). However, steep environmental gradients in western South America would have posed substantial challenges to population expansion. Among the harshest of these environments are the Andean highlands, which boast frigid temperatures, low partial pressure of oxygen, and intense ultraviolet radiation. Despite this, humans eventually spread throughout the Andes and occupied them permanently. Archeological evidence suggests that hunter-gatherers entered the highlands as early as 12,000 years BP (4), with permanent occupation beginning around 9000 years BP (5–7). The evidence presented here indicates genetic affinity between populations from different time periods in the high-elevation Lake Titicaca region from at least 3800 years BP, and possibly 7000 years BP. This affinity extends to the present high-altitude Andean communities of the Aymara and Quechua.
Although our samples do not extend beyond 7000 years BP, we were able to model the initial entry into the Andes after the split between North and South American groups. Our model, using a mutation rate of 1.25 × 10⁻⁸, places the split between North and South American groups at approximately 14,750 years ago (95% CI, 14,225 to 15,775), which is consistent with archeological evidence from Monte Verde in southern Chile, the oldest known site in South America (~14,000 years BP) (3). The split between low- and high-altitude populations was inferred at 8750 years ago (95% CI, 8200 to 9250), which is younger than previously reported by a study using modern genomes alone (40). This date provides a terminus ante quem time frame for the origins of adaptations known in modern highland populations.
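Because the inferred dates depend on the assumed mutation rate, it can help to see how a date would shift under a different rate. The following is a minimal sketch, not the model used in this study: it assumes the simplest case, in which the mutation rate is the only calibration and dates scale inversely with it. In the model described here, the known ages of the ancient samples add further calibration, so the real dependence is likely weaker. The function name `rescale_date` and the alternative rate are hypothetical.

```python
def rescale_date(years, mu_assumed, mu_new):
    """Rescale a date estimate to a different assumed mutation rate.

    Assumption (illustrative only): the date is calibrated purely by the
    mutation rate, so it scales as 1 / mu.
    """
    if mu_assumed <= 0 or mu_new <= 0:
        raise ValueError("mutation rates must be positive")
    return years * mu_assumed / mu_new


if __name__ == "__main__":
    MU_TEXT = 1.25e-8          # rate quoted in the text
    MU_ALTERNATIVE = 1.0e-8    # hypothetical slower rate, for illustration
    for label, years in (("North-South split", 14750),
                         ("lowland-highland split", 8750)):
        shifted = rescale_date(years, MU_TEXT, MU_ALTERNATIVE)
        print(f"{label}: {years} -> {shifted:.0f} years under the alternative rate")
```

A slower assumed rate pushes both dates further back by the same factor, so the relative ordering of the two splits is unaffected.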
We also present evidence for genes that may have been under selective pressure caused by environmental stressors in the Andes. None of our most extreme signals for positive selection were related to the hypoxia pathway. Instead, we find differentiated SNPs in the DST gene, which has been linked to the proper formation of cardiac muscle in mice (35, 41). Furthermore, the DST intronic SNP that was most differentiated (rs149112613) shows histone modifications associated with blood and the right ventricle of the heart (38). This is consistent with the tendency of Andean highlanders to have enlarged right ventricles associated with moderate pulmonary hypertension (1). This finding also parallels hypotheses proposed by Crawford et al. (42) that Andeans may have adapted to high-altitude hypoxia via cardiovascular modifications.
The most extreme signal may represent adaptations to an agricultural subsistence and diet. The top-ranked gene, MGAM, is associated with starch digestion (43). The associated high-frequency SNPs in the ancient Andean population (table S4) exhibit chromatin marks in cells from the gastrointestinal tract (Fig. 5A). The variant may be highly differentiated between the ancient Andeans and the lowlanders (the Huilliche-Pehuenche) because of differences in subsistence strategies. The Huilliche-Pehuenche are traditionally hunter-gatherers, with archeological evidence suggesting that their ancestors had practiced this mode of subsistence in the region for thousands of years before European contact in the 1500s (44). In contrast, the Andes is one of the oldest New World centers for agriculture, which included starch-rich plants such as maize (~4000 years BP) (45) and the potato (~3400 years BP) (7). Selection acting on the MGAM gene in the ancient Andeans may represent an adaptive response to greater reliance upon starchy domesticates. Recent archeological findings based on dental wear patterns and microbotanical remains similarly suggest that intensive tuber processing, and thus selective pressure for enhanced starch digestion, began at least 7000 years ago (7, 32). Furthermore, we see a similar signal (top 0.01%) when we contrast the hunter-gatherers from Brazil [Karitiana/Surui, sequence data (46)] with the ancient Andeans, as well as when we contrast the Aymara with the Huilliche-Pehuenche and the Karitiana/Surui. Notably, we did not detect high amylase copy number in the ancient Andean population before European contact, suggesting a different evolutionary path for starch digestion in the Andes when contrasted with Europeans (47).
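A statement such as "top 0.01%" refers to where a gene's score falls in the genome-wide distribution of scores. The sketch below shows one simple way to compute such an empirical rank. It is an illustration under stated assumptions, not the ranking procedure used in this study; the function name `empirical_top_fraction` and the toy scores are hypothetical.

```python
def empirical_top_fraction(score, all_scores):
    """Return the fraction of genome-wide scores that are >= `score`.

    A value of 0.0001 corresponds to the top 0.01% of the distribution.
    """
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    n_at_least = sum(1 for s in all_scores if s >= score)
    return n_at_least / len(all_scores)


if __name__ == "__main__":
    # Toy distribution: 10,000 hypothetical per-gene scores.
    toy_scores = [i / 10000 for i in range(1, 10001)]
    top = max(toy_scores)
    fraction = empirical_top_fraction(top, toy_scores)
    print(f"top fraction: {fraction:.4%}")
```

An empirical rank of this kind identifies outliers relative to the rest of the genome. It is not a formal significance test, because the genome-wide distribution is itself shaped by demography.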
Selection with respect to the environment in the Andes is not limited to the ancient past. In 1532, the environment radically changed with the arrival of the Spanish (29). Not only were long-standing states and social organizations disrupted, but the environment itself was also altered by European-introduced pathogens, which may have preceded the arrival of the Spanish via trade routes (29). Some of the most devastating epidemics were related to smallpox, occurring in the 1500s and 1600s (8). These combined factors are thought to have decimated the local populations (8). We inferred the population decline in the Andes, using the ancient and modern Andeans, and found the decline in effective population size to be 27% (95% CI, 23 to 34%). We also simulated DNA sequence data immediately before the collapse between the Rio Uncallane and the Aymara using a truncated model and found a reduction in average heterozygosity of 23% (see the Supplementary Materials). This is a modest decline compared to archeological and historical estimates, which reached upward of 90% of the total population (48). In contrast, the model infers a much more severe collapse for lowland Andean populations of the Huilliche-Pehuenche, exceeding 90%. We also explored alternative scenarios to make sure that the model was not biasing the inferred collapse and found that the estimated population size reduction remained significantly less severe in the highland compared to the lowland populations (see the Supplementary Materials). Although we did not have precontact ancient samples for these populations in Chile to inform the model, the large difference suggests that high-altitude populations may have suffered a less intense decline compared to the more easily accessible populations near sea level in the Andes region. This is also supported by the warfare with the Spanish in the Chilean lowland region, which lasted well into the 19th century (29).
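The link between a reduced effective population size and lost heterozygosity can be illustrated with the standard drift expectation, under which heterozygosity decays by a factor of 1 − 1/(2Nₑ) per generation. The sketch below is not the simulation described in the Supplementary Materials. It assumes a constant effective size, no mutation, and no gene flow, and all parameter values are hypothetical.

```python
def heterozygosity_retained(ne, generations):
    """Expected fraction of heterozygosity retained after drift.

    Uses H_t / H_0 = (1 - 1 / (2 * Ne)) ** t, assuming a constant
    effective size Ne, no mutation, and no gene flow.
    """
    if ne <= 0:
        raise ValueError("ne must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - 1.0 / (2.0 * ne)) ** generations


def heterozygosity_reduction(ne, generations):
    """Expected fractional loss of heterozygosity, 1 - H_t / H_0."""
    return 1.0 - heterozygosity_retained(ne, generations)


if __name__ == "__main__":
    GENERATIONS = 20  # hypothetical number of generations since the decline
    for ne in (50, 500, 5000):  # hypothetical effective sizes
        loss = heterozygosity_reduction(ne, GENERATIONS)
        print(f"Ne={ne}: expected heterozygosity loss {loss:.2%}")
```

Under this expectation, a large loss of heterozygosity over a short period requires a very small effective size, which is one reason a census collapse and a decline in effective size need not be of the same magnitude.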
Our data also show that the populations in the modern Andes have high genetic affinity with the ancient populations preceding European contact. Although a strict continuity test (49) that does not allow for recent gene flow was not significant (see the Supplementary Materials), modern Andeans are likely the descendants of the people who suffered the epidemics described in historical texts. In the Andes, missionary reports suggest that disease may have arrived before formal Spanish contact in 1532 (50) and that the first epidemics were likely caused by smallpox (29). We infer that selection acted within the past 500 years on the immune response, making it likely that modern Andeans descend from the survivors of these epidemics. The selection scan along the branch of the modern Andeans, contrasted with the ancient group, revealed the strongest signal to be associated with an immune gene connected to smallpox, CD83 (37). The second most highly differentiated SNPs were in the vicinity of RPS29, which codes for a ribosomal protein and is involved in viral mRNA translation and metabolism, including those of influenza (51). Another top gene, IL-36R (rank #36), is thought to have evolved alternative cytokine signaling to compensate for viruses, such as pox viruses, that can evade the immune system (52). Furthermore, the top SNP associated with IL-36R (rs1117797) exhibits a QTL (quantitative trait locus) signal associated with IL18RAP, a gene involved in mediating the immune response to the vaccinia virus (53). The relative strength of these signals and the role played by the associated genes may indicate that selection favored alleles that directly affected the pathogenicity of the diseases encountered by the ancestors of the epidemic survivors.
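A scan "along the branch" of one population can be illustrated with the population branch statistic (PBS), a commonly used measure that converts pairwise F_ST values among three populations into a branch length specific to the target population. This section does not name the statistic used, so the sketch below is an assumption offered for illustration rather than the authors' method; the function names and F_ST values are hypothetical.

```python
import math


def branch_length(fst):
    """Transform pairwise FST into an additive divergence time, T = -ln(1 - FST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("fst must be in the interval [0, 1)")
    return -math.log(1.0 - fst)


def pbs(fst_target_ref, fst_target_out, fst_ref_out):
    """Population branch statistic for the target population.

    PBS = (T_target,ref + T_target,out - T_ref,out) / 2, where each T is
    the log-transformed pairwise FST. Large values indicate allele
    frequency change specific to the target branch.
    """
    t_tr = branch_length(fst_target_ref)
    t_to = branch_length(fst_target_out)
    t_ro = branch_length(fst_ref_out)
    return (t_tr + t_to - t_ro) / 2.0


if __name__ == "__main__":
    # Hypothetical per-SNP FST values: target = modern Andeans,
    # ref = ancient Andeans, out = a lowland population.
    value = pbs(fst_target_ref=0.40, fst_target_out=0.45, fst_ref_out=0.10)
    print(f"PBS for the target branch: {value:.3f}")
```

The third population acts as an outgroup: differentiation shared by the target and reference populations cancels out, which leaves only the change that accumulated on the target branch.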
In conclusion, human adaptation to the Andean highlands involved a variety of factors and was complicated by the arrival of Europeans and the marked changes that followed. Despite harsh environmental factors, the Andes were populated relatively early after entry into the continent. The adaptive traits necessary for permanent occupation may have been selected for in a relatively short amount of time, on the order of a few thousand years. Given the multifaceted nature of the adaptation, we are not surprised to find genetic affinity in the populations of the Andes dating to at least 4000 years BP, and possibly extending to 7000 years BP.
Acknowledgments
We would like to thank J. Blangero and S. Blangero for contributing the Aymara samples. We would like to thank C. Jeong and S. Nakagome for helpful discussions throughout the project. We would like to thank M. DeGiorgio for helping with the ƒ3 statistic analysis. Field support for R.H. was provided by the Collasuyo Archaeological Research Institute, C. Justo Chavez, V. Incacoña Huaraya, M. Incacoña Huaraya, N. Condori Flores, A. Pilco Quispe, D. Pilco Incacoña, D. Pilco Incacoña, K. Pilco Incacoña, L. M. Pilco Incacoña, L. Hayes, and the community of Mulla Fasiri, Peru. Archeological data recovery at SMP and international export of artifacts, including those from Jiskairumoko, Kaillachuro, and the Rio Uncallane sites, were carried out under Peruvian Ministry of Culture permit nos. 064-2013-DGPA-VMPCIC/MC and 138-2015-VMPCIC/MC. Multiple permits were issued by the Peruvian Instituto Nacional de Cultura for research at Jiskairumoko, Kaillachuro, and the Rio Uncallane sites from 1994 to 2002. See Moreno-Mayar et al. (67) and Posth et al. (68) for related analyses of ancient DNA samples from the Americas. Funding: This work was supported in part by NIH grant R01HL119577 and the National Science Foundation grants BCS-1528698 (awarded to M.Al., C.W., and A.D.R.) and BSC-9221724 (awarded to C.B.). J.L. was funded by a University of Chicago Provost’s Postdoctoral Scholarship. Support for archeological excavation and artifact analysis was provided to R.H. by the National Science Foundation (BCS-1311626), the American Philosophical Society, and the University of Arizona. Survey and data recovery at the Rio Uncallane sites was supported by grants to M.Al. from the National Geographic Society (5245-94) and the H. John Heinz III Charitable Trust. Excavations at Jiskairumoko and Kaillachuro were supported by grants to M.Al. from the National Science Foundation (SBR-9816313 and SBR-9978006).
The Huilliche-Pehuenche datasets were funded by the Chilean National Council of Science and Technology (CONICYT) grants USA2013-0015 and FONDEF D10I1007. Author contributions: M.Al., R.H., C.B., J.T.W., and C.V.L. provided samples for the study. J.L., A.D.R., C.W., and C.H. contributed to the experimental design. M.Ap., J.L., A.D.R., J.N., and D.W. analyzed data. M.M. and R.A.V. generated the Huilliche-Pehuenche datasets. J.L., A.D.R., and M.Al. wrote the initial draft of the manuscript. C.W. and R.H. contributed to the writing of the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors. Ancient and modern DNA sequences are available from NCBI Sequence Read Archive, accession no. PRJNA470966. The Huilliche-Pehuenche SNP data will be available via a data access agreement with R.A.V. at the Universidad de Chile.