Sci Rep. 2019 Feb 14;9(1):2132.
doi: 10.1038/s41598-019-39391-z.

Whole Genome Sequence, Variant Discovery and Annotation in Mapuche-Huilliche Native South Americans

Elena A Vidal et al. Sci Rep. 2019.

Abstract

Whole human genome sequencing initiatives help us understand population history and the basis of genetic diseases. Current data mostly focuses on Old World populations, and information on the genomic structure of Native Americans, especially those from the Southern Cone, is scant. Here we present annotation and variant discovery from high-quality complete genome sequences of a cohort of 11 Mapuche-Huilliche individuals (HUI) from Southern Chile. We found approximately 3.1 × 10⁶ single nucleotide variants (SNVs) per individual and identified 403,383 (6.9%) novel SNV events. Analyses of large-scale genomic events detected 680 copy number variants (CNVs) and 4,514 structural variants (SVs), including 398 and 1,910 novel events, respectively. Global ancestry composition of HUI genomes revealed that the cohort represents a sample from a marginally admixed population from the Southern Cone, whose main genetic component derives from Native American ancestors. Additionally, we found that HUI genomes contain variants in genes associated with 5 of the 6 leading causes of noncommunicable diseases in Chile, which may have an impact on the risk of prevalent diseases in Chilean and Amerindian populations. Our data represents a useful resource that can contribute to population-based studies and to the design of early diagnostics or prevention tools for Native and admixed Latin American populations.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Genetic and structural variants in Mapuche-Huilliche genomes. Circos plot of the spatial distribution of SNV densities (i), deletions and insertions (ii), structural variant (SV) losses and gains (iii), copy number variant (CNV) losses and gains (iv), inversions (v) and translocations (vi). Light or dark colors in different tracks indicate known or novel variants, respectively. Tandem (red lines) and distal duplications (blue arrows) are shown within the inner circle of the plot. Translocation events are shown as green arrows.
Figure 2
Ancestry analysis of HUI and Chilean Latino individuals. (A) ADMIXTURE plots for K = 5 (Continental model) and K = 10 (minimum error model). All 3,706 samples included are depicted as vertical thin bars colored by their corresponding ancestry percentage. HUI genomes are highlighted at the left with thicker bars, followed by Chilean Latino genotyped individuals and samples included in 1kGP-phase 3, which are clustered in 5 super-populations (AMR, EUR, EAS, SAS and AFR). For K = 5, the colors were defined as follows: red for “Amerindian”, yellow for “European”, blue for “East Asian”, green for “South Asian” and purple for “African”. For K = 10, light colors are used to show subcomponents within super-populations EUR, EAS, SAS and AFR. Grey is used to represent the AMR component common to PEL, MXL, CLM and PUR populations but almost absent in HUI. Bottom thick bars define key colors used in the PCA. (B) Principal Component (PC) analysis including the same set of samples (colored dots) and markers. The color legend and the number of samples belonging to each super-population defined in (A) are provided in the legend inside brackets. Left panel: PC1 vs. PC2; right panel: PC3 vs. PC4. The percentage of variance explained by each component is given in parentheses on the corresponding axis.
Figure 3
Analysis of the genetic distance (Fst) between the HUI cohort and 1kGP-phase 3 populations. (A) World map showing all 26 populations from 1kGP-phase 3, drawn from the 5 super-populations (AFR, SAS, EAS, EUR and AMR), colored from yellow to red by their Weir and Cockerham’s Fst statistic (weighted Fst), that is, their genetic distance from the HUI sequenced individuals. This figure was created in Adobe Illustrator® CS5 (https://www.adobe.com/) based on a figure made available under the Creative Commons CC0 1.0 Universal Public Domain Dedication (Blank map of the world Equirectangular, https://en.wikipedia.org/wiki/File:BlankMap-World6-Equirectangular.svg). (B) Violin plots comparing SNV density between HUI and the 26 populations from 1kGP-phase 3. Fst distributions are sorted by decreasing genetic distance from HUI (top to bottom). Vertical bars on each population plot indicate the 95th percentile cutoff. SuperPop = Super populations from 1kGP-phase 3: AFR = Africans, AMR = Admixed Americans, ASN = Asians, EUR = Europeans.
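
As a rough illustration of what a pairwise Fst between two populations measures, the sketch below computes Hudson's Fst estimator as a ratio of sums over SNVs. This is a simpler stand-in chosen for brevity; it is not the Weir and Cockerham estimator named in the Figure 3 caption, and it is not the authors' pipeline. The function name, the toy allele frequencies and the comparison sample size of 200 chromosomes are all hypothetical; only the 22 chromosomes (11 diploid individuals) follows from the cohort size given in the abstract.

```python
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float], n1: int, n2: int) -> float:
    """Hudson's Fst between two populations, as a ratio of sums over sites.

    p1, p2: per-site frequencies of the same allele in each population.
    n1, n2: number of sampled chromosomes in each population.
    NOTE: a stand-in for illustration, not Weir and Cockerham's estimator.
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must cover the same sites")
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two chromosomes per population")

    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)

    if denominator == 0:
        raise ValueError("no informative sites")
    return numerator / denominator


# Hypothetical toy data: 11 diploid individuals give 22 chromosomes.
cohort_freqs = [0.10, 0.50, 0.90]
other_freqs = [0.40, 0.55, 0.20]
fst = hudson_fst(cohort_freqs, other_freqs, n1=22, n2=200)
print(f"Hudson Fst on toy data: {fst:.3f}")
```

Values near 0 indicate similar allele frequencies in the two populations, and a value of 1 corresponds to fixed differences at every site. With small samples the estimate can be slightly negative, which is usually read as no detectable differentiation.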

