Sci Rep. 2020 Nov 5;10(1):19142.
doi: 10.1038/s41598-020-76245-5.

Genetic profiling of Vietnamese population from large-scale genomic analysis of non-invasive prenatal testing data

Ngoc Hieu Tran et al. Sci Rep. 2020.

Abstract

The under-representation of several ethnic groups in existing genetic databases and studies has undermined our understanding of genetic variation and its associated traits and diseases in many populations. Cost and technology limitations remain obstacles to large-scale genome sequencing projects in many developing countries, including Vietnam. Non-invasive prenatal testing (NIPT) is one of the most rapidly adopted genetic tests, and its data offer an alternative, untapped resource for genetic studies. Here we performed a large-scale genomic analysis of 2683 pregnant Vietnamese women using their NIPT data and identified a comprehensive set of 8,054,515 single-nucleotide polymorphisms, of which 8.2% were new to the Vietnamese population. Our study also revealed 24,487 disease-associated genetic variants and their allele frequency distribution, notably 5 pathogenic variants for prevalent genetic disorders in Vietnam. We also observed major discrepancies in the allele frequency distribution of disease-associated genetic variants between the Vietnamese and other populations, highlighting the need for genome-wide association studies dedicated to the Vietnamese population. The resulting database of Vietnamese genetic variants, their allele frequency distribution, and their associated diseases is a valuable resource for future genetic studies.

Conflict of interest statement

NHT, TBV, HATP, THTD, NMN, YLTV, VUT, HGV, QTNB, PANV, HNN, HG and MDP are current employees of Gene Solutions, Vietnam. The other authors declare no competing interests.

Figures

Figure 1
Distributions of genome coverage and sequencing depth of the NIPT dataset. (a) Average genome coverage and sequencing depth per sample and from all samples combined. (b) Summary histogram of sequencing depth over all genome positions. (c) Distribution of sequencing depth per chromosome. (d) IGV tracks of sequencing depth, bwa MAPQ score, and Umap k50 mappability across the whole genome (the figure was produced using IGV, Integrative Genomics Viewer, version 2.8.9).
Figure 2
Summary of the NIPT call set. (a) Venn diagram comparison between the NIPT call set, the KHV and EAS call sets from the 1000 Genomes Project, and the dbSNP database. The percentages were calculated with respect to the NIPT call set. (b) Allele frequency distribution of the NIPT call set. (c) Distribution of locations and effects of variants in the NIPT call set.
Figure 3
Principal component analysis of the NIPT call set and other East Asian populations. (a) Scatter plot comparison of allele frequencies estimated from the NIPT and the KHV call sets. (b) Principal component analysis. (JPT: Japanese in Tokyo, Japan; CHB: Han Chinese in Beijing, China; CHS: Southern Han Chinese; CDX: Chinese Dai in Xishuangbanna, China; KHV: Kinh in Ho Chi Minh City, Vietnam; NIPT: Non-Invasive Prenatal Testing data.)
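
The Figure 2a caption notes that the overlap percentages are taken with respect to the NIPT call set, so the NIPT set is the denominator in every comparison. The following is a minimal sketch of that convention, not the authors' pipeline: the function name `overlap_percentages`, the tuple representation of a variant, and all toy variants below are hypothetical and chosen only for illustration.

```python
def overlap_percentages(call_set, references):
    """Return, for each reference set, the percentage of call_set variants it contains.

    The denominator is always len(call_set), mirroring the caption's
    "with respect to the NIPT call set" convention.
    """
    total = len(call_set)
    if total == 0:
        raise ValueError("call_set is empty")
    return {
        name: 100.0 * len(call_set & ref) / total
        for name, ref in references.items()
    }


# Hypothetical toy variants keyed as (chromosome, position, ref allele, alt allele).
nipt = {
    ("1", 100, "A", "G"),
    ("1", 200, "C", "T"),
    ("2", 50, "G", "A"),
    ("3", 10, "T", "C"),
}
khv = {("1", 100, "A", "G"), ("2", 50, "G", "A")}
dbsnp = {
    ("1", 100, "A", "G"),
    ("1", 200, "C", "T"),
    ("2", 50, "G", "A"),
    ("9", 1, "A", "C"),  # present in the reference only, so it never affects the result
}

shared = overlap_percentages(nipt, {"KHV": khv, "dbSNP": dbsnp})

# Share of the toy call set that is absent from the KHV reference.
novel_vs_khv = 100.0 * len(nipt - khv) / len(nipt)
```

Because the denominator is fixed to the call set, variants that exist only in a reference database do not change the percentages, and the shared and novel shares for any single reference sum to 100%.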

