Cell Genom. 2024 Oct 9;4(10):100669.
doi: 10.1016/j.xgen.2024.100669.

Utilizing non-invasive prenatal test sequencing data for human genetic investigation

Siyang Liu et al. Cell Genom.

Abstract

Non-invasive prenatal testing (NIPT) employs ultra-low-pass sequencing of maternal plasma cell-free DNA to detect fetal trisomy. Its global adoption has established NIPT as a large human genetic resource for exploring genetic variations and their associations with phenotypes. Here, we present methods for analyzing large-scale, low-depth NIPT data, including customized algorithms and software for genetic variant detection, genotype imputation, family relatedness, population structure inference, and genome-wide association analysis of maternal genomes. Our results demonstrate accurate allele frequency estimation and high genotype imputation accuracy (R² > 0.84) for NIPT sequencing depths from 0.1× to 0.3×. We also achieve effective classification of duplicates and first-degree relatives, along with robust principal-component analysis. Additionally, we obtain an R² > 0.81 for estimating genetic effect sizes across genotyping and sequencing platforms with adequate sample sizes. These methods offer a robust theoretical and practical foundation for utilizing NIPT data in medical genetic research.
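
The accuracy figures quoted in the abstract are R² values. As an illustration only, the sketch below computes one common definition, the squared Pearson correlation between imputed dosages and genotypes from high-coverage sequencing. The abstract does not state the exact computation the authors used, and the function name and the toy values are hypothetical.

```python
def pearson_r2(truth, estimate):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(truth) != len(estimate) or len(truth) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(truth)
    mean_t = sum(truth) / n
    mean_e = sum(estimate) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(truth, estimate))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    if var_t == 0 or var_e == 0:
        # Monomorphic site: correlation is undefined.
        return float("nan")
    return cov * cov / (var_t * var_e)


# Toy data (made up): alternate-allele counts from high-coverage genomes
# and imputed dosages for the same eight samples at a single variant.
true_genotypes = [0, 1, 2, 0, 1, 2, 0, 0]
imputed_dosages = [0.1, 0.9, 1.8, 0.0, 1.2, 1.9, 0.3, 0.1]

r2 = pearson_r2(true_genotypes, imputed_dosages)
print(f"imputation R2 = {r2:.3f}")
```

The same function could be applied to pairs of effect-size estimates from two platforms, which is one way to read the cross-platform R² quoted above.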

Keywords: NIPT-human-genetics workflow; allele frequency estimation; cell-free DNA; family relatedness; genome-wide association analysis; genotype imputation; low-pass whole-genome sequencing; non-invasive prenatal test; population structure; variant detection.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Characteristics of standard NIPT sequencing data. (A) NIPT sequencing involves the sequencing and analysis of peripheral blood samples obtained from pregnant women. (B) Visualization of typical sequencing depth observed in clinical settings across various sequencing platforms commonly used in China, with data derived from the cohorts listed in Table S1. (C) The incorporation of NIPT sequencing into pregnancy screening programs in China and other countries has enabled the connection between genotypes and diverse maternal and child phenotypes. The word cloud illustrates the distribution of sample size for maternal and child phenotypes related to NIPT data from two Chinese hospitals, using statistics from https://monn.pheweb.com/phenotypes.html.
Figure 2
Assessment of call rate and allele frequency accuracy in relation to allele frequency on simulated data. (A) Call rates for three datasets comprising 44,000, 140,000, and one million individuals, respectively, focusing on bi-allelic variants. (B) Call rates for 140,000 individuals across three allele types. (C) Root-mean-square deviation (RMSD) for allele frequency estimation. (D) Coefficient of variation for allele frequency estimation.
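
For readers unfamiliar with the two accuracy metrics named in panels (C) and (D), the following minimal sketch shows their standard textbook definitions. How the authors binned variants or replicated simulations is not given in this caption, so the function names and the numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def rmsd(estimated, truth):
    """Root-mean-square deviation between estimated and true frequencies."""
    if len(estimated) != len(truth) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(mean((e - t) ** 2 for e, t in zip(estimated, truth)))


def coefficient_of_variation(estimates):
    """Sample standard deviation divided by the mean of repeated estimates."""
    return stdev(estimates) / mean(estimates)


# Made-up values: true and estimated allele frequencies at three variants.
true_af = [0.10, 0.25, 0.50]
estimated_af = [0.12, 0.24, 0.47]
af_rmsd = rmsd(estimated_af, true_af)

# Made-up values: estimates of one variant's frequency across three replicates.
replicate_af = [0.09, 0.10, 0.11]
af_cv = coefficient_of_variation(replicate_af)

print(f"RMSD = {af_rmsd:.4f}, CV = {af_cv:.3f}")
```

RMSD measures absolute error against the truth, whereas the coefficient of variation measures spread relative to the mean, so the latter tends to be larger for rare variants at the same absolute error.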
Figure 3
Imputation accuracy of NIPT samples compared to high-coverage true genomes for variants on chromosome 20. (A and B) Imputation accuracy attained by the QUILT algorithm (A) and the GLIMPSE algorithm (B) for variants on chromosome 20. The evaluation is performed against the reference panels of the 1KGP, BIGCS, and STROMICS using bcftools. The NIPT sequencing depth spans from 0.1× to 0.3×.
Figure 4
Family relatedness inference from NIPT data. (A) Distribution of kinship coefficients for identical samples using PLINK without genotype imputation. (B) Distribution of kinship coefficients for identical samples using PLINK with genotype imputation. (C) Distribution of kinship coefficients for first-degree parent-offspring pairs using PLINK with genotype imputation. (D) Distribution of kinship coefficients for first-degree sister pairs using PLINK with genotype imputation. (E) Distribution of kinship coefficients for second-degree relatives using PLINK with genotype imputation. (F) Visualization of relatedness using k0 and kinship coefficient.
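
As a rough sketch of how a kinship coefficient and k0 (the proportion of the genome sharing zero alleles identical by descent) can be combined to label a pair, the code below applies the conventional KING kinship cutoffs (0.354, 0.177, 0.0884, 0.0442). These cutoffs are an assumption borrowed from the KING documentation, not values reported in this caption, and the k0 cutoff of 0.1 separating parent-offspring pairs from siblings is purely hypothetical. With low-depth imputed data the observed distributions may shift, so any cutoff would need calibrating against pairs of known relationship.

```python
def classify_pair(kinship, k0, po_k0_cutoff=0.1):
    """Label a sample pair from its kinship coefficient and k0.

    Kinship cutoffs follow the conventional KING ranges (assumption).
    po_k0_cutoff is a hypothetical value: parent-offspring pairs are
    expected to have k0 near 0, full siblings near 0.25.
    """
    if kinship > 0.354:
        return "duplicate_or_mz_twin"
    if kinship > 0.177:
        return "parent_offspring" if k0 < po_k0_cutoff else "full_sibling"
    if kinship > 0.0884:
        return "second_degree"
    if kinship > 0.0442:
        return "third_degree"
    return "unrelated"


# Made-up pairs: (kinship coefficient, k0)
examples = [(0.49, 0.00), (0.25, 0.01), (0.24, 0.26), (0.12, 0.50), (0.01, 0.98)]
labels = [classify_pair(kin, k0) for kin, k0 in examples]
print(labels)
```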
Figure 5
PCA for 2,229 individuals with higher-depth whole-genome sequencing data using three approaches. (A) PCA based on WGS data at 6.63×, considered the true PCA result. (B) PCA using PLINK on unimputed NIPT genotypes. (C) PCA using PLINK on imputed NIPT genotypes. (D) PCA using the EMU algorithm, which did not rely on exact genotypes.
Figure 6
Consistency of genetic effect estimates for the height phenotype in NIPT data across different hospitals, between NIPT data and array data, and across different NIPT sequencing platforms. (A) Consistency of genetic effect estimates for height GWAS between two independent hospitals in Shenzhen for variants significantly associated with height in the meta-analysis. (B) Consistency of genetic effect estimates for height GWAS between an additional dataset from Shenzhen Baoan hospital and the meta-analysis dataset in (A). (C) Consistency of genetic effect estimates between array data and the meta-analysis NIPT GWAS data in (A) for variants significantly associated with height in the Taiwan Biobank. (D) Consistency of genetic effect estimates between array-based and NIPT sequencing data for variants significantly associated with height in NIPT data. (E) Scatterplot illustrating the consistency of genetic effect estimates for 34 maternal metabolite association signals between the BGI-seq500 and BB platforms. (F) Scatterplot illustrating the consistency of genetic effect estimates for 13 neonatal metabolite associations between the Illumina and Ion Torrent platforms. Error bars indicate the standard errors of the genetic effect estimates.
