Hum Mol Genet. 2015 Feb 1;24(3):599-608.
doi: 10.1093/hmg/ddu473. Epub 2014 Sep 12.

Prioritizing genes for X-linked diseases using population exome data

Xiaoyan Ge et al. Hum Mol Genet.

Abstract

Many new disease genes can be identified through high-throughput sequencing. Yet, variant interpretation for the large amounts of genomic data remains a challenge given variation of uncertain significance and genes that lack disease annotation. As clinically significant disease genes may be subject to negative selection, we developed a prediction method that measures paucity of non-synonymous variation in the human population to infer gene-based pathogenicity. Integrating human exome data of over 6000 individuals from the NHLBI Exome Sequencing Project, we tested the utility of the prediction method based on the ratio of non-synonymous to synonymous substitution rates (dN/dS) on X-chromosome genes. A low dN/dS ratio characterized genes associated with childhood disease and outcome. Furthermore, we identify new candidates for diseases with early mortality and demonstrate intragenic localized patterns of variants that suggest pathogenic hotspots. Our results suggest that intrahuman substitution analysis is a valuable tool to help prioritize novel disease genes in sequence interpretation.
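The gene-level ratio described in the abstract can be illustrated with a minimal sketch. It assumes the textbook definition, dN/dS = (non-synonymous variants / non-synonymous sites) / (synonymous variants / synonymous sites); the abstract does not spell out the authors' exact estimator, so this is an illustration rather than their method. The class name, function names, gene labels and all counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class GeneCounts:
    """Per-gene tallies (hypothetical structure, illustrative values only)."""
    nonsyn_variants: int   # observed non-synonymous variants
    syn_variants: int      # observed synonymous variants
    nonsyn_sites: float    # possible non-synonymous sites in the coding sequence
    syn_sites: float       # possible synonymous sites in the coding sequence


def dn_ds(c: GeneCounts) -> Optional[float]:
    """Return (N / N_sites) / (S / S_sites), or None when the ratio is undefined."""
    if c.syn_variants == 0 or c.syn_sites <= 0 or c.nonsyn_sites <= 0:
        return None  # dS would be zero or the site counts are unusable
    dn = c.nonsyn_variants / c.nonsyn_sites
    ds = c.syn_variants / c.syn_sites
    return dn / ds


def rank_by_constraint(genes: Dict[str, GeneCounts]) -> List[Tuple[str, float]]:
    """Sort genes from lowest to highest dN/dS, skipping undefined ratios."""
    scored = []
    for name, counts in genes.items():
        ratio = dn_ds(counts)
        if ratio is not None:
            scored.append((name, ratio))
    return sorted(scored, key=lambda item: item[1])


# Made-up example data, not taken from ESP or the paper.
example = {
    "GENE_A": GeneCounts(10, 20, 300.0, 100.0),
    "GENE_B": GeneCounts(30, 10, 300.0, 100.0),
    "GENE_C": GeneCounts(5, 0, 300.0, 100.0),   # no synonymous variants
}
ranking = rank_by_constraint(example)
```

Under the abstract's reasoning, genes at the top of such a ranking (lowest ratio) would be the ones to look at first as candidates under negative selection. Genes with no synonymous variants are left unranked here, which is a design choice of this sketch.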


Figures

Figure 1.
Putative loss-of-function (LoF) variants are relatively depleted on the X chromosome in exome data analysis. (A) Ratio of total non-synonymous variants to total synonymous variants for each chromosome. P-value = 4.61e–05 (Wilcoxon signed-rank test). (B) Ratio of total LoF variants to total synonymous variants for each chromosome. P-value = 4.01e–05 (Wilcoxon signed-rank test). Variants were analyzed from ESP. The X chromosome is shown in red and autosomes in black.
Figure 2.
Flow chart of the analysis. All variants from ESP were filtered to X-linked genes and by variant type, then analyzed for prediction and validation.
Figure 3.
Comparison of dN/dS ratio by disease characteristics. (A) X-linked OMIM disease genes (OMIM) versus X-linked genes not yet annotated with a disease (non-OMIM). P-value = 6.38e–05 (two-sample Wilcoxon test). (B) Disease genes grouped by average age of disease onset: childhood, adulthood and variable. P-value = 0.6834 (Kruskal–Wallis rank sum test). (C) Disease genes grouped by average age of death: childhood, adulthood and variable. P-value = 0.02972 (Kruskal–Wallis rank sum test).
Figure 4.
Pathogenic variants occur at protein locations that have the least non-synonymous variant density. From top to bottom: (A) density plots for synonymous and non-synonymous variants along the coding sequence of six representative genes. X-axis shows the relative position of the synonymous and non-synonymous variants in the coding sequence. Green: synonymous. Red: non-synonymous. (B) Histograms show the pathogenic missense variants along the coding sequences. (C) Domain structures: dark gray rectangles show the protein domains and light gray bars show the whole protein. ATRX: NM_000489; CDKL5: NM_003159; F8: NM_000132; HCFC1: NM_005334; KDM5C: NM_004187; MECP2: NM_001110792.
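The positional comparison in Figure 4 can be approximated with a simple binned count. The sketch below is a coarser stand-in for the density plots in the figure, not the authors' plotting method: it assumes 1-based coding-sequence positions, splits the coding sequence into equal-width bins, and reports the bins with the fewest non-synonymous variants. The function names, positions and sequence length are hypothetical.

```python
from typing import List, Sequence


def position_histogram(positions: Sequence[int], cds_length: int, bins: int = 10) -> List[int]:
    """Count variants per equal-width bin along the coding sequence.

    positions are 1-based coding positions; bin 0 covers the start of the
    sequence and bin (bins - 1) covers the end.
    """
    if cds_length <= 0 or bins <= 0:
        raise ValueError("cds_length and bins must be positive")
    counts = [0] * bins
    for pos in positions:
        if not 1 <= pos <= cds_length:
            raise ValueError(f"position {pos} is outside the coding sequence")
        # Integer arithmetic keeps the last position inside the final bin.
        counts[(pos - 1) * bins // cds_length] += 1
    return counts


def sparsest_bins(counts: Sequence[int]) -> List[int]:
    """Return the indices of the bins with the lowest variant count."""
    lowest = min(counts)
    return [i for i, n in enumerate(counts) if n == lowest]


# Made-up non-synonymous variant positions in a 100-base coding sequence.
nonsyn_counts = position_histogram([1, 5, 50, 95, 100], cds_length=100, bins=4)
candidate_regions = sparsest_bins(nonsyn_counts)
```

In the figure's terms, the sparsest bins are where one would then check whether known pathogenic missense variants and annotated protein domains cluster.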

