Comparative Study
Nature. 2015 Aug 13;524(7564):225-9.
doi: 10.1038/nature14497. Epub 2015 Jun 29.

Identification of cis-suppression of human disease mutations by comparative genomics

Daniel M Jordan et al. Nature. 2015.

Abstract

Patterns of amino acid conservation have served as a tool for understanding protein evolution. The same principles have also found broad application in human genomics, driven by the need to interpret the pathogenic potential of variants in patients. Here we performed a systematic comparative genomics analysis of human disease-causing missense variants. We found that an appreciable fraction of disease-causing alleles are fixed in the genomes of other species, suggesting a role for genomic context. We developed a model of genetic interactions that predicts most of these to be simple pairwise compensations. Functional testing of this model on two known human disease genes revealed discrete cis amino acid residues that, although benign on their own, could rescue the human mutations in vivo. This approach was also applied to ab initio gene discovery to support the identification of a de novo disease driver in BTG2 that is subject to protective cis-modification in more than 50 species. Finally, on the basis of our data and models, we developed a computational tool to predict candidate residues subject to compensation. Taken together, our data highlight the importance of cis-genomic context as a contributor to protein evolution; they provide an insight into the complexity of allele effect on phenotype; and they are likely to assist methods for predicting allele pathogenicity.

Conflict of interest statement

The authors declare no competing financial interests. Readers are welcome to comment on the online version of the paper.

Figures

Extended Data Figure 1. Different alignment methodologies with HumVar and ClinVar produce qualitatively similar alignments
a, b, Distributions of missense variants annotated as neutral (a) or pathogenic (b) in the HumVar and ClinVar data sets, with each of the five alignment strategies described in the text (MultiZ unfiltered, MultiZ mammals-only, EPO, MultiZ with alignment quality filter, MultiZ with >1 sequence filter). All distributions are quantitatively similar. Compare with Fig. 2c, d.
Extended Data Figure 2. Protein domain structure of functionally tested human disease genes
a, BBS4 (519 amino acids) has eight tetratricopeptide repeat (TPR) domains (yellow); b, RPGRIP1L (1,315 amino acids) has multiple coiled-coil domains (green rectangles) and two protein kinase C conserved region 2 (C2) domains (green hexagons); and c, BTG2 (158 amino acids) has one BTG1 domain (purple pentagon). Disease-causing alleles are shown with red stars; complementing alleles are shown with blue stars; an amino acid number scale in increments of 100 is shown below each schematic.
Extended Data Figure 3. Evaluation of btg2 and nos2a/b MOs
a–c, Schematics of the D. rerio btg2, nos2a and nos2b loci. Blue boxes, exons; dashed lines, introns; white boxes, untranslated regions; red boxes, MOs; ATG indicates the translational start site; arrows, polymerase chain reaction with reverse transcription (RT–PCR) primers; number indicates the targeted exon. d, e, Agarose gel images of nos2a/b RT–PCR products.
Extended Data Figure 4. HuC/HuD staining and quantification of 2 dpf zebrafish embryos confirms pathogenicity of BTG2 V141M
a, Suppression of btg2 leads to a decrease of HuC/HuD levels at 2 dpf. Representative ventral images of control, btg2 morphants (images show unilateral or absent HuC/HuD expression), and a rescued embryo injected with a btg2 MO plus human BTG2 wild-type (WT) mRNA. Scale bar, 250 μm. b, Percentage of embryos with normal, bilateral HuC/HuD protein levels in the anterior forebrain or decreased/unilateral HuC/HuD protein levels in embryos injected with btg2 MOs alone or MOs plus human BTG2 wild-type or variant mRNAs (p.V141M, index case; p.A126S and p.R145Q, control alleles). *P < 0.05 (two-tailed t-test comparisons between MO-injected and rescued embryos; n = 38–78 per injection batch).
Figure 1. Distribution of variants found in sequence alignments
a, Venn diagram showing sizes and overlap of the ClinVar and HumVar data sets, and how many are found in the multiple sequence alignment. b, Estimated number of human disease variants found in the alignment. The smallest estimate (3.0%, dark blue) comes from using the intersection of both variant data sets, requiring the variant to be absent from 6,503 human exomes, and filtering out alignments with low-quality scores. With any methodology, at least 88% of human disease variants (grey) are not found in the alignment. c, d, Potential mechanisms for the occurrence of CPDs in evolution. Branches where the variant is fixed are purple; branches where the variant is pathogenic are red. In c, the variant is present neutrally in an ancestor, but is lost in primates. Subsequent substitutions cause the ancestral allele to become pathogenic. In d, the variant is pathogenic in the ancestor, but mutations in a non-human branch cause it to become tolerated, and it arises later by mutation and becomes fixed.
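The basic check behind panels a and b, whether a human disease variant appears as the residue carried by another species in an alignment, can be sketched as follows. This is a minimal illustration only: the function name, the dictionary representation of an alignment column, and the toy species labels and residues are all hypothetical. It does not reproduce the authors' pipeline, data sets, or filters.

```python
from typing import Dict, List


def species_carrying_variant(
    column: Dict[str, str], human_ref: str, variant: str
) -> List[str]:
    """Return the non-human species whose aligned residue equals the variant.

    `column` maps a species label to the single-letter amino acid found at
    one aligned position ('-' marks a gap). The representation is an
    assumption made for this sketch.
    """
    if column.get("human") != human_ref:
        raise ValueError("column does not match the human reference residue")
    return sorted(
        species
        for species, residue in column.items()
        if species != "human" and residue == variant
    )


# Toy column with made-up species labels and residues (not real data).
toy_column = {
    "human": "V",
    "species_a": "V",
    "species_b": "M",
    "species_c": "M",
    "species_d": "-",
}

hits = species_carrying_variant(toy_column, human_ref="V", variant="M")
print(hits)  # ['species_b', 'species_c']
```

A variant with a non-empty result would count as "found in the alignment" in this simplified sense; the caption's filters (absence from human exomes, alignment quality) would be applied on top of such a check.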
Figure 2. Relationship between variants and evolutionary distance
a, Model for fixation of CPDs. Neutral changes (crosses) arise neutrally. Some of these (blue) compensate for alleles that would otherwise be pathogenic. The parameter k represents the number of compensatory changes required for each pathogenic allele. Once k compensatory changes have fixed, the CPD (red) fixes neutrally. b, The relationship between evolutionary distance and the number of variants in the alignment is expected to be different for individual benign variants (black) and pathogenic variants with different numbers of compensations (blue, one; red, two; cyan, five; magenta, ten). c, d, Observed distribution of missense variants annotated as neutral (c) or pathogenic (d) in HumVar and present in vertebrate orthologues (bars), with maximum likelihood fits (black lines) and 95% confidence bands (grey shading). Panel d corresponds to a fitted value for k of 1.44 ± 0.07.
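One way to build intuition for why the curves in panel b differ with k is a waiting-time sketch. It assumes, purely for illustration, that every required change (each compensation and then the variant itself) fixes as an independent Poisson event with the same rate, so a variant needing k compensations requires k + 1 events in total. The function names and the shared `rate` parameter are hypothetical, and this is not necessarily the likelihood the authors fitted.

```python
import math


def prob_at_least_n_events(t: float, n_events: int, rate: float = 1.0) -> float:
    """P(a Poisson process with the given rate has >= n_events by distance t).

    This is the Erlang cumulative distribution function for integer n_events.
    """
    if n_events < 1:
        raise ValueError("n_events must be a positive integer")
    x = rate * t
    # Probability of seeing fewer than n_events events by distance t.
    fewer = sum(math.exp(-x) * x**i / math.factorial(i) for i in range(n_events))
    return 1.0 - fewer


def prob_variant_seen(t: float, k: int, rate: float = 1.0) -> float:
    """Chance that a variant needing k compensations is fixed by distance t.

    Assumption of this sketch: k compensatory changes followed by the
    variant itself, i.e. k + 1 sequential events at a shared rate.
    """
    return prob_at_least_n_events(t, k + 1, rate)


# Compare a benign variant (k = 0) with compensated variants (k = 1, 2).
for t in (0.1, 0.5, 1.0, 2.0):
    row = [round(prob_variant_seen(t, k), 4) for k in (0, 1, 2)]
    print(t, row)
```

Under these assumptions, larger k delays the rise of the curve at short evolutionary distances, which matches the qualitative separation the caption describes. A non-integer k such as the fitted 1.44 would need a gamma rather than an Erlang form (for example via the regularized incomplete gamma function), which this standard-library sketch does not cover.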
Figure 3. Compensatory mutations rescue pathogenic alleles in BBS4 and RPGRIP1L
a, The pathogenic BBS4 165H allele is fixed in six species. Secondary sites 160, 163 and 366 are possible CPDs. b, The pathogenic RPGRIP1L 937L allele is fixed in four species. The 189L, 193L and 961T alleles are present in all four species. c, Examples of zebrafish convergent extension phenotypic groups. d, Human RNA encoding the BBS4 165H mutation and either 366R or 366T can rescue the morphant phenotype; RNA encoding 165H mutation alone cannot. WT, wild type. e, Mutation of 189L, 193L or 961T, in the background of 937L RPGRIP1L mRNA, rescues the loss of function observed in 937L RNA. Significance was determined by χ² test. See Supplementary Table 9 for embryo counts.
Figure 4. A de novo BTG2 p.V141M-encoding allele causes microcephaly
a, Pedigree DM048. Chromatograms show a de novo c.421G>A nucleotide change. WT, wild type. b, Suppression of btg2 leads to head size defects. Dorsal view of uninjected control and btg2 MO-injected zebrafish embryos at 4 dpf. White arrows show the distance measured from forebrain to hindbrain. Red line shows the protrusion of the pectoral fins in uninjected controls. c, Distribution of head size measurements at 4 dpf (Supplementary Table 10; white arrows in b); a.u., arbitrary units. d, 2 dpf zebrafish embryos stained for PH3. Human RNA containing the V141M mutation is unable to rescue the reduced proliferation of btg2 morphants. Scale bars, 250 μm. e, Quantification of PH3-positive cells: human RNA with mutations V141M and either R80K or L128V can rescue knockdown of btg2. Error bars represent standard deviation. f, The 141M allele is fixed in 59/87 species besides primates; examples are displayed here. See Supplementary Table 11 for PH3 quantification.
