Hum Mutat. 2015 Oct;36(10):998-1003.
doi: 10.1002/humu.22847.

Mitigating false-positive associations in rare disease gene discovery

Sebastian Akle et al. Hum Mutat. 2015 Oct.

Abstract

Clinical sequencing is expanding, but causal variants are still not identified in the majority of cases. These unsolved cases can aid in gene discovery when individuals with similar phenotypes are identified in systems such as the Matchmaker Exchange. We describe risks for gene discovery in this growing set of unsolved cases. In a set of rare disease cases with the same phenotype, it is not difficult to find two individuals with the same phenotype that carry variants in the same gene. We quantify the risk of false-positive association in a cohort of individuals with the same phenotype, using the prior probability of observing a variant in each gene from over 60,000 individuals (Exome Aggregation Consortium). Based on the number of individuals with a genic variant, cohort size, specific gene, and mode of inheritance, we calculate a P value that the match represents a true association. A match in two of 10 patients in MECP2 is statistically significant (P = 0.0014), whereas a match in TTN would not reach significance, as expected (P > 0.999). Finally, we analyze the probability of matching in clinical exome cases to estimate the number of cases needed to identify genes related to different disorders. We offer Rare Disease Match, an online tool to mitigate the uncertainty of false-positive associations.
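The chance-match reasoning in the abstract can be illustrated with a simple binomial tail calculation. This is a minimal sketch under stated assumptions, not the authors' published model or the Rare Disease Match implementation: it assumes each of n unrelated individuals independently carries a qualifying variant in a given gene with prior probability q, and asks how likely it is that at least k of them carry one by chance. The function name and the q values are hypothetical placeholders, not values taken from the Exome Aggregation Consortium.

```python
from math import comb


def prob_at_least_k_carriers(n: int, k: int, q: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, q).

    n: cohort size (individuals sharing the phenotype)
    k: number of individuals observed with a variant in the gene
    q: assumed prior probability that one individual carries a
       qualifying variant in the gene (hypothetical input)
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a probability in [0, 1]")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    # Sum the binomial probability mass from k carriers up to n carriers.
    return sum(comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical per-individual carrier probabilities, for illustration only.
    for label, q in [("rarely varying gene", 0.001), ("highly variable gene", 0.5)]:
        p = prob_at_least_k_carriers(n=10, k=2, q=q)
        print(f"{label}: P(at least 2 of 10 carriers by chance) = {p:.3g}")
```

Under this simplified model, a match in a gene that rarely carries variants yields a small chance probability, whereas a match in a gene that commonly carries variants is close to certain, which mirrors the qualitative contrast between MECP2 and TTN described above.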

Keywords: Matchmaker Exchange; false-positive associations; incidental findings; incidentalome; matchmaking; rare diseases.

Figures

Figure 1
The probability of observing a single homozygous variant match or a single compound heterozygous variant match in a set of patients with F=0.0016. We measure the probability of observing a single homozygous variant match [a] or a single compound heterozygous variant match [b] in cohorts of 2, 10, and 100 individuals with the same phenotype, sorted by gene rank. Genes are ranked in descending order of rare, non-synonymous variation using data from the Exome Aggregation Consortium. The probability for each curve is shown as the –log base 10 of the p-value of observing a single match, meaning that a match between 2 individuals is more significant than one among 10 or 100 individuals. Matches in genes that harbor a great deal of rare variation (low rank) are also less significant than those in genes that less commonly have rare variants (high rank).
Figure 2
The probability of observing a single homozygous alternate variant match or a single compound heterozygous variant match in 10 patients, using three different population inbreeding coefficients. We measure the probability of observing a single homozygous alternate variant match [blue/light] or a single compound heterozygous variant match [red/dark] using estimated population inbreeding coefficients from Italy [a], Japan [b], and Andhra Pradesh, India [c]. Genes are ranked in descending order of rare, non-synonymous variation using data from the Exome Aggregation Consortium. The probability for each curve is presented as the –log base 10 of the p-value of observing a single match. The point at which the two curves intersect is determined by the inbreeding coefficient and occurs at lower rank numbers as the inbreeding coefficient increases. In genes that harbor very few rare, non-synonymous variants, a homozygous recessive variant is more likely to result from a region that is identical by descent than from a chance match in the general population. The jaggedness of the line corresponding to recessives arises because genes are ordered by rank in frequency of mutations, which is highly correlated with, but slightly different from, the frequency of homozygotes under Hardy-Weinberg equilibrium.
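The role of the inbreeding coefficient can be sketched with the standard textbook expression for homozygote frequency under inbreeding, q^2 + F*q*(1 - q), where q is the allele frequency and F is the inbreeding coefficient. This is an illustrative assumption, since the captions do not state the exact formula the authors used. The value F = 0.0016 is taken from the Figure 1 caption; the allele frequencies and function names are hypothetical.

```python
def homozygote_frequency(q: float, F: float) -> float:
    """Expected homozygote frequency for allele frequency q under inbreeding F.

    The q*q term is the Hardy-Weinberg (chance) component; the F*q*(1-q)
    term is the excess homozygosity from identity by descent.
    """
    return q * q + F * q * (1.0 - q)


def ibd_fraction(q: float, F: float) -> float:
    """Fraction of expected homozygotes attributable to identity by descent."""
    total = homozygote_frequency(q, F)
    return (F * q * (1.0 - q)) / total if total > 0 else 0.0


if __name__ == "__main__":
    F = 0.0016  # inbreeding coefficient quoted in the Figure 1 caption
    for q in (0.1, 0.01, 0.001, 0.0001):  # hypothetical allele frequencies
        print(f"q={q}: homozygotes={homozygote_frequency(q, F):.3g}, "
              f"IBD share={ibd_fraction(q, F):.2f}")
```

In this sketch the identity-by-descent term outweighs the chance term once q falls below roughly F, which is consistent with the caption's observation that homozygous matches in genes with very little rare variation are more likely to reflect identity by descent.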
Figure 3
Trade-offs between quality of variant association and quality of phenotype match.
