Am J Hum Genet. 2020 Sep 3;107(3):403-417.
doi: 10.1016/j.ajhg.2020.06.021. Epub 2020 Aug 4.

Interpretable Clinical Genomics with a Likelihood Ratio Paradigm

Peter N Robinson et al. Am J Hum Genet.

Abstract

Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.

Keywords: Human Phenotype Ontology; exome sequencing; genome sequencing; likelihood ratio; phenotype-driven genomic diagnostics.
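
The posttest probability mentioned in the abstract can be illustrated with the textbook odds form of Bayes' theorem: the pretest odds are multiplied by the likelihood ratio of each observation, and the resulting odds are converted back to a probability. The following is a minimal sketch under that assumption only; the function name, the pretest probability, and the LR values are hypothetical and are not taken from LIRICAL, whose actual implementation may differ.

```python
from math import prod


def posttest_probability(pretest_prob, likelihood_ratios):
    """Combine a pretest probability with independent likelihood ratios.

    Generic odds-form Bayes update (an illustration, not LIRICAL's code):
    posttest odds = pretest odds * product of all LRs.
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must lie strictly between 0 and 1")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical example: a rare disease (pretest 1%) with two phenotype
# observations that each favor the diagnosis (LR = 10) and one observation
# that argues mildly against it (LR = 0.5).
example = posttest_probability(0.01, [10.0, 10.0, 0.5])
print(f"posttest probability: {example:.3f}")
```

An LR above 1 raises the probability of the candidate diagnosis and an LR below 1 lowers it, which is what makes the per-term contributions readable one at a time.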

Conflict of interest statement

P.N.R. has filed a patent application based on this work.

Figures

Figure 1
LIRICAL Evaluation of a Simulated Case of Ataxia-Pancytopenia Syndrome (ATXPC). For each candidate diagnosis with an above-threshold posttest probability, LIRICAL shows the contribution of each phenotypic feature and of the genotype to the final diagnosis. In this case, the data were extracted from a published case report on an individual with ATXPC, and an additional unrelated term (high myopia) was added to simulate the effect of noise. (A) LIRICAL provides a table of the top candidates with the posttest probability and a sparkline view of the contributions of each HPO term and the relevant genotype. (B) The observed HPO terms. (C) The correct diagnosis, ATXPC, is ranked in first place because of a good phenotype match and a positive LR for the heterozygous genotype for the causative gene SAMD9L. (D) The second candidate has many of the same phenotype matches, but the first query term, dysmetria, matches exactly with ataxia-pancytopenia syndrome and only approximately with the second candidate, spinocerebellar ataxia, autosomal recessive 7. (E) The third candidate has a posttest probability close to zero because it has more mismatching or poorly matching query terms.
Figure 2
Evaluation of LIRICAL and Exomiser on 384 Case Studies. The case studies were formatted as phenopackets (Table 1), and the diagnostic process was simulated by spiking disease-causing variants into a VCF file, which was passed together with phenotype data to LIRICAL and Exomiser. (A) Simulation approach. Random noise terms were added to some simulations, and in some cases, terms were replaced by their parent term or grandparent term to mimic imprecision in measuring or recording phenotypic abnormalities. (B–G) Results of simulations are shown with the x axis showing the rank assigned by LIRICAL or Exomiser to the correct disease gene, and the y axis showing the percentage of cases in which the given rank was achieved. The following is shown: original data (B), performance on the subset of 221 autosomal-recessive cases (C), the same 221 autosomal-recessive cases in which one of the two pathogenic alleles was removed (D), two random (“noise”) HPO terms added to each case (E), original terms replaced by a parent term and two noise terms added (F), and original terms replaced by a grandparent term and two noise terms added (G).
Figure 3
Posttest Probability. The posttest probability of the correct diagnosis was calculated for each of the 384 phenopacket case reports (original). Densities are shown for the original data (original; mean posttest probability, pp, 67.4%); noise2∗∗, in which two random HPO terms were added and original terms were replaced by grandparent terms (mean pp, 50.3%); and random, in which all HPO terms were replaced by random terms (mean pp, 2.9%). Figure S7 shows results for other perturbations.
Figure 4
Performance of LIRICAL and Exomiser on 116 Solved Singleton Cases from the 100,000 Genomes Project. The x axis shows the rank assigned by LIRICAL or Exomiser to the correct disease gene. The y axis shows the percentage of cases in which the given rank was achieved.
Figure 5
LIRICAL Evaluation of Simulated Case with a Pathogenic FBN1 Variant. (A–E) Eight distinct diseases are associated with variants in FBN1. LIRICAL prioritizes each disease separately, and in this case correctly placed Marfan syndrome at rank #1. Three other FBN1-associated diseases were placed in ranks #2–#4 (A). Clinical and molecular data were simulated according to individual 1 in Cao et al. The HPO terms are shown in panel (B). The graphic shows LIRICAL’s summary table and three of the detailed LR plots for the candidates at ranks #1 (C), #3 (D), and #5 (E).
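
The phenotype perturbations described in the Figure 2 caption (replacing terms with a parent or grandparent term and adding random noise terms) can be sketched as follows. This is only an illustration under stated assumptions: the tiny term hierarchy, the noise pool, and the function names are invented for the example and do not reflect the real HPO structure or the simulation code used in the study.

```python
import random

# Hypothetical child -> parent map; NOT the real HPO hierarchy.
TOY_PARENTS = {
    "term_A1": "term_A",
    "term_A": "term_root",
    "term_B1": "term_B",
    "term_B": "term_root",
}


def generalize(term, parents, levels):
    """Walk up `levels` steps in the hierarchy, stopping at the top."""
    for _ in range(levels):
        term = parents.get(term, term)
    return term


def perturb(terms, parents, noise_pool, levels=1, n_noise=2, rng=None):
    """Replace each term by an ancestor and append random noise terms."""
    rng = rng or random.Random()
    generalized = [generalize(t, parents, levels) for t in terms]
    # Only draw noise terms that are not already part of the query.
    candidates = [t for t in noise_pool if t not in generalized]
    return generalized + rng.sample(candidates, n_noise)


# levels=1 mimics the "parent term" setting, levels=2 the "grandparent" one.
query = ["term_A1", "term_B1"]
noise_pool = ["noise_1", "noise_2", "noise_3", "noise_4"]
print(perturb(query, TOY_PARENTS, noise_pool, levels=1, rng=random.Random(0)))
```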
