Am J Hum Genet. 2020 Sep 3;107(3):403-417.

doi: 10.1016/j.ajhg.2020.06.021. Epub 2020 Aug 4.

Interpretable Clinical Genomics with a Likelihood Ratio Paradigm

Peter N Robinson¹, Vida Ravanmehr², Julius O B Jacobsen³, Daniel Danis², Xingmin Aaron Zhang², Leigh C Carmody², Michael A Gargano², Courtney L Thaxton⁴; UNC Biocuration Core⁴; Guy Karlebach², Justin Reese⁵, Manuel Holtgrewe⁶, Sebastian Köhler⁶, Julie A McMurry⁷, Melissa A Haendel⁷, Damian Smedley³

Affiliations

¹ The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA. Electronic address: peter.robinson@jax.org.
² The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA.
³ William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK.
⁴ Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA.
⁵ Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
⁶ Charité Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany.
⁷ Oregon State University, Corvallis, OR 97331, USA.

PMID: 32755546
PMCID: PMC7477017
DOI: 10.1016/j.ajhg.2020.06.021

Peter N Robinson et al. Am J Hum Genet. 2020.

Abstract

Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.

Keywords: Human Phenotype Ontology; exome sequencing; genome sequencing; likelihood ratio; phenotype-driven genomic diagnostics.
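
The abstract describes combining a likelihood ratio for each observed HPO phenotype and for the genotype into a posttest probability. As a minimal sketch of the general LR paradigm (not LIRICAL's actual implementation), the standard odds form of Bayes' theorem multiplies the pretest odds by each LR and converts the result back to a probability. The function name `posttest_probability`, the pretest probability, and the LR values below are all hypothetical and chosen only for illustration.

```python
import math
from typing import Iterable


def posttest_probability(pretest_prob: float, likelihood_ratios: Iterable[float]) -> float:
    """Combine a pretest probability with independent likelihood ratios.

    Generic odds-form Bayes update; an illustrative assumption, not code
    taken from LIRICAL.
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")

    # Work in log space so that many small or large LRs do not underflow/overflow.
    log_odds = math.log(pretest_prob / (1.0 - pretest_prob))
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)

    # Convert log odds back to a probability with a numerically stable logistic.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Hypothetical inputs: a 1-in-1000 pretest probability, three phenotype LRs
# (one below 1, i.e. evidence against the diagnosis) and one genotype LR.
phenotype_lrs = [20.0, 5.0, 0.5]
genotype_lr = 100.0
p = posttest_probability(0.001, phenotype_lrs + [genotype_lr])
print(f"posttest probability: {p:.3f}")
```

With these made-up inputs the composite LR is 5000, which lifts a pretest probability of 0.1% to a posttest probability of about 0.83. Keeping the per-feature LRs separate is what makes each observation's contribution inspectable: an LR above 1 supports the candidate diagnosis, and an LR below 1 counts against it.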

Conflict of interest statement

P.N.R. has filed a patent application based on this work.

Figures

**Figure 1**
LIRICAL Evaluation of a Simulated Case of Ataxia-Pancytopenia Syndrome (ATXPC). For each candidate diagnosis with an above-threshold posttest probability, LIRICAL shows the contribution of each phenotypic feature and of the genotype to the final diagnosis. In this case, the data were extracted from a published case report on an individual with ATXPC, and an additional unrelated term (high myopia) was added to simulate the effect of noise. (A) LIRICAL provides a table of the top candidates with the posttest probability and a sparkline view of the contributions of each HPO term and the relevant genotype. (B) The observed HPO terms. (C) The correct diagnosis, ATXPC, is ranked in first place because of a good phenotype match and a positive LR for the heterozygous genotype for the causative gene *SAMD9L*. (D) The second candidate has many of the same phenotype matches, but the first query term, dysmetria, matches exactly with ataxia-pancytopenia syndrome and only approximately with the second candidate, spinocerebellar ataxia, autosomal recessive 7. (E) The third candidate has a posttest probability close to zero because it has more mismatching or poorly matching query terms.

**Figure 2**
Evaluation of LIRICAL and Exomiser on 384 Case Studies. The case studies were formatted as phenopackets (Table 1), and the diagnostic process was simulated by spiking disease-causing variants into a VCF file, which was passed together with phenotype data to LIRICAL and Exomiser. (A) Simulation approach. Random noise terms were added to some simulations, and in some cases, terms were replaced by their parent term or grandparent term to mimic imprecision in measuring or recording phenotypic abnormalities. (B–G) Results of simulations are shown with the x axis showing the rank assigned by LIRICAL or Exomiser to the correct disease gene, and the y axis showing the percentage of cases in which the given rank was achieved. The panels show: original data (B), performance on the subset of 221 autosomal-recessive cases (C), the same 221 autosomal-recessive cases in which one of the two pathogenic alleles was removed (D), two random (“noise”) HPO terms added to each case (E), original terms replaced by a parent term and two noise terms added (F), and original terms replaced by a grandparent term and two noise terms added (G).

**Figure 3**
Posttest Probability. The posttest probability of the correct diagnosis was calculated for each of the 384 phenopacket case reports. Densities are shown for the original data (original; mean posttest probability, pp, 67.4%); noise2∗∗, in which two random HPO terms were added and original terms were replaced by grandparent terms (mean pp, 50.3%); and random, in which all HPO terms were replaced by random terms (mean pp, 2.9%). Figure S7 shows results for other perturbations.

**Figure 4**
Performance of LIRICAL and Exomiser on 116 Solved Singleton Cases from the 100,000 Genomes Project. The x axis shows the rank assigned by LIRICAL or Exomiser to the correct disease gene. The y axis shows the percentage of cases in which the given rank was achieved.

**Figure 5**
LIRICAL Evaluation of Simulated Case with a Pathogenic *FBN1* Variant. (A–E) Eight distinct diseases are associated with variants in *FBN1*. LIRICAL prioritizes each disease separately, and in this case correctly placed Marfan syndrome at rank #1. Three other *FBN1*-associated diseases were placed in ranks #2–#4 (A). Clinical and molecular data were simulated according to individual 1 in Cao et al. The HPO terms are shown in panel (B). The graphic shows LIRICAL’s summary table and three of the detailed LR plots for the candidates at ranks #1 (C), #3 (D), and #5 (E).
