NPJ Genom Med. 2018 Feb 5;3:5.
doi: 10.1038/s41525-018-0044-9. eCollection 2018.

A phenotype centric benchmark of variant prioritisation tools

Denise Anderson et al. NPJ Genom Med. 2018.

Abstract

Next-generation sequencing is a standard tool in clinical diagnostics. In Mendelian diseases, the challenge is to discover the single etiological variant among thousands of benign or functionally unrelated variants. After variants are called from aligned sequencing reads, variant prioritisation tools are used to examine the conservation or potential functional consequences of those variants. We hypothesised that the performance of variant prioritisation tools may vary by disease phenotype. To test this, we created benchmark data sets for variants associated with different disease phenotypes. We found that the performance of the 24 tested tools is highly variable and differs by disease phenotype. The task of identifying a causative variant amongst a large number of benign variants is challenging for all tools, highlighting the need for further development in the field. Based on our observations, we recommend the use of the five top performers found in this study (FATHMM, M-CAP, MetaLR, MetaSVM and VEST3). In addition, we provide tables indicating which analytical approach works best in which disease context. Variant prioritisation tools are best suited to investigating variants associated with well-studied genetic diseases, as these variants are more readily available during algorithm development than variants associated with rare diseases. We anticipate that further development of disease-focussed tools will lead to significant improvements.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Fig. 1
Heatmaps showing auROC (a) and auPRC (b) values for the 4026 HPO ‘Phenotypic abnormality’ terms when using Phenolyzer gene panels with no score threshold. Right-hand plots show the top level ontology (HP:0000118 ‘Phenotypic abnormality’) and broad child terms of ‘Phenotypic abnormality’. Left-hand plots show the remaining HPO terms not plotted in the right-hand plots. Colour coding of columns represents the score type for each variant prioritisation tool, where black = conservation scores, red = ensemble scores, blue = functional prediction scores and yellow = general prediction scores. The heatmap colour scale of the auROC (a) values has been adjusted to highlight moderate to strong performance by only colour coding auROC values greater than or equal to 0.7.
Fig. 2
Boxplots showing the auPRC values across the top performing variant prioritisation tools for selected HPO ‘Phenotypic abnormality’ terms. The vertical red line indicates a strong performance value of 0.8.
Fig. 3
Heatmap showing auPRC for HPO ‘Phenotypic abnormality’ terms where top performing variant prioritisation tools differ by greater than 0.5. Colour coding of rows is by the parent HPO term. Row annotation includes term and [Number of ClinVar pathogenic variants (number of genes returned by Phenolyzer)].
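
The figure captions above report two summary metrics, auROC and auPRC. As a rough illustration of what those numbers measure, the sketch below computes both for a single hypothetical tool on a toy variant set. It is not the authors' pipeline: the function names, the scores and the labels are all invented for this example, auROC is computed from pairwise comparisons (the Mann-Whitney formulation), and auPRC is approximated by average precision, which is one common estimator among several.

```python
def au_roc(scores, labels):
    """auROC as the fraction of (pathogenic, benign) pairs in which the
    pathogenic variant gets the higher score; tied pairs count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Average precision, used here as a step-wise estimate of auPRC.
    Variants with identical scores are treated as one threshold."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one pathogenic variant")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    ap = 0.0
    true_pos = 0
    i = 0
    while i < len(ranked):
        # Collect every variant that shares the current score.
        j = i
        group_pos = 0
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            group_pos += ranked[j][1]
            j += 1
        true_pos += group_pos
        precision = true_pos / j  # j variants are ranked at or above this score
        ap += precision * group_pos / total_pos  # weight by the recall gained
        i = j
    return ap


# Toy data (invented): 1 = pathogenic, 0 = benign; higher score = more damaging.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 0, 1, 0, 0, 0]

print(f"auROC = {au_roc(toy_scores, toy_labels):.3f}")             # auROC = 0.875
print(f"auPRC = {average_precision(toy_scores, toy_labels):.3f}")  # auPRC = 0.833
```

Because precision depends on how many benign variants outrank the pathogenic ones, auPRC tends to be the stricter of the two metrics when pathogenic variants are rare, which is the setting the abstract describes as challenging for all tools.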
