Brief Bioinform. 2015 Mar;16(2):255-64.
doi: 10.1093/bib/bbu008. Epub 2014 Mar 12.

Rich annotation of DNA sequencing variants by leveraging the Ensembl Variant Effect Predictor with plugins

Michael Yourshaw et al. Brief Bioinform. 2015 Mar.

Abstract

High-throughput DNA sequencing has become a mainstay for the discovery of genomic variants that may cause disease or affect phenotype. A next-generation sequencing pipeline typically identifies thousands of variants in each sample. A particular challenge is the annotation of each variant in a way that is useful to downstream consumers of the data, such as clinical sequencing centers or researchers. These users may require that all data storage and analysis remain on secure local servers to protect patient confidentiality or intellectual property, may have unique and changing needs to draw on a variety of annotation data sets and may prefer not to rely on closed-source applications beyond their control. Here we describe scalable methods for using the plugin capability of the Ensembl Variant Effect Predictor to enrich its basic set of variant annotations with additional data on genes, function, conservation, expression, diseases, pathways and protein structure, and describe an extensible framework for easily adding additional custom data sets.

Keywords: DNA sequencing; Ensembl Variant Effect Predictor; annotation; database; plugin.

Figures

Figure 1:
Overview of DNA sequencing and annotation. (a) DNA-sequencing pipeline. Fragmented genomic DNA is sequenced on a next-generation sequencing (NGS) platform and the reads are aligned to a reference genome. Each locus is genotyped, and differences from the reference are written to a VCF file. (b) Rich variant annotation. Multiple data sets are stored on a local database server. Modular plugins integrated with the Ensembl VEP create an output file with rich annotations of each variant.
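
The modular pattern in panel (b) can be illustrated with a minimal sketch. Actual VEP plugins are Perl modules; the Python below is not the VEP plugin API and is not the authors' code. It only shows the general idea of independent plugins that each look up one local data set and contribute fields to a variant's annotation. The table names, plugin classes, field names, and toy values (`gene_region`, `conservation`, `GeneRegionPlugin`, `ConservationPlugin`, `GENE_A`) are all hypothetical, and an in-memory SQLite database stands in for the local database server.

```python
import sqlite3
from typing import Dict, Iterable, Iterator, List, NamedTuple, Protocol, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf(lines: Iterable[str]) -> Iterator[Variant]:
    """Yield one Variant per ALT allele, skipping header and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):
            yield Variant(chrom, int(pos), ref, alt)


class Plugin(Protocol):
    """Anything with an annotate() method can be plugged in."""
    def annotate(self, variant: Variant) -> Dict[str, str]: ...


class GeneRegionPlugin:
    """Hypothetical plugin: reports the gene whose interval contains the variant."""
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def annotate(self, variant: Variant) -> Dict[str, str]:
        row = self.conn.execute(
            "SELECT gene FROM gene_region "
            "WHERE chrom = ? AND start_pos <= ? AND end_pos >= ?",
            (variant.chrom, variant.pos, variant.pos),
        ).fetchone()
        return {"GENE": row[0]} if row else {}


class ConservationPlugin:
    """Hypothetical plugin: reports a per-position conservation score."""
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def annotate(self, variant: Variant) -> Dict[str, str]:
        row = self.conn.execute(
            "SELECT score FROM conservation WHERE chrom = ? AND pos = ?",
            (variant.chrom, variant.pos),
        ).fetchone()
        return {"CONSERVATION": str(row[0])} if row else {}


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding toy (made-up) annotation data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene_region "
        "(chrom TEXT, start_pos INTEGER, end_pos INTEGER, gene TEXT)"
    )
    conn.execute("CREATE TABLE conservation (chrom TEXT, pos INTEGER, score REAL)")
    conn.execute("INSERT INTO gene_region VALUES ('1', 100, 200, 'GENE_A')")
    conn.execute("INSERT INTO conservation VALUES ('1', 150, 0.9)")
    return conn


def annotate_all(
    vcf_lines: Iterable[str], plugins: List[Plugin]
) -> List[Tuple[Variant, Dict[str, str]]]:
    """Run every plugin on every variant and merge the returned fields."""
    results = []
    for variant in parse_vcf(vcf_lines):
        annotations: Dict[str, str] = {}
        for plugin in plugins:
            annotations.update(plugin.annotate(variant))
        results.append((variant, annotations))
    return results


if __name__ == "__main__":
    demo_vcf = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT",
        "1\t150\t.\tA\tG,T",
        "2\t50\t.\tC\tT",
    ]
    db = build_demo_db()
    for variant, fields in annotate_all(
        demo_vcf, [GeneRegionPlugin(db), ConservationPlugin(db)]
    ):
        print(variant, fields)
```

Under this design, adding a new data set means loading one more table and writing one more class with an `annotate` method; the loop in `annotate_all` does not change.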
