Hum Mutat. 2022 Aug;43(8):986-997.
doi: 10.1002/humu.24298. Epub 2021 Dec 2.

Annotating and prioritizing genomic variants using the Ensembl Variant Effect Predictor: A tutorial

Sarah E Hunt et al. Hum Mutat. 2022 Aug.

Abstract

The Ensembl Variant Effect Predictor (VEP) is a freely available, open-source tool for the annotation and filtering of genomic variants. It predicts variant molecular consequences using the Ensembl/GENCODE or RefSeq gene sets. It also reports phenotype associations from databases such as ClinVar, allele frequencies from studies including gnomAD, and predictions of deleteriousness from tools such as Sorting Intolerant From Tolerant and Combined Annotation Dependent Depletion. Ensembl VEP includes filtering options to customize variant prioritization. It is well supported and updated roughly quarterly to incorporate the latest gene, variant, and phenotype association information. Ensembl VEP analysis can be performed using a highly configurable, extensible command-line tool, a Representational State Transfer application programming interface, and a user-friendly web interface. These access methods are designed to suit different levels of bioinformatics experience and meet different needs in terms of data size, visualization, and flexibility. In this tutorial, we will describe performing variant annotation using the Ensembl VEP web tool, which enables sophisticated analysis through a simple interface.
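
The REST access route mentioned above can be illustrated with a minimal Python sketch. It assumes the public Ensembl REST server at https://rest.ensembl.org and its `vep/:species/hgvs/:hgvs_notation` endpoint. The helper names (`vep_hgvs_url`, `fetch_vep`, `summarize`) are hypothetical and not part of the tutorial, and the JSON field names used in `summarize` should be checked against the current Ensembl REST documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server; not stated in the abstract.
ENSEMBL_REST = "https://rest.ensembl.org"


def vep_hgvs_url(hgvs, species="human"):
    """Build the URL for a VEP lookup by HGVS notation (hypothetical helper)."""
    # Percent-encode characters such as '>' while keeping ':' readable.
    encoded = urllib.parse.quote(hgvs, safe=":")
    return f"{ENSEMBL_REST}/vep/{species}/hgvs/{encoded}"


def fetch_vep(hgvs, species="human", timeout=30):
    """Query the REST API and return the decoded JSON (needs network access)."""
    request = urllib.request.Request(
        vep_hgvs_url(hgvs, species),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(records):
    """Return (input, most severe consequence, gene symbols) for each variant.

    Missing fields are tolerated, because the annotations returned depend on
    the options requested and on the variant itself.
    """
    rows = []
    for record in records:
        genes = sorted(
            {
                tc["gene_symbol"]
                for tc in record.get("transcript_consequences", [])
                if "gene_symbol" in tc
            }
        )
        rows.append(
            (record.get("input"), record.get("most_severe_consequence"), genes)
        )
    return rows
```

With network access, `summarize(fetch_vep("ENST00000366667:c.803C>T"))` would be one way to reduce a response to its headline consequence per variant. For large inputs, the command-line tool or the web interface described in the abstract is likely the better fit.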

Keywords: VEP; filtering; variant annotation; variant prioritisation; “molecular consequence”.

Conflict of interest statement

Paul Flicek is a member of the scientific advisory boards of Fabric Genomics, Inc., and Eagle Genomics, Ltd.

Figures

Figure 1. The Ensembl VEP web interface showing species/assembly selection, data input, transcript set selection, and additional groups of configuration options
Figure 2. The “Identifiers” section, which allows the selection of gene, protein, and HGVS identifiers
Figure 3. The “Variants and frequency data” section, which allows the selection of information known about variants at the same location
Figure 4. The “Additional annotations” section, which allows the selection of transcript, protein domain, regulatory region, and phenotype annotations
Figure 5. The “Predictions” section, which allows the selection of different pathogenicity, splicing, and conservation predictions
Figure 6. Filtering and advanced options
Figure 7. The results page with summary statistics and options for filtering and downloading the results table
Figure 8. The results table showing predicted molecular consequences and links to the location and overlapping genes and variant displays within the Ensembl genome browser
Figure 9. Results table for example input VCF file showing predicted molecular consequences and links to the location, gene, and variant tabs within the Ensembl genome browser for overlapping features as well as SIFT and PolyPhen-2 predictions and allele frequencies for continental populations for the 1000 Genomes project. VCF, variant call format
Figure 10. Results table for example input VCF file showing clinical significance, associated PubMed IDs, and associated phenotypes. VCF, variant call format
