Hum Mutat. 2018 Jan;39(1):61-68.
doi: 10.1002/humu.23348. Epub 2017 Oct 17.

VariantValidator: Accurate validation, mapping, and formatting of sequence variation descriptions

Peter J Freeman et al. Hum Mutat. 2018 Jan.

Abstract

The Human Genome Variation Society (HGVS) variant nomenclature is widely used to describe sequence variants in scientific publications, clinical reports, and databases. However, the HGVS recommendations are complex, and this often results in inaccurate variant descriptions being reported. The open-source hgvs Python package (https://github.com/biocommons/hgvs) provides a programmatic interface for parsing, manipulating, formatting, and validating variants according to the HGVS recommendations, but does not provide a user-friendly Web interface. We have developed a Web-based variant validation tool, VariantValidator (https://variantvalidator.org/), which utilizes the hgvs Python package and provides additional functionality to assist users who wish to accurately describe and report sequence-level variations that are compliant with the HGVS recommendations. VariantValidator was designed to guide users through the intricacies of the HGVS nomenclature: for example, if the user makes a mistake, VariantValidator automatically corrects it if it can, or provides helpful guidance if it cannot. In addition, VariantValidator can interconvert genomic variant descriptions between HGVS and Variant Call Format (VCF) with a degree of accuracy that surpasses most competing solutions.
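To make the notion of a syntactic check concrete, the following is a minimal sketch that splits a simple substitution description into its parts using only the Python standard library. It does not use the hgvs package or VariantValidator and is not how either tool works internally; the function name `parse_simple_substitution`, the `SimpleSubstitution` type, and the regular expression are assumptions made for illustration. It covers only single-base substitutions on `c.`, `g.`, or `n.` coordinates and performs no check against a reference sequence.

```python
import re
from typing import NamedTuple, Optional


class SimpleSubstitution(NamedTuple):
    """Parts of a simple substitution description (hypothetical helper type)."""
    accession: str   # versioned reference sequence, e.g. NM_021960.4
    coord_type: str  # 'c', 'g', or 'n'
    position: str    # kept as text so intronic offsets such as 688+403 survive
    ref: str         # reference base
    alt: str         # alternate base


# Illustrative pattern only: the full HGVS grammar is far richer than this.
_PATTERN = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+)"   # accession with version
    r":(?P<coord_type>[cgn])\."            # coordinate type
    r"(?P<position>\d+(?:[+-]\d+)?)"       # position, optional intronic offset
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"    # substitution REF>ALT
)


def parse_simple_substitution(description: str) -> Optional[SimpleSubstitution]:
    """Return the parsed parts, or None if the text does not match the pattern."""
    match = _PATTERN.match(description.strip())
    if match is None:
        return None
    return SimpleSubstitution(**match.groupdict())


if __name__ == "__main__":
    for text in ("NM_021960.4:c.740C>T", "NM_182763.2:c.688+403C>T", "NM_021960.4:c.740C-T"):
        print(text, "->", parse_simple_substitution(text))
```

A description that passes this kind of pattern check can still be wrong, for instance if the stated reference base does not match the reference sequence, which is why validation against the sequence itself is needed.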

Keywords: HGVS variant nomenclature; VCF; reference sequences; sequence variants; sequence variation; validation; variant call format.

Figures

Figure 1
Mapping of variants onto alternative transcripts. Submitted variant descriptions are automatically mapped, via the selected genome build (GRCh38), onto all other transcripts that overlap the same genomic position. In this example, NM_182763.2:c.688+403C>T, which is intronic with respect to MCL1 transcript variant 2 mRNA, is mapped to an exonic variant in MCL1 transcript variant 1 mRNA, NM_021960.4:c.740C>T. The same initial variant description also maps to an exonic variant in MCL1 transcript variant 3 mRNA, NM_001197320.1:c.281C>T.
Figure 2
Variant descriptions at exon/intron boundaries. This illustrates how a three‐base deletion in the COL1A2 gene at the junction of the 3′ end of exon 19 with the adjacent intron might be described in two different ways in the context of the RefSeq transcript reference sequence NM_000089.3. Description A shows that the three deleted bases can be described at position NM_000089.3:c.1033_1035, where the deleted bases are GTT, but Description B shows that the variant can be normalized and described at position NM_000089.3:c.1035_1035+2, where the deleted bases are TGT. The latter description corresponds with the genomic variant description NC_000007.13:g.94039133_94039135delTGT. Formally, intronic variants described in the context of a transcript reference sequence must be accompanied by a genomic reference sequence to allow full verification of the variant. This is illustrated by Description C.
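The equivalence of Descriptions A and B can be sketched with a toy sequence. The example below is a minimal illustration of shifting a deletion to its most 3′ position, not VariantValidator's implementation: the function name `shift_deletion_3prime` is hypothetical, the five central bases GTTGT are taken from the caption (GTT at c.1033_1035 followed by GT in the intron), and the flanking bases are invented so that the shift has somewhere to stop. Positions are 0-based string indices, not real transcript or genomic coordinates.

```python
def shift_deletion_3prime(seq: str, start: int, length: int) -> int:
    """Return the 0-based start of a deletion after shifting it as far 3' as possible.

    Deleting seq[start:start+length] and deleting seq[start+1:start+length+1]
    give the same result exactly when seq[start] == seq[start+length].
    """
    while start + length < len(seq) and seq[start] == seq[start + length]:
        start += 1
    return start


def apply_deletion(seq: str, start: int, length: int) -> str:
    """Return seq with `length` bases removed beginning at 0-based `start`."""
    return seq[:start] + seq[start + length:]


# Toy sequence: "ACC" and "AAG" are invented flanks; "GTTGT" follows the caption.
toy = "ACC" + "GTTGT" + "AAG"

original_start = 3                      # deletion of GTT (Description A)
shifted_start = shift_deletion_3prime(toy, original_start, 3)

print(toy[original_start:original_start + 3])   # bases deleted in Description A
print(toy[shifted_start:shifted_start + 3])     # bases deleted after 3' shifting
print(apply_deletion(toy, original_start, 3) == apply_deletion(toy, shifted_start, 3))
```

In this toy case the deletion moves two bases in the 3′ direction, the deleted bases change from GTT to TGT, and the resulting sequence is identical either way, which mirrors the relationship between Descriptions A and B.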
