Front Genet. 2018 Nov 14;9:537.
doi: 10.3389/fgene.2018.00537. eCollection 2018.

SeqVItA: Sequence Variant Identification and Annotation Platform for Next Generation Sequencing Data

Prashanthi Dharanipragada et al. Front Genet. 2018.

Abstract

The current trend in clinical data analysis is to understand how individuals respond to therapies and drug interactions based on their genetic makeup. This has led to a paradigm shift in healthcare; caring for patients is now 99% information and 1% intervention. The falling cost of next generation sequencing (NGS) technologies has made it possible to take genetic profiling to the clinical setting. This requires not just fast and accurate algorithms for variant detection, but also a knowledge-base for variant annotation and prioritization to facilitate tailored therapeutics based on an individual's genetic profile. Here we show that it is possible to provide fast and easy access to all available information about a variant and its impact on the gene, its protein product, associated pathways and drug-variant interactions by integrating previously reported knowledge from various databases. With this objective, we have developed a pipeline, Sequence Variants Identification and Annotation (SeqVItA), that provides an end-to-end solution for detection, annotation and prioritization of small sequence variants on a single platform. With a parallelized variant detection step and numerous resources incorporated to infer functional impact, clinical relevance and drug-variant associations, SeqVItA will benefit the clinical and research communities alike. Its open-source platform and modular framework allow easy customization of the workflow depending on the data type (single, paired, or pooled samples), variant type (germline and somatic), and variant annotation and prioritization. We compare the performance of SeqVItA on simulated data, and we detect, interpret and analyze somatic variants in real data (24 liver cancer patients). We demonstrate the efficacy of the annotation module in facilitating personalized medicine based on a patient's mutational landscape. SeqVItA is freely available at https://bioinf.iiit.ac.in/seqvita.

Keywords: INDELs; NGS; SNPs; annotation; personalized medicine; platform; sequence variants.

Figures

Figure 1
Workflow of SeqVItA for identification, annotation and prioritization of sequence variants in WGS, WES, or TS data. Het, Heterozygous; Homo, Homozygous; LOH, Loss of heterozygosity; MAF, Minor allele frequency.
Figure 2
Performance of SeqVItA in detecting SNVs in simulated data. F-score values for detecting homozygous (triangles) and heterozygous (squares) SNVs with read lengths of 50 bp (empty symbols) and 100 bp (filled symbols). Minimum coverage threshold = 10 and base quality ≥ 15.
Figure 3
Performance of SeqVItA in detecting INDELs in simulated data. F-score values for predicting (A) homozygous (Homo) and (B) heterozygous (Het) INDELs of various sizes: 1 bp (diamonds), 2 bp (squares), 5 bp (triangles) and 10 bp (circles), for two read lengths, 50 bp (empty symbols) and 100 bp (filled symbols). Minimum coverage threshold = 10 and base quality ≥ 15.
Figure 4
Performance comparison of SeqVItA with BCFtools, VarScan2 and GATK on simulated data at three sequencing depths (20×, 40×, and 60×) in detecting homozygous (Homo) and heterozygous (Het) (A) SNVs, (B,C) insertions (Ins), and (D,E) deletions (Del). Read length = 100 bp and base quality threshold ≥ 15.
Figure 5
Mutational landscape of somatic sequence variants identified in 24 HCC patient samples (intronic and intergenic variants excluded). Each column corresponds to each patient sample and each row represents a gene.
Figure 6
Clustering of HCC patients based on somatically mutated genes.
Figure 7
Interactions between recurrently mutated genes from the STRING database. Pathway enrichment analysis of these mutated genes indicates that the cell cycle (shown in red) and PI3K/AKT (shown in blue) pathways are affected.
Figure 8
(A) Total number of somatic variants called and (B) Pair-wise agreement (0–1 scores) between SNVs and INDELs predicted by SeqVItA, Mutect2, and VarScan2.
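
The F-score values in Figures 2 to 4 and the 0–1 pair-wise agreement scores in Figure 8 can be illustrated with a minimal sketch. The captions do not state the exact definitions used, so this sketch assumes the standard F1 score (harmonic mean of precision and recall) and the Jaccard index as the agreement measure. The function names and the toy variant tuples are hypothetical and are not part of SeqVItA.

```python
from typing import Set, Tuple

# A variant is keyed here as (chromosome, position, ref allele, alt allele).
# This representation is an assumption made for illustration only.
Variant = Tuple[str, int, str, str]


def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F1 score: harmonic mean of precision and recall (assumed definition)."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


def pairwise_agreement(calls_a: Set[Variant], calls_b: Set[Variant]) -> float:
    """Jaccard index of two call sets, in [0, 1] (assumed agreement measure)."""
    union = calls_a | calls_b
    if not union:
        return 1.0  # two empty call sets agree trivially
    return len(calls_a & calls_b) / len(union)


# Toy data: three simulated (truth) variants and three calls from a caller.
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "GA")}
called = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 10, "T", "C")}

tp = len(truth & called)   # variants both simulated and called
fp = len(called - truth)   # called but not simulated
fn = len(truth - called)   # simulated but missed

score = f_score(tp, fp, fn)
agreement = pairwise_agreement(truth, called)
```

With two of three simulated variants recovered and one spurious call, precision and recall are both 2/3, so the F-score is 2/3. The Jaccard agreement between the two sets is 2/4 = 0.5.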
