FEBS Open Bio. 2021 Sep;11(9):2441-2452.
doi: 10.1002/2211-5463.13261. Epub 2021 Aug 11.

JWES: a new pipeline for whole genome/exome sequence data processing, management, and gene-variant discovery, annotation, prediction, and genotyping

Zeeshan Ahmed et al. FEBS Open Bio. 2021 Sep.

Abstract

Whole genome sequencing (WGS) and whole exome sequencing (WES) are the most popular next-generation sequencing (NGS) methodologies and are now widely used to detect rare and common genetic variants of clinical significance. We emphasize that automated sequence data processing, management, and visualization should be an indispensable component of modern WGS and WES data analysis for sequence assembly, variant detection (SNPs, SVs), imputation, and resolution of haplotypes. In this manuscript, we present the Java-based Whole Genome/Exome Sequence Data Processing Pipeline (JWES), a newly developed findable, accessible, interoperable, and reusable (FAIR) bioinformatics-genomics pipeline for efficient variant discovery and interpretation, and for big data modeling and visualization. JWES is a cross-platform, user-friendly product-line application comprising three modules: (a) data processing, (b) storage, and (c) visualization. The data processing module performs a series of tasks for variant calling, the data storage module efficiently manages high-volume gene-variant data, and the data visualization module supports variant data interpretation with Circos graphs. The performance of JWES was tested and validated in-house in different experiments on Microsoft Windows, macOS Big Sur, and UNIX operating systems. JWES is an open-source, freely available pipeline that allows scientists to take full advantage of all available computing resources without requiring extensive computer science knowledge. We have successfully applied JWES to the processing, management, and gene-variant discovery, annotation, prediction, and genotyping of WGS and WES data to analyze a variety of complex disorders. In summary, we report the performance of JWES with reproducible case studies, using open-access and in-house generated, high-quality datasets.
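To make the idea of a data processing module that "performs a series of tasks" concrete, the following is a minimal sketch of an ordered stage plan in which each stage consumes the output of the previous one. It is written in Python for brevity, not in Java, and it is not the JWES implementation. The stage names are taken from the Fig. 1 caption; the names `PROCESSING_STAGES`, `StageRecord`, and `plan_pipeline`, and the file-naming scheme, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Stage names follow the Fig. 1 caption (everything after "input").
# Treating "annotate and predict" as one stage is an assumption.
PROCESSING_STAGES = [
    "QC", "trimming", "alignment", "sort", "mark duplicates",
    "insert size", "sort and index", "create realignment targets",
    "realign indels", "BQSR", "analyze covariates", "apply BQSR",
    "recalibrate", "extract filtered", "compute coverage",
    "annotate and predict",
]


@dataclass
class StageRecord:
    stage: str
    input_name: str   # hypothetical label of the artifact consumed
    output_name: str  # hypothetical label of the artifact produced


def plan_pipeline(sample_id):
    """Build an ordered plan where each stage reads the previous output."""
    records = []
    current = f"{sample_id}.input"
    for index, stage in enumerate(PROCESSING_STAGES, start=1):
        # A real pipeline would invoke an external tool here; this sketch
        # only records which artifact feeds which stage.
        output_name = f"{sample_id}.{index:02d}.{stage.replace(' ', '_')}"
        records.append(StageRecord(stage, current, output_name))
        current = output_name
    return records


if __name__ == "__main__":
    for record in plan_pipeline("sampleA"):
        print(f"{record.stage}: {record.input_name} -> {record.output_name}")
```

Keeping the plan as plain data, separate from execution, is one way such a pipeline could generate per-sample scripts and track output files; whether JWES is organized this way is not stated in the abstract.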

Keywords: bioinformatics application; database; gene; variants; whole exome; whole genome.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1
JWES pipeline for whole genome and exome data processing, modeling, and downstream analysis. The figure shows all the data processing and analysis steps: input, QC, trimming, alignment, sort, mark duplicates, insert size, sort and index, create realignment targets, realign indels, Base Quality Score Recalibration (BQSR), analyze covariates, apply BQSR, recalibrate, extract filtered, compute coverage, annotate and predict.
Fig. 2
JWES pipeline data and workflow. The figure shows the overall roadmap of JWES, which includes input preparation, automatic script generation, output file management, and storage of variant data in the database.
Fig. 3
JWES database design. The figure shows the entity-relationship diagram (ERD) of the JWES database, which includes three tables: WES Info, WES Samples, and WES Variant.
Fig. 4
JWES visualization. The figure presents a Circos graph plotting all the variants for all chromosomes. The internal histogram represents the total number of variants found in the protein‐coding genes, and the external histogram represents variants found in the noncoding genes.
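The three-table design in Fig. 3 and the two histograms in Fig. 4 can be sketched together: store variants in a small relational database, then count them per chromosome, split into coding and noncoding genes. This is a minimal sketch using Python's standard `sqlite3` module, not the JWES schema. Only the three table names come from the caption; every column, the foreign-key relationships, the `gene_type` field, and the toy rows are assumptions made for illustration.

```python
import sqlite3

# Table names mirror Fig. 3; all columns and relationships are hypothetical.
SCHEMA = """
CREATE TABLE wes_info (
    info_id INTEGER PRIMARY KEY,
    project TEXT NOT NULL
);
CREATE TABLE wes_samples (
    sample_id INTEGER PRIMARY KEY,
    info_id INTEGER NOT NULL REFERENCES wes_info(info_id),
    sample_name TEXT NOT NULL
);
CREATE TABLE wes_variant (
    variant_id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES wes_samples(sample_id),
    chromosome TEXT NOT NULL,
    position INTEGER NOT NULL,
    gene_type TEXT NOT NULL CHECK (gene_type IN ('coding', 'noncoding'))
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO wes_info VALUES (1, 'demo')")
    conn.execute("INSERT INTO wes_samples VALUES (1, 1, 'sampleA')")
    toy_variants = [  # invented values, not real variant calls
        (1, "chr1", 100, "coding"),
        (1, "chr1", 200, "coding"),
        (1, "chr1", 300, "noncoding"),
        (1, "chr2", 150, "noncoding"),
    ]
    conn.executemany(
        "INSERT INTO wes_variant (sample_id, chromosome, position, gene_type) "
        "VALUES (?, ?, ?, ?)",
        toy_variants,
    )
    return conn


def variant_counts(conn):
    """Return {chromosome: {'coding': n, 'noncoding': m}} for plotting."""
    counts = {}
    rows = conn.execute(
        "SELECT chromosome, gene_type, COUNT(*) FROM wes_variant "
        "GROUP BY chromosome, gene_type"
    )
    for chromosome, gene_type, n in rows:
        per_chrom = counts.setdefault(chromosome, {"coding": 0, "noncoding": 0})
        per_chrom[gene_type] = n
    return counts


if __name__ == "__main__":
    print(variant_counts(build_demo_db()))
```

The per-chromosome counts returned by `variant_counts` are the kind of values that an inner (coding) and outer (noncoding) histogram track would display; producing the actual Circos input files is outside the scope of this sketch.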
