PLoS One. 2016 Feb 3;11(2):e0147697.
doi: 10.1371/journal.pone.0147697. eCollection 2016.

MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics

Ram Vinay Pandey et al. PLoS One. 2016.

Abstract

Traditional Sanger sequencing and Next-Generation Sequencing (NGS) have both been used to identify disease-causing mutations in human molecular research. Most currently available tools are developed for research and exploratory purposes and often do not provide a complete, efficient, one-stop solution. Because these tools focus mainly on NGS data analysis, they offer no integrated solution for Sanger data, and consequently no one-stop solution for analyzing reads from both sequencing platforms is available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It can analyze reads from multiple patients in a single run to create a list of potential disease-causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms: Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, it supports five read mappers (BWA, TMAP, Bowtie, Bowtie2 and GSNAP) and four variant-calling pipelines (GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2). MutAid is freely available at https://sourceforge.net/projects/mutaid.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. The workflow of MutAid.
The MutAid pipeline can be run with a single command. Sanger sequencing data analysis has one starting point, and the analysis flows from top to bottom, as illustrated by a black arrow. NGS analysis has three starting points: 1) raw reads (red); 2) a high-quality FASTQ file (blue), in which case the first step is skipped; and 3) mapped reads in BAM or SAM format (green), in which case steps 1 and 2 are skipped. * Step is optional.
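The three NGS starting points described in the Fig 1 caption can be illustrated with a minimal sketch. This is not MutAid's actual code: the step names, the dictionary keys and the function `steps_to_run` are hypothetical, chosen only to show how a later starting point skips the leading steps. The caption states only which numbered steps are skipped, so the assumption that step 1 is quality filtering and step 2 is read mapping comes from the order of operations listed in the abstract.

```python
# Hypothetical step names (assumed order: filtering, mapping, calling, annotation).
NGS_STEPS = [
    "quality_filtering",
    "read_mapping",
    "variant_calling",
    "variant_annotation",
]

# Number of leading steps skipped for each NGS starting point (per the Fig 1 caption).
SKIPPED_STEPS = {
    "raw_reads": 0,           # start from raw reads: nothing skipped
    "high_quality_fastq": 1,  # first step skipped
    "mapped_reads": 2,        # BAM/SAM input: steps 1 and 2 skipped
}


def steps_to_run(start_point):
    """Return the steps that remain for a given starting point."""
    if start_point not in SKIPPED_STEPS:
        raise ValueError(f"unknown starting point: {start_point!r}")
    return NGS_STEPS[SKIPPED_STEPS[start_point]:]


if __name__ == "__main__":
    for start in SKIPPED_STEPS:
        print(start, "->", steps_to_run(start))
```

Keeping the skip counts in a single table makes the relation between input type and executed steps explicit, which is the point the figure makes graphically.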
Fig 2. Venn diagrams of SNVs called in MutAid by four variant callers with BWA, Bowtie2 and GSNAP mapping: (A) Freebayes, (B) GATK-HaplotypeCaller, (C) SAMTOOLS and (D) VarScan2.
GATK shows 93.29% overlap between at least two mappers, whereas VarScan2 shows the least overlap of the four variant callers at 78%. SAMTOOLS and Freebayes show 92.23% and 88% agreement, respectively, with at least two mappers.
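An "overlap between at least two mappers" figure of this kind can be computed from per-mapper variant sets, as in the following sketch. It assumes the percentage is taken over the union of all variants called with any mapper; the source does not state the denominator, so this definition, the function name `percent_shared` and the toy variants below are illustrative assumptions, not the authors' method or data.

```python
from collections import Counter


def percent_shared(calls_by_mapper, min_mappers=2):
    """Percentage of distinct variants called with at least `min_mappers` mappers.

    Assumption: the denominator is the union of variants across all mappers.
    """
    counts = Counter()
    for variants in calls_by_mapper.values():
        counts.update(set(variants))  # count each variant once per mapper
    if not counts:
        return 0.0
    shared = sum(1 for n in counts.values() if n >= min_mappers)
    return 100.0 * shared / len(counts)


# Toy data only: variants keyed as (chromosome, position, ref, alt).
toy_calls = {
    "BWA": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")},
    "Bowtie2": {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 75, "T", "C")},
    "GSNAP": {("chr1", 100, "A", "G"), ("chr4", 10, "C", "G")},
}

if __name__ == "__main__":
    print(percent_shared(toy_calls))
```

In the toy data, two of the five distinct variants are found with two or more mappers. The same function applies to the per-caller comparisons by passing caller names as keys instead of mapper names.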
Fig 3. Venn diagrams of INDELs called in MutAid by four variant callers using BWA, Bowtie2 and GSNAP mapping results: (A) Freebayes, (B) GATK-HaplotypeCaller, (C) SAMTOOLS and (D) VarScan2.
GATK shows 90.78% overlap between at least two mappers, and SAMTOOLS shows the least overlap of the four variant callers at 74.34%. VarScan2 and Freebayes show 76.56% and 83.70% agreement, respectively, with at least two mappers.
Fig 4. Venn diagrams of SNVs called by four variant callers on the same mapper: (A) BWA, (B) Bowtie2 and (C) GSNAP.
The results show that 75% to 84% of SNVs are shared by at least two of the four variant callers. With all three mappers, VarScan2 identified 16% to 24% novel SNVs.
Fig 5. Venn diagrams of INDELs called by four variant callers on the same mapper: (A) BWA, (B) Bowtie2 and (C) GSNAP.
Consistent with the SNV results, more than 78% of INDELs are identified by at least two variant callers.
Fig 6. Visualization in IGV of SNVs called by the MutAid pipeline from Illumina and Sanger sequencing data.
MutAid produces BAM files for NGS and Sanger data, which can be loaded into IGV to view and confirm the identified variants. The SNV (T>C), shown in blue, was identified by NGS (top panel) and confirmed by Sanger sequencing (middle panel).
Fig 7. Visualization of the conservation track in the UCSC genome browser for novel variants.
MutAid constructs a direct link to the UCSC genome browser for all variants, including novel variants. Reference nucleotides are displayed at the top, and the conservation track of several species is displayed in the bottom panel (highlighted in green). To confirm novel variants, conservation analysis can be performed for each mutation position. A novel mutation might be ignored if its position is poorly conserved among the species (indicated by the red arrow). A novel mutation may be analyzed further if the position is highly conserved (indicated by the blue arrow).
