Nucleic Acids Res. 2012 Jul;40(Web Server issue):W340-7.
doi: 10.1093/nar/gks561. Epub 2012 Jun 11.

SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments

Pravech Ajawatanawong et al. Nucleic Acids Res. 2012 Jul.

Abstract

Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
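
As an illustration of what "a combination of conservation and entropy" can mean for a single alignment column, here is a minimal Python sketch. It is not SeqFIRE's implementation: the function names, the choice to ignore gap characters, and the toy alignment are all assumptions made for this example. Conservation is taken as the fraction of the most common residue, and entropy as the Shannon entropy (in bits) of the residue frequencies.

```python
import math
from collections import Counter

def column_stats(column):
    """Return (conservation, entropy) for one alignment column.

    Gap characters ('-') are ignored; this is an assumption of the sketch,
    not a documented SeqFIRE rule.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0, 0.0
    counts = Counter(residues)
    total = len(residues)
    # Conservation: share of the single most frequent residue.
    conservation = counts.most_common(1)[0][1] / total
    # Shannon entropy in bits over the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return conservation, entropy

def alignment_stats(seqs):
    """Apply column_stats to every column of equal-length aligned sequences."""
    return [column_stats(col) for col in zip(*seqs)]

# Hypothetical toy alignment (four taxa, five columns).
toy = ["MKV-A", "MKI-A", "MRVGA", "MKL-S"]
stats = alignment_stats(toy)
```

A column with high conservation and low entropy would be a candidate for a conserved block, while a column with low conservation and high entropy would be a candidate fast-evolving site. The thresholds that separate the two are user-adjustable in SeqFIRE and are deliberately left out of this sketch.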

Figures

Figure 1.
Workflow for SeqFIRE, a user-friendly web application for automated identification and extraction of indels and conserved blocks from MSAs. The workflows for the (A) indel and (B) conserved block modules of SeqFIRE are shown on the left and right, respectively. Boxes indicate processes and diamonds indicate suggested parameters for specific steps. Numbers in the upper right-hand corners of boxes indicate different steps in the process as described in the text. For the conserved block module, MNS refers to the minimum non-conserved site threshold and MCS to the minimum conserved site threshold.
Figure 2.
Example output from the SeqFIRE indel module. The module produces five different outputs (A–E). The alignment with indel annotation is visualized in Jalview (A) and text mode (B). The indels list is a numbered sequential list of all indels including the location of the indel in the alignment and the full sequence of the indel region for all taxa (C). The simple indel matrix is a NEXUS-formatted matrix with all simple indels scored as 0 or 1 (absence or presence) for all taxa (D). The indel module also outputs an alignment with all indel regions removed, also in NEXUS format (E). Outputs B–E can be downloaded as a single file or separately using links at the top of the output page.
Figure 3.
Example of SeqFIRE and GBlocks detection of conserved alignment regions under high stringency criteria. A fragment of BAliBASE reference 1 V2 alignment number BB12001 is shown between positions 129 and 187. The gray bars below the alignment indicate the conserved blocks detected by SeqFIRE and the black bars show the conserved blocks detected by GBlocks. The dark background within the alignment indicates conserved amino acids.
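
The Figure 2 caption describes an indel matrix in which each indel is scored 0 or 1 (absence or presence) for every taxon. The following Python sketch shows one way such a matrix could be derived from an alignment. It is an illustration under stated assumptions, not SeqFIRE's algorithm: every contiguous run of gap-containing columns is treated as one indel region, no distinction is made between simple and complex indels, and a taxon is scored 1 when it has at least one residue in the region. The function names and the toy alignment are hypothetical.

```python
def find_indel_regions(seqs):
    """Return half-open (start, end) column ranges where any sequence has a gap.

    Assumption: adjacent gap-containing columns are merged into one region.
    """
    regions, start = [], None
    for i, col in enumerate(zip(*seqs)):
        if "-" in col:
            if start is None:
                start = i          # a new gap-containing run begins here
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:          # run extends to the end of the alignment
        regions.append((start, len(seqs[0])))
    return regions

def indel_matrix(seqs, regions):
    """Score each taxon per region: 0 if the region is all gaps, else 1."""
    return [
        [0 if set(s[a:b]) == {"-"} else 1 for a, b in regions]
        for s in seqs
    ]

# Hypothetical toy alignment (three taxa, eight columns).
toy = ["MK--VAGL", "MKTSVA-L", "MK--VAGL"]
regions = find_indel_regions(toy)
matrix = indel_matrix(toy, regions)

for name, row in zip(["taxon1", "taxon2", "taxon3"], matrix):
    print(name, "".join(str(v) for v in row))
```

Each row of the result corresponds to one taxon and each column to one indel region, which is the general shape of a presence/absence character matrix that could be written out in NEXUS format.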
