BMC Bioinformatics. 2005 Feb 8;6:23.
doi: 10.1186/1471-2105-6-23.

Recent Hits Acquired by BLAST (ReHAB): a tool to identify new hits in sequence similarity searches

Joe Whitney et al. BMC Bioinformatics.

Abstract

Background: Sequence similarity searching is a powerful tool to help develop hypotheses in the quest to assign functional, structural and evolutionary information to DNA and protein sequences. As sequence databases continue to grow exponentially, it becomes increasingly important to repeat searches at frequent intervals, and similarity searches retrieve larger and larger sets of results. New and potentially significant results may be buried in a long list of previously obtained sequence hits from past searches.

Results: ReHAB (Recent Hits Acquired by BLAST) is a tool for finding new protein hits in repeated PSI-BLAST searches. ReHAB compares results from PSI-BLAST searches performed with two versions of a protein sequence database and highlights hits that are present only in the updated database. Results are presented in an easily comprehended table, or in a BLAST-like report, using colors to highlight the new hits. ReHAB is designed to handle large numbers of query sequences, such as whole genomes or sets of genomes. Advanced computer skills are not needed to use ReHAB; the graphical interface is simple to use and was designed with the bench biologist in mind.
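The comparison described above can be illustrated with a minimal sketch. It is not ReHAB's actual implementation; it assumes each search run has already been parsed into a mapping from query ID to the set of target IDs it hit. The function name `find_new_hits` and the example IDs are hypothetical.

```python
from typing import Dict, Set


def find_new_hits(
    old: Dict[str, Set[str]], updated: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Return, per query, the target IDs found only in the updated search.

    `old` and `updated` map a query ID to the set of target IDs hit in the
    search against the older and newer database version (assumed format).
    """
    result: Dict[str, Set[str]] = {}
    for query, targets in updated.items():
        # Set difference keeps only targets absent from the earlier run.
        fresh = targets - old.get(query, set())
        if fresh:
            result[query] = fresh
    return result


# Hypothetical example data: queryA gains one hit, queryB gains none.
old_run = {"queryA": {"t1", "t2"}, "queryB": {"t3"}}
new_run = {"queryA": {"t1", "t2", "t9"}, "queryB": {"t3"}}
print(find_new_hits(old_run, new_run))  # {'queryA': {'t9'}}
```

Queries with no new hits are omitted from the result, which keeps the output short when many query sequences are processed at once.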

Conclusions: This software greatly simplifies the problem of evaluating the output of large numbers of protein database searches.


Figures

Figure 1
Organization of ReHAB processing steps. ReHAB is set up as four main components: the client, the Hits Database, the server and the back end. See text for details.
Figure 2
ReHAB management console. A database is selected from the list on the left, and statistics are displayed on the right. More hits than target sequences are displayed because query sequences can match multiple targets. Double-clicking on the database or selecting an option from the Action menu allows users to browse the selected database.
Figure 3
Hits browser window. The ReHAB database is searched by selecting an organism name, then choosing the desired highlighting and filtering options. Clicking on "Show Summary" opens a new window to display the results.
Figure 4
Query sequences with new hits are highlighted. A user-defined threshold (set in the Browser window) defines the minimum bit-score that is highlighted in red; all new hits with lower scores are highlighted in yellow. The Latest Hit column indicates the date of the most recent hit. Query sequences with no entry in this column have no hits in the database (for example, VARV-Bsh-A33.5L). The sort order of the entries can be changed by clicking on a column heading. Details about the hits can be obtained by right-clicking on the entry or selecting an option in the Action menu.
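The highlighting rule in this caption can be sketched as a small function. This is an illustration under stated assumptions, not ReHAB's code: the name `highlight_color` and its parameters are hypothetical, and the threshold is treated as inclusive because the caption calls it the minimum bit-score shown in red.

```python
from typing import Optional


def highlight_color(bit_score: float, is_new: bool, threshold: float) -> Optional[str]:
    """Pick a highlight color for a hit, following the rule in the caption.

    New hits at or above the threshold are red, new hits below it are
    yellow, and hits that are not new get no highlight (None).
    """
    if not is_new:
        return None
    return "red" if bit_score >= threshold else "yellow"


# Hypothetical scores with a user-chosen threshold of 100 bits.
print(highlight_color(150.0, True, 100.0))   # red
print(highlight_color(60.0, True, 100.0))    # yellow
print(highlight_color(150.0, False, 100.0))  # None
```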
Figure 5
Analysis of hits. Hits can be viewed in two ways. A) HTML output, which lists all hits in order of descending score, followed by a pairwise Needle alignment of the query and target sequence. The Info hyperlink links to the NCBI entry for the target sequence, and the score hyperlink takes the user to the Needle alignment. B) The Hits Manager window, which allows the user to sort hits, view pairwise or multiple alignments, or view selected sequences in FASTA format. A global alignment is shown between the query sequence and the top-scoring new hit.
Figure 6
ReHAB set up for other users. A) Different laboratories in a department could have different query databases, which can be accessed as described in the text. B) The sequences within a lab's database could be annotated with individual lab members' names, or other identifying information, permitting individuals to view results for their own sequences of interest. In this way, large numbers of sequences of interest to a lab can be run simultaneously and frequently, and individuals can then browse the results.


