Comparative Study
BMC Bioinformatics. 2004 Jul 14:5:96.
doi: 10.1186/1471-2105-5-96.

Base-By-Base: single nucleotide-level analysis of whole viral genome alignments

Ryan Brodie et al. BMC Bioinformatics.

Abstract

Background: With ever-increasing numbers of closely related virus genomes being sequenced, it has become desirable to compare two genomes at a level more detailed than gene content, because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools.

Results: A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions, and show whether these changes result in amino acid substitutions. Base-By-Base can connect to our MySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes, or it can use information from text files.

Conclusion: Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
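The nucleotide-level comparison described in the Results can be illustrated with a minimal Python sketch. This is not the Base-By-Base implementation (which is a Java application); the `Gene` class, the `list_differences` function, and the example sequences and coordinates are all hypothetical and chosen only for illustration. The sketch assumes two sequences that are already aligned (equal length, `-` for gaps) and gene coordinates given in alignment columns; it does not translate codons, so it cannot report amino acid substitutions.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical annotation record; coordinates are 1-based alignment columns."""
    name: str
    start: int  # inclusive
    end: int    # inclusive


def list_differences(ref, other, genes):
    """Return one row per differing alignment column.

    Each row is (column, ref_base, other_base, kind, gene_name), where kind is
    'substitution', 'insertion' (gap in ref) or 'deletion' (gap in other).
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    rows = []
    for col, (a, b) in enumerate(zip(ref.upper(), other.upper()), start=1):
        if a == b:
            continue  # identical bases (or a shared gap column)
        if a == "-":
            kind = "insertion"
        elif b == "-":
            kind = "deletion"
        else:
            kind = "substitution"
        # Report the first annotated gene covering this column, if any.
        gene = next((g.name for g in genes if g.start <= col <= g.end), "intergenic")
        rows.append((col, a, b, kind, gene))
    return rows


# Illustrative input only; these are not real viral sequences.
ref_seq = "ATG-CCTAAGG"
other_seq = "ATGACTTA-GG"
annotations = [Gene("geneA", 1, 8)]

for row in list_differences(ref_seq, other_seq, annotations):
    print(row)
```

A tabular report like the one the abstract mentions could be built from these rows by grouping them per gene; distinguishing promoter from coding regions would need additional annotation that this sketch does not model.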

Figures

Figure 1
Use of Base-By-Base (BBB) to identify and correct small misaligned regions within alignments of large virus genomes. (a) Region of the vaccinia virus (strain WR) and cowpox virus (strain Brighton Red) genomes aligned by DIALIGN2. (b) Manually corrected version of the region shown in (a). Gaps and mismatched nucleotides are shown as red and navy blue boxes or bars, respectively, between the 2 sequences.
Figure 2
Section of a gene feature report comparing SARS coronavirus strains BJ03 and BJ04. The Base-By-Base table header describes the listed data. Since the non-structural proteins (NSP) that we have annotated in VOCs are fragments of the ORF1a and ORF1ab polyproteins, the length differences appear as negative numbers.
Figure 3
Scored similarity visual summary of an alignment of coronavirus spike glycoproteins. The PAM250 substitution matrix was used to shade the figure; a grey scale indicates similarity (black, perfect match; white, low similarity), and gaps are shown as dashes.
Figure 4
Results of a fuzzy search. The query sequence was 'TGCACGG', allowing 2 mismatches (user option), in 2 SARS coronavirus genomes. Upper panel: search results window with hits sorted by confidence (percent match) and the location of the hit within the genome. Lower panel: hits displayed as black arrows in the sequence alignment window.
Figure 5
Web interface of a Base-By-Base summary. Display of 11 SARS coronavirus genomes. Pink boxes at the top of the figure represent genes; annotations are derived from the VOCs database. After comparison to the consensus, nucleotide differences are displayed as vertical blue ticks or bars, and insertions and deletions are displayed in green and red, respectively.
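
The fuzzy search described in the Figure 4 caption (a short query, a user-set number of allowed mismatches, hits ranked by percent match) can be sketched in Python as follows. This is a simple sliding-window comparison counting mismatched positions, offered as one plausible reading of the caption, not as the algorithm Base-By-Base actually uses. The function name `fuzzy_search`, the example genome string, and the tie-break by genome position are assumptions.

```python
def fuzzy_search(genome, query, max_mismatches):
    """Find windows of the genome matching the query with few mismatches.

    Returns a list of (position, matched_text, percent_match) tuples, where
    position is 1-based, sorted by percent match (best first) and then by
    position. Only substitutions are counted; gaps are not considered.
    """
    genome = genome.upper()
    query = query.upper()
    m = len(query)
    hits = []
    for start in range(len(genome) - m + 1):
        window = genome[start:start + m]
        mismatches = sum(1 for x, y in zip(window, query) if x != y)
        if mismatches <= max_mismatches:
            percent = 100.0 * (m - mismatches) / m
            hits.append((start + 1, window, percent))
    # Highest percent match first; ties broken by position (an assumption).
    hits.sort(key=lambda h: (-h[2], h[0]))
    return hits


# Illustrative sequence only; this is not a real SARS coronavirus genome.
example_genome = "AATGCACGGTTTGCTCGGAA"
for position, text, percent in fuzzy_search(example_genome, "TGCACGG", 2):
    print(position, text, round(percent, 1))
```

The scan costs time proportional to genome length times query length, which is adequate for short queries; a tool handling many long queries would likely need an indexed approach instead.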

