Nucleic Acids Res. 2008 Jan;36(1):e3.
doi: 10.1093/nar/gkm1106. Epub 2007 Dec 13.

Comprehensive viral oligonucleotide probe design using conserved protein regions

Omar J Jabado et al. Nucleic Acids Res. 2008 Jan.

Abstract

Oligonucleotide microarrays have been applied to microbial surveillance and discovery where highly multiplexed assays are required to address a wide range of genetic targets. Although printing density continues to increase, the design of comprehensive microbial probe sets remains a daunting challenge, particularly in virology where rapid sequence evolution and database expansion confound static solutions. Here, we present a strategy for probe design based on protein sequences that is responsive to the unique problems posed in virus detection and discovery. The method uses the Protein Families database (Pfam) and motif finding algorithms to identify oligonucleotide probes in conserved amino acid regions and untranslated sequences. In silico testing using an experimentally derived thermodynamic model indicated near complete coverage of the viral sequence database.

Figures

Figure 1.
Characteristics of the viral database. (a) Distribution of viral sequence length in the EMBL nucleotide database, July 2007; all sequences ≥5 kb were grouped together. (b) Growth of sequence diversity over time in the HIV-1 filtered database; unique sequences were identified by clustering at the 98% similarity level. (c) Growth of taxonomic classes over time; linear regression was used to project growth from 2006 to 2010.
Figure 2.
Impact of mismatches on fluorescence signal in microarray hybridization. West Nile virus (New York 1999 strain RNA) at 10^6 copies was spiked into 200 ng of human lung (background) RNA. Total nucleic acid was amplified, labeled and hybridized. After normalization of replicate arrays, log2 fluorescence was converted to Z-scores. The 95% confidence intervals of the mean for probes with the same number of mismatches were plotted. The dotted line indicates the maximum number of mismatches yielding an acceptable fluorescence signal.
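The summary statistics described in this caption can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names, the toy intensities, and the normal-approximation interval (1.96 standard errors) are all assumptions, and the array normalization step is omitted.

```python
import math
import statistics
from collections import defaultdict


def to_z_scores(log2_signal):
    """Convert log2 fluorescence values to Z-scores across all probes."""
    mu = statistics.mean(log2_signal)
    sd = statistics.stdev(log2_signal)  # sample standard deviation
    return [(x - mu) / sd for x in log2_signal]


def ci_by_mismatch(mismatches, z_scores, z_crit=1.96):
    """Group Z-scores by mismatch count; return {count: (mean, low, high)}.

    Assumption: normal-approximation interval, mean +/- z_crit * SEM.
    Groups with fewer than two probes are skipped.
    """
    groups = defaultdict(list)
    for m, z in zip(mismatches, z_scores):
        groups[m].append(z)
    summary = {}
    for m, vals in sorted(groups.items()):
        if len(vals) < 2:
            continue
        mean = statistics.mean(vals)
        sem = statistics.stdev(vals) / math.sqrt(len(vals))
        summary[m] = (mean, mean - z_crit * sem, mean + z_crit * sem)
    return summary


# Hypothetical toy data: (mismatch count, raw fluorescence) per probe.
toy = [(0, 40000), (0, 36000), (2, 21000), (2, 18000), (6, 900), (6, 1100)]
mismatches = [m for m, _ in toy]
z = to_z_scores([math.log2(v) for _, v in toy])
for m, (mean, low, high) in ci_by_mismatch(mismatches, z).items():
    print(f"{m} mismatches: mean Z = {mean:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With real data, a mismatch cutoff like the dotted line in the figure could be read off as the largest mismatch count whose interval still sits above a chosen signal level.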
Figure 3.
Comprehensive motif-based probe design. The EMBL viral database is clustered with a threshold of 98% nucleotide identity to create a non-redundant sequence database. Coding sequences are subjected to an amino acid motif search, and then probes are made from the underlying nucleic acid sequences. Similarly, nucleic acid motifs are found in non-coding sequences and used to make probes. Database coverage is checked; supplementary probes for highly divergent sequences are designed as necessary. Acronyms: Pfam, Protein Families database; MEME, Multiple Expectation maximization for Motif Elicitation; UTR, untranslated region; LTR, long terminal repeat.
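The step in which a conserved amino acid motif is mapped back to its underlying nucleic acid sequence can be sketched as below. This is a simplified illustration under stated assumptions: the motif is matched as an exact string (the workflow described here relies on Pfam and MEME, which are not reproduced), the coding sequence is assumed to be in frame, and the function names and toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame DNA coding sequence; unknown codons become 'X'."""
    cds = cds.upper()
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


def motif_to_nucleotides(cds, motif):
    """Return (nucleotide_start, nucleotide_sequence) for each motif hit."""
    protein = translate(cds)
    hits = []
    pos = protein.find(motif)
    while pos != -1:
        start = 3 * pos  # amino acid index -> nucleotide index
        hits.append((start, cds[start:start + 3 * len(motif)]))
        pos = protein.find(motif, pos + 1)
    return hits


cds = "ATGGGTGATGATTGGAAA"  # hypothetical toy CDS encoding M-G-D-D-W-K
print(motif_to_nucleotides(cds, "GDD"))  # prints [(3, 'GGTGATGAT')]
```

Because the probe is cut from the nucleotide sequence of each database entry, sequences that share the protein motif but differ at synonymous codon positions yield distinct probe candidates.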
Figure 4.
Sequence counts for the July 2007 EMBL release (Pfam 22.0). *Only complete genomes of HIV-1 were included in this database.
Figure 5.
Probe distribution for Dengue virus 1. The number of probes targeting each region of the Dengue genome (NC_001477) appears below the protein segment.
Figure 6.
Gibbs free energy model of hybridization signal. The change in Gibbs free energy of probe-West Nile virus hybrids was computed. Aliquots of West Nile virus (New York 1999 strain RNA) at 10^6 copies were spiked into 200 ng of human lung (background) RNA. The fluorescent signal values of replicate arrays were log2 transformed, normalized, and converted to Z-scores. The 95% confidence intervals of the mean fluorescence are plotted against Gibbs free energy. Probe-virus hybrids with free energy ≤–32.5 kJ had high fluorescence; this value was chosen as the threshold for considering a probe likely to generate a strong signal when the target virus is present (dotted line).
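One way such a threshold could feed an in silico coverage check is sketched below. The free energy values are assumed to be precomputed by some thermodynamic model that is not shown; the function names and the toy data are hypothetical, and only the -32.5 cutoff comes from the caption.

```python
DELTA_G_THRESHOLD = -32.5  # cutoff stated in the caption, units as given there


def likely_strong_signal(delta_g, threshold=DELTA_G_THRESHOLD):
    """True if a probe-target hybrid is at or below the free energy cutoff."""
    return delta_g <= threshold


def database_coverage(hybrid_energies, threshold=DELTA_G_THRESHOLD):
    """Fraction of target sequences with at least one probe passing the cutoff.

    hybrid_energies: {sequence_id: [free energy of each probe-target hybrid]}
    """
    if not hybrid_energies:
        return 0.0
    covered = sum(
        any(likely_strong_signal(dg, threshold) for dg in energies)
        for energies in hybrid_energies.values()
    )
    return covered / len(hybrid_energies)


# Hypothetical free energies for three made-up target sequences.
toy = {"seqA": [-40.1, -12.0], "seqB": [-30.0, -31.9], "seqC": [-32.5]}
print(f"coverage: {database_coverage(toy):.2f}")
```

In this toy input, seqB has no hybrid at or below the cutoff, so it is the kind of sequence for which supplementary probes would be designed.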
