BMC Bioinformatics. 2005 Sep 28;6:238.
doi: 10.1186/1471-2105-6-238.

Mathematical design of prokaryotic clone-based microarrays

Bart Pieterse et al. BMC Bioinformatics. 2005.

Abstract

Background: Clone-based microarrays, on which each spot represents a random genomic fragment, are a good alternative to open reading frame-based microarrays, especially for microorganisms whose complete genome sequence is not available. Because generating a genomic DNA library is a random process, it is uncertain beforehand which genes will be represented. Nevertheless, the genome coverage of such an array, which depends on variables such as the insert size and the number of clones in the library, can be predicted mathematically. The classical formulas, which give the probability that a certain sequence is represented in a DNA library at the nucleotide level, imply that a very large number of clones is needed to obtain adequate coverage of the genome.
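To illustrate the nucleotide-level estimate mentioned in the Background, here is a minimal Python sketch of the widely used Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size and N is the number of clones. The abstract does not reproduce the formula, so treating this as the "classical formula" is an assumption. The function names are hypothetical, the 4 Mbp genome size is borrowed from the figure captions, and the 1.5 kbp insert size is chosen purely for illustration.

```python
import math


def prob_represented(n_clones: int, insert_size: float, genome_size: float) -> float:
    """Probability that a given sequence is present in a library of n_clones.

    Classical Clarke-Carbon form: P = 1 - (1 - f)^N, with f = insert_size / genome_size.
    """
    f = insert_size / genome_size
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p: float, insert_size: float, genome_size: float) -> int:
    """Smallest N for which the representation probability reaches p.

    Rearranged form: N = ln(1 - P) / ln(1 - f), rounded up to a whole clone.
    """
    f = insert_size / genome_size
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


# Illustrative values only: 4 Mbp genome, 1.5 kbp inserts (assumed).
GENOME_SIZE = 4_000_000
INSERT_SIZE = 1_500

for target in (0.90, 0.99):
    n = clones_needed(target, INSERT_SIZE, GENOME_SIZE)
    print(f"P = {target:.2f} requires about {n} clones")
```

Under these assumed values, the number of clones needed for 99% representation runs into the thousands, which is the kind of library size that motivates a gene-level criterion.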

Results: This paper describes the development of two complementary equations for determining genome coverage at the gene level. The first equation predicts the fraction of genes that are detectably represented on the array and that cover at least a set part (the minimal insert coverage) of the genomic fragment by which they are represented. The higher this minimal insert coverage, the greater the chance that changes in the expression of a specific gene can be detected and attributed to that gene. The second equation predicts the fraction of genes represented in spots that contain genes from only a single transcription unit, so that the information from those spots can be interpreted quantitatively.
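The paper's two equations are not given in this abstract, so the following Python sketch does not implement them. Instead, it is a toy Monte Carlo simulation of the minimal insert coverage idea under strong simplifying assumptions: a linear genome tiled end to end with equal-length genes, fixed-size inserts with uniformly random start positions, and a gene counted as represented when its overlap with an insert is at least the chosen fraction of that insert's length. The function name, parameter names, and all numeric values are hypothetical.

```python
import random


def simulate_mic_fraction(genome_size: int, gene_len: int, insert_size: int,
                          n_clones: int, mic: float, seed: int = 0) -> float:
    """Estimate the fraction of genes that cover at least `mic` of some insert.

    Toy model (assumed, not the paper's equation): genes are contiguous
    blocks of gene_len bp; each insert starts at a uniformly random position.
    """
    rng = random.Random(seed)  # fixed seed, so the same inserts are drawn for every mic
    n_genes = genome_size // gene_len
    represented = set()
    for _ in range(n_clones):
        start = rng.randrange(0, genome_size - insert_size + 1)
        end = start + insert_size
        first = start // gene_len
        last = min((end - 1) // gene_len, n_genes - 1)
        for g in range(first, last + 1):
            # Length of the stretch shared by gene g and this insert.
            overlap = min(end, (g + 1) * gene_len) - max(start, g * gene_len)
            if overlap >= mic * insert_size:
                represented.add(g)
    return len(represented) / n_genes


# Illustrative values only: 4 Mbp genome, 1 kbp genes, 1.5 kbp inserts, 5000 clones.
for mic in (0.25, 0.50, 0.75):
    frac = simulate_mic_fraction(4_000_000, 1_000, 1_500, 5_000, mic)
    print(f"MIC = {mic:.2f}: fraction of genes represented = {frac:.3f}")
```

Because the seed is fixed, the same inserts are evaluated for each threshold, so the estimated fraction can only stay the same or fall as the minimal insert coverage rises. In this toy setup a 1 kbp gene can never cover 75% of a 1.5 kbp insert, which shows how insert size constrains the attainable coverage.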

Conclusion: Validation of these equations shows that they form reliable tools supporting optimal design of prokaryotic clone-based microarrays.

Figures

Figure 1
Schematic representation of the criteria applied to determine whether gene-specific information is generated by a specific insert. The upper line represents a genome fragment in which the block arrows represent genes. Arrows with a gray fill belong to the same transcription unit. The thinner lines represent possible locations of the inserts. The dashed lines represent inserts for which no gene-specific information can be generated, since they contain genomic material that may belong to another transcription unit.
Figure 2
Histogram representations of the residuals from the validation of the MIC-equation (A) and the GSI-equation (B).
Figure 3
Contour plots of the predicted fractions of represented genes with a minimal insert coverage of 25% (A), 50% (B), or 75% (C) as a function of the number of clones (N) and the insert size (IS) for a prokaryote with a genome size of 4 Mbp. Each contour line is labeled with the predicted fraction it represents.
Figure 4
Contour plot of the predicted fraction of represented genes for which gene-specific information could be generated, as a function of the number of clones (N) and the insert size (IS), for a prokaryote with a genome size of 4 Mbp, an average number of genes per transcription unit (R) of 1.8, and a minimal overlap between the insert and the gene of 100 bp. Each contour line is labeled with the predicted fraction it represents.
