Comparing whole genomes using DNA microarrays

David Gresham¹, Maitreya J Dunham, David Botstein

Affiliation

¹ Lewis-Sigler Institute for Integrative Genomics, Department of Molecular Biology, Carl Icahn Laboratory, Princeton University, Princeton, New Jersey 08544, USA. dgresham@genomics.princeton.edu

PMID: 18347592
PMCID: PMC7097741
DOI: 10.1038/nrg2335

Review

Nat Rev Genet. 2008 Apr;9(4):291-302.

Abstract

The rapid accumulation of complete genomic sequences offers the opportunity to carry out an analysis of inter- and intra-individual genome variation within a species on a routine basis. Sequencing whole genomes requires resources that are currently beyond those of a single laboratory and therefore it is not a practical approach for resequencing hundreds of individual genomes. DNA microarrays present an alternative way to study differences between closely related genomes. Advances in microarray-based approaches have enabled the main forms of genomic variation (amplifications, deletions, insertions, rearrangements and base-pair changes) to be detected using techniques that are readily performed in individual laboratories using simple experimental approaches.

Figures

**Figure 1. Identifying copy number variation in genomes using array comparative genome hybridization.**
a | Copy number variation in the human genome. Whole-genome microarrays enable copy number variation to be compared across the human genome. The log₂ ratios of the test to reference signal for the 22 autosomes and 2 sex chromosomes of the human genome are shown, chromosome by chromosome. The data are from a comparison between two male genomes hybridized to a BAC microarray, using a two-colour approach. b | Structural variation in the yeast genome that was identified using microarrays. In this example, the sixteen chromosomes (I–XVI) of *Saccharomyces cerevisiae* are shown. Blue circles represent the centromeres. The data from a PCR microarray containing ∼6,000 probes are smoothed over 5 adjacent probes. Black lines above and below each chromosome indicate a twofold change in copy number. This clonal isolate is a product of the experimental evolution of a diploid strain growing under glucose-limiting conditions. When this strain is compared with its ancestral strain using a two-colour microarray, it shows clear evidence of an amplification of the left arm of chromosome VII (red), resulting in a 3:2 ratio of DNA (log₂ ratio = 0.58), and loss of the right arm of chromosome XV (green), resulting in a 1:2 ratio of DNA (log₂ ratio = −1). Image for part a is reproduced courtesy of M. Hurles, Sanger Institute, UK. Image for part b uses data originally published in Ref. and is modified, with permission, from *Nature* Ref. © (2006) Macmillan Publishers Ltd.
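
The log₂ ratios quoted in the caption follow directly from the copy-number ratios, and the smoothing can be illustrated with a simple moving average. The sketch below is only an illustration: the caption does not say which smoothing method was used, so the centred 5-probe mean and the function names `log2_ratio` and `smooth` are assumptions, not the authors' procedure.

```python
import math


def log2_ratio(test_copies, reference_copies):
    """Log2 of the test-to-reference DNA ratio for a region."""
    return math.log2(test_copies / reference_copies)


def smooth(values, window=5):
    """Centred moving average over `window` adjacent probes.

    Assumption: a plain mean is used, and the window shrinks at
    the chromosome ends instead of padding.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# 3:2 amplification in a diploid (chromosome VII, left arm)
print(round(log2_ratio(3, 2), 2))  # 0.58
# 1:2 loss in a diploid (chromosome XV, right arm)
print(round(log2_ratio(1, 2), 2))  # -1.0
```

Averaging over neighbouring probes trades resolution for noise reduction, which is why a single outlying probe has little effect on the smoothed profile.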

**Figure 2. Detecting SNP variation using microarrays.**
a | Resequencing microarrays are designed with short oligonucleotides in which every possible variant is represented at the central position of a probe (shown in coloured font). At least four probes are used to interrogate each nucleotide position (as shown here for two adjacent positions), but often eight or more are used to include both strands and other small insertions and deletions. The probe sequence that is exactly complementary to the sample will result in the greatest hybridization efficiency (indicated by a green letter) and thus a comparison among all probes can be used to determine the nucleotide sequence of the sample. The coloured boxes indicate the relative intensity of hybridization at each probe, with yellow being the highest intensity. b | In the absence of resequencing arrays, hybridization of the sample to candidate sequence probes can be used. Mismatches resulting from mutations in sample DNA will result in a lower hybridization efficiency compared with hybridization to a sample with complete sequence complementarity. This approach has the advantage of requiring far fewer probes and is often sufficient to detect sequence variation. If a mismatch is inferred then small-scale sequencing is necessary to identify the variant nucleotide. c | The effect of a SNP on hybridization is related to its corresponding position in a probe. More central positions result in the greatest decrease, whereas SNPs positioned at the end of probes are much less likely to result in a significant decrease in hybridization. d | It is possible to use hybridization data obtained from a mutation detection array to compute a likelihood that a particular site is a sequence variant with respect to the reference genome. This approach facilitates the comparison of related individuals at the sequence level, allowing rapid scanning of the genome. The diagram illustrates the analysis of a drug-resistant mutant in the budding yeast, *Saccharomyces cerevisiae*. Candidate SNPs are identified by a positive log likelihood value. In this case, a small number of candidate SNPs are detected throughout the genome; one is shown here on chromosome V, which is representative of the entire 13 Mb of the yeast genome. The peak highlighted in the inset is shown at higher resolution in the main figure. A single signal in the *CAN1* gene, which is known to confer resistance to the drug canavanine, was identified and subsequently verified using Sanger sequencing. *AVT2*, amino acid vacuolar transport 2; *NPR2*, nitrogen permease regulator 2. Images for parts c and d use data originally published in Ref. .
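
The probe comparison described in part a can be sketched as a simple "brightest probe wins" rule. This is a deliberately minimal illustration, not the base-calling model used in practice: the function names, the intensity values, and the convention that each probe is keyed by the sample base it interrogates are all assumptions made for the example.

```python
def call_base(intensities):
    """Return the base whose probe shows the highest hybridization intensity.

    `intensities` maps each candidate base (A, C, G, T) to the signal of
    the probe carrying the matching variant at its central position.
    """
    return max(intensities, key=intensities.get)


def call_sequence(probe_sets):
    """Call one base per interrogated position, in order."""
    return "".join(call_base(position) for position in probe_sets)


# Hypothetical intensities for two adjacent positions, four probes each.
probe_sets = [
    {"A": 120, "C": 900, "G": 110, "T": 95},
    {"A": 100, "C": 80, "G": 130, "T": 1000},
]
print(call_sequence(probe_sets))  # CT
```

A real analysis would also weigh the margin between the best and second-best probe and combine evidence from both strands before accepting a call.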

**Figure 3. Genome-wide mapping of loci by selective enrichment and detection using microarrays.**
One way of mapping insertion sequence variation is to isolate the insertion element and its immediately neighbouring DNA. Specific regions of the genome are isolated using either a capture probe method (as illustrated in the diagram) or a PCR-based method. This approach is suited to mapping the location in the genome of mobile genetic elements, which are notoriously difficult to characterize using whole-genome sequencing approaches. As shown in the figure, DNA is fragmented in the first step (a). Two separate reactions selectively enrich for the 5′ and 3′ ends of the insertion sequence using sequence-specific capture probes (b). The 5′-enriched and the 3′-enriched fractions are labelled with different fluorophores (Cy5 and Cy3, respectively) and then hybridized to a microarray using a two-colour protocol (c). Finally, insertion sites are mapped on the basis of a transition (indicated by an arrow in d) from positive to negative log₂ ratio, corresponding to sample that is enriched for DNA adjacent to the 5′ end of the insertion sequence and sample that is enriched for DNA adjacent to the 3′ end of the insertion sequence, respectively. The site of transition corresponds to the genomic location of the insertion sequence. Each red and blue bar represents a microarray probe in a contiguous region of the genome spanning ∼10 kb. The distance between each probe is ∼200 bp and therefore, in this case, the insertion site is detected between two probes corresponding to a mapping resolution of ∼200 bp.
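
The mapping step in panel d amounts to finding where the log₂ ratio switches from positive to negative between neighbouring probes. The following sketch assumes probes sorted by genomic coordinate and noise-free ratios; the function name, coordinates, and ratio values are hypothetical, and real data would likely need smoothing or a threshold before this rule is applied.

```python
def find_insertion_intervals(positions, log2_ratios):
    """Return (left, right) probe coordinates flanking each sign transition.

    A transition from a positive log2 ratio (5'-enriched sample) to a
    negative one (3'-enriched sample) marks a candidate insertion site
    lying between the two probes.
    """
    intervals = []
    for i in range(len(log2_ratios) - 1):
        if log2_ratios[i] > 0 and log2_ratios[i + 1] < 0:
            intervals.append((positions[i], positions[i + 1]))
    return intervals


# Hypothetical probes spaced 200 bp apart.
positions = [0, 200, 400, 600, 800]
log2_ratios = [1.2, 0.9, 1.1, -0.8, -1.0]
print(find_insertion_intervals(positions, log2_ratios))  # [(400, 600)]
```

The width of the returned interval equals the probe spacing, which is why the caption gives the mapping resolution as about 200 bp.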
