BMC Genomics. 2007 Sep 25;8:339.
doi: 10.1186/1471-2164-8-339.

Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes sequenced on a human-specific MitoChip

Sarah M C Flynn et al. BMC Genomics. 2007.

Abstract

Background: Iterative DNA "resequencing" on oligonucleotide microarrays offers a high-throughput method to measure intraspecific biodiversity, one that is especially suited to SNP-dense gene regions such as vertebrate mitochondrial (mtDNA) genomes. However, costs of single-species design and microarray fabrication are prohibitive. A cost-effective, multi-species strategy is to hybridize experimental DNAs from diverse species to a common microarray that is tiled with oligonucleotide sets from multiple, homologous reference genomes. Such a strategy requires that cross-hybridization between the experimental DNAs and reference oligos from the different species not interfere with the accurate recovery of species-specific data. To determine the pattern and limits of such interspecific hybridization, we compared the efficiency of sequence recovery and accuracy of SNP identification by a 15,452-base human-specific microarray challenged with human, chimpanzee, gorilla, and codfish mtDNA genomes.

Results: In the human genome, 99.67% of the sequence was recovered with 100.0% accuracy. Accuracy of SNP identification declines log-linearly with sequence divergence from the reference, from 0.067 to 0.247 errors per SNP in the chimpanzee and gorilla genomes, respectively. Efficiency of sequence recovery declines as the number of interspecific SNPs in the 25b interval tiled by the reference oligonucleotides increases. In the gorilla genome, which differs from the human reference by 10%, and in which 46% of these 25b regions contain 3 or more SNP differences from the reference, only 88% of the sequence is recoverable. In the codfish genome, which differs from the reference by > 30%, less than 4% of the sequence is recoverable, in short islands ≥ 12b that are conserved between primates and fish.

Conclusion: Experimental DNAs bind inefficiently to homologous reference oligonucleotide sets on a re-sequencing microarray when their sequences differ by more than a few percent. The data suggest that interspecific cross-hybridization will not interfere with the accurate recovery of species-specific data from multispecies microarrays, provided that the species' DNA sequences differ by > 20% (mean of 5b differences per 25b oligo). Recovery of DNA sequence data from multiple, distantly-related species on a single multiplex gene chip should be a practical, highly-parallel method for investigating genomic biodiversity.
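
The per-oligo reasoning in the abstract (the share of 25b regions with 3 or more differences, and the equivalence of 20% divergence with a mean of 5 differences per 25b oligo) can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. It assumes two pre-aligned, equal-length sequences and treats every run of 25 consecutive bases as one probe window; the function names are hypothetical.

```python
def mismatches_per_probe(reference: str, sample: str, probe_len: int = 25) -> list[int]:
    """Count reference-vs-sample differences in every window of probe_len bases.

    Assumption: the two sequences are already aligned and have equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    diffs = [int(r != s) for r, s in zip(reference, sample)]
    if len(diffs) < probe_len:
        return []
    # Running sum: add the base entering the window, drop the one leaving it.
    count = sum(diffs[:probe_len])
    counts = [count]
    for i in range(probe_len, len(diffs)):
        count += diffs[i] - diffs[i - probe_len]
        counts.append(count)
    return counts


def fraction_with_at_least(counts: list[int], k: int) -> float:
    """Fraction of probe windows that contain k or more differences."""
    return sum(c >= k for c in counts) / len(counts) if counts else 0.0


def expected_mismatches(divergence: float, probe_len: int = 25) -> float:
    """Mean differences per probe if divergence were uniform along the sequence."""
    return divergence * probe_len
```

Under this uniformity assumption, `expected_mismatches(0.20)` gives the 5 differences per 25b oligo quoted in the conclusion, and `fraction_with_at_least(counts, 3)` corresponds to the kind of statistic reported for the gorilla genome.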

Figures

Figure 1
SNP density per 100 bps between the tiled human mtDNA sequence and the chimpanzee and gorilla mtDNA genomes, as identified by dideoxy DNA sequencing. SNP densities were calculated in a sliding window starting at Position 51 of the tiled sequence.
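
A sliding-window density of this kind could be computed roughly as follows. This is a sketch under stated assumptions, not the authors' code: positions are 1-based, the window is 100 bases wide and advances one base at a time, and each window is labeled by `start + window // 2`, so the first window (bases 1 to 100) is reported at position 51. The function name `snp_density` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def snp_density(snp_positions, seq_len, window=100, step=1):
    """Return (label_position, snp_count) for each sliding window.

    snp_positions: 1-based positions at which the sample differs from the reference.
    seq_len: length of the tiled sequence.
    """
    positions = sorted(snp_positions)
    densities = []
    for start in range(1, seq_len - window + 2, step):
        end = start + window - 1
        # Number of SNPs with start <= position <= end.
        n = bisect_right(positions, end) - bisect_left(positions, start)
        densities.append((start + window // 2, n))
    return densities
```

For the 15,452-base tiled sequence described in the abstract, a call such as `snp_density(snps, 15452)` would return one value per window, which could then be plotted against position.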
Figure 2
Number of errors at various dS/N cutoffs. The number of errors is the number of incorrect SNP identifications in chimpanzee (diamonds) and gorilla (squares).
Figure 3
SNP density versus mismatch density per 25 bps in chimpanzee and gorilla mtDNA genomes. Bubbles are proportional to the number of events at each point.
Figure 4
Experimental DNA binding of human and Atlantic Cod (Gadus morhua) mtDNA hybridized to a human-mtDNA-specific resequencing microarray.
Figure 5
Phylogenetic relationships within and among Gorilla, Pan, and Homo, based on mitochondrial DNA genome sequences (without D-loops). The single minimum-length tree had a length of 2828. All nodes are supported in 100% of 10,000 bootstrap replications. Sequences marked (") are from the present paper. The unmarked sequences are from GenBank (Gorilla [NC_001645], Pan troglodytes [NC_001643], P. paniscus [NC_001644], and Homo [revised Cambridge Reference Sequence (rCRS): J01415.1]). The Homo sequence marked (') is from the individual (GenBank AF347008) identified in ref (12) as most divergent from the rCRS.
Figure 6
High-confidence error rate (E: squares) and SNP detection rate (circles) versus pairwise sequence divergence (D) for human, chimpanzee, and gorilla mtDNA genomes. The equation of the trend line is log(E) = (19.6)(D) – 3.9.
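
The trend line can be evaluated directly, as in the small sketch below. The caption does not state the base of the logarithm or the units of D, so the code assumes base 10 by default and D expressed as a fraction (for example 0.10 for 10%); both are assumptions, and the base is left as a parameter. The function name is hypothetical.

```python
import math

SLOPE = 19.6      # coefficient of D in the caption's trend line
INTERCEPT = -3.9  # constant term in the caption's trend line


def predicted_error_rate(divergence: float, base: float = 10.0) -> float:
    """Invert log(E) = 19.6 * D - 3.9 to obtain E.

    Assumptions: divergence is a fraction (0.10 means 10%), and the
    logarithm is base 10 unless another base (e.g. math.e) is passed.
    """
    return base ** (SLOPE * divergence + INTERCEPT)
```

Whichever base applies, the predicted high-confidence error rate rises exponentially with divergence, which is the log-linear relationship described in the Results.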
