CGHScan: finding variable regions using high-density microarray comparative genomic hybridization data
- PMID: 16638145
- PMCID: PMC1464128
- DOI: 10.1186/1471-2164-7-91
Abstract
Background: Comparative genomic hybridization can rapidly identify chromosomal regions that vary between organisms and tissues. The technique has been used to detect differences between normal and cancerous tissues in eukaryotes, as well as genomic variability among microbial strains and species. The density of oligonucleotide probes available on current microarray platforms is particularly well suited to comparisons of organisms with smaller genomes, such as bacteria and yeast, where an entire genome can be assayed on a single microarray at high resolution. Available methods for analyzing these experiments typically confine the analysis to data from pre-defined annotated genome features, such as entire genes. Many of these methods are ill-suited to datasets with the number of measurements typical of high-density microarrays.
Results: We present an algorithm for analyzing microarray hybridization data to aid identification of regions that vary between an unsequenced genome and a sequenced reference genome. The program, CGHScan, uses an iterative random-walk approach that integrates multi-layered significance testing to detect these regions from comparative genomic hybridization data. The algorithm tolerates a high level of noise in measurements of individual probe intensities and is relatively insensitive to the choice of method for normalizing probe intensity values and for identifying probes that differ between samples. When applied to comparative genomic hybridization data from a published experiment, CGHScan identified eight of nine known deletions in a Brucella ovis strain relative to Brucella melitensis. The same result was obtained using two different normalization methods and two different scores to classify data for individual probes as representing conserved or variable genomic regions. The undetected region is a small (58 base pair) deletion that is below the resolution of CGHScan given the array design employed in the study.
Conclusion: CGHScan is an effective tool for analyzing comparative genomic hybridization data from high-density microarrays. The algorithm accurately identifies known variable regions and tolerates high noise and varying methods of data preprocessing. Each variable region is defined by statistical analysis, which provides a robust and reliable method for rapid identification of genomic differences independent of annotated gene boundaries.
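The abstract does not spell out the scan procedure, so the following is only a minimal sketch of the general idea, not CGHScan's implementation. It assumes each probe has already been classified as variable (+1) or conserved (-1), treats the running sum of those scores along the genome as a random walk, takes the largest rise in that walk as a candidate region (a standard maximal-sum segment search), and accepts it only if a permutation test calls it significant. The function names, the ±1 scoring, the masking step, and the `alpha` and `n_perm` parameters are all assumptions made for illustration.

```python
import random

def best_segment(scores):
    """Return (sum, start, end_exclusive) of the maximal-sum contiguous run.

    This is the largest rise of the cumulative-sum walk over the scores.
    """
    best = (0, 0, 0)
    running, start = 0, 0
    for i, s in enumerate(scores):
        if running <= 0:          # walk fell to a new low: restart here
            running, start = 0, i
        running += s
        if running > best[0]:
            best = (running, start, i + 1)
    return best

def segment_p_value(scores, observed, n_perm, rng):
    """Permutation test: how often does shuffled data score this high?"""
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_segment(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def find_variable_regions(scores, alpha=0.05, n_perm=200, seed=0):
    """Iteratively report significant regions as (start, end_exclusive, p)."""
    rng = random.Random(seed)
    work = list(scores)
    regions = []
    while True:
        total, start, end = best_segment(work)
        if total <= 0:
            break
        p = segment_p_value(work, total, n_perm, rng)
        if p > alpha:
            break
        regions.append((start, end, p))
        for i in range(start, end):   # mask the region before rescanning
            work[i] = -1
    return sorted(regions)

def demo_scores():
    """Hypothetical data: 200 probes, a deletion over probes 80..109.

    Every 17th probe is a false 'variable' call and probe 95 is a false
    'conserved' call inside the deletion, to mimic probe-level noise.
    """
    scores = [1 if i % 17 == 0 else -1 for i in range(200)]
    for i in range(80, 110):
        scores[i] = 1
    scores[95] = -1
    return scores

regions = find_variable_regions(demo_scores())
```

On this toy input the scan reports a single region spanning probes 80 to 109 despite the isolated miscalled probes, which illustrates the noise tolerance the abstract describes. Converting probe indices to genome coordinates depends on the array design and is left out.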