Bioinformatics. 2010 Feb 15;26(4):565-7.
doi: 10.1093/bioinformatics/btp693. Epub 2009 Dec 18.

Copy number variant detection in inbred strains from short read sequence data

Jared T Simpson et al. Bioinformatics.

Abstract

We have developed an algorithm to detect copy number variants (CNVs) in homozygous organisms, such as inbred laboratory strains of mice, from short read sequence data. Our novel approach exploits the fact that inbred mice are homozygous at virtually every position in the genome to detect CNVs using a hidden Markov model (HMM). This HMM uses both the density of sequence reads mapped to the genome, and the rate of apparent heterozygous single nucleotide polymorphisms, to determine genomic copy number. We tested our algorithm on short read sequence data generated from re-sequencing chromosome 17 of the mouse strains A/J and CAST/EiJ with the Illumina platform. In total, we identified 118 copy number variants (43 for A/J and 75 for CAST/EiJ). We investigated the performance of our algorithm through comparison to CNVs previously identified by array-comparative genomic hybridization (array CGH). We performed quantitative-PCR validation on a subset of the calls that differed from the array CGH data sets.
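The abstract does not give the model's states, emission distributions or parameters, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes three hypothetical states (loss, normal, gain), fixed-size genomic windows, independent Poisson emissions for read depth and for apparent heterozygous SNP counts, and a single "stay" probability for transitions. All names and numeric values here are illustrative assumptions. In a homozygous genome, apparent heterozygous SNPs are expected mainly where reads from diverged duplicate copies pile up on one reference locus, which is why the gain state is given a higher SNP rate.

```python
import math

# Hypothetical states: (name, depth multiplier, expected apparent-het SNPs per window).
# These values are illustrative assumptions, not parameters from the paper.
STATES = (
    ("loss", 0.1, 0.05),
    ("normal", 1.0, 0.05),
    ("gain", 2.0, 2.0),
)


def log_poisson(k, lam):
    """Log of the Poisson probability mass function at count k with mean lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def call_copy_number(depths, het_counts, mean_depth, stay=0.99):
    """Viterbi decoding of the most likely state path over genomic windows."""
    n = len(STATES)
    switch = (1.0 - stay) / (n - 1)
    log_trans = [
        [math.log(stay if i == j else switch) for j in range(n)] for i in range(n)
    ]

    def emit(s, depth, het):
        # Joint log-likelihood of the two signals, assumed independent given the state.
        _, mult, het_rate = STATES[s]
        return log_poisson(depth, mult * mean_depth) + log_poisson(het, het_rate)

    # Uniform prior over states for the first window.
    score = [math.log(1.0 / n) + emit(s, depths[0], het_counts[0]) for s in range(n)]
    backpointers = []
    for depth, het in zip(depths[1:], het_counts[1:]):
        prev, score, ptr = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log_trans[i][j])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][j] + emit(j, depth, het))
        backpointers.append(ptr)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [STATES[s][0] for s in path]


# Made-up per-window read counts and apparent heterozygous SNP counts.
depths = [30, 31, 29, 60, 62, 58, 61, 30, 31, 1, 0, 2, 30, 29]
hets = [0, 0, 0, 3, 2, 4, 3, 0, 0, 0, 0, 0, 0, 0]
calls = call_copy_number(depths, hets, mean_depth=30)
print(calls)
```

Combining the two signals means a window is called as a gain more readily when elevated depth coincides with a cluster of apparent heterozygous SNPs than when either signal appears alone. A real caller would also need to estimate the parameters from data and account for mappability and GC bias, which this sketch ignores.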

Figures

Fig. 1.
(A) Plot of sequencing depth across a one megabase region of A/J chromosome 17 clearly shows both a region of 3-fold increased copy number (30.6–31.1 Mb) and a region of decreased copy number (at 31.3 Mb). The solid black line above the depth plot indicates the called copy number gain and the solid black line below the plot indicates the called copy number loss. (B) Plot of the heterozygous SNP rate for the same region showing the high number of apparent heterozygous SNPs associated with the copy number gain.

