Nat Genet. 2008 Jul;40(7):909-14.
doi: 10.1038/ng.172. Epub 2008 May 22.

Mouse segmental duplication and copy number variation

Xinwei She et al. Nat Genet. 2008 Jul.

Abstract

Detailed analyses of the clone-based genome assembly reveal that the recent duplication content of mouse (4.94%) is now comparable to that of human (5.5%), in contrast to previous estimates from the whole-genome shotgun sequence assembly. However, the architecture of mouse and human genomes differs markedly: most mouse duplications are organized into discrete clusters of tandem duplications that show depletion of genes and transcripts and enrichment of long interspersed nuclear element (LINE) and long terminal repeat (LTR) retroposons. We assessed copy number variation of the C57BL/6J duplicated regions within 15 mouse strains previously used for genetic association studies, sequencing and the Mouse Phenome Project. We determined that over 60% of these base pairs are polymorphic among the strains (on average, there was 20 Mb of copy-number-variable DNA between different mouse strains). Our data suggest that different mouse strains show comparable, if not greater, copy number polymorphism when compared to human; however, such variation is more locally restricted. We show large and complex patterns of interstrain copy number variation restricted to large gene families associated with spermatogenesis, pregnancy, viviparity, pheromone signaling and immune response.

Figures

Figure 1. (a) Mouse duplication and copy-number variant genomic landscape
Interchromosomal (red) and intrachromosomal (blue) duplications (>20 kb and >94% sequence identity) are shown for the C57BL/6J mouse genome. Copy-number polymorphic duplicated regions are flagged if two or more strains show a gain (green bars) or loss (pink bars) with respect to C57BL/6J. Brown bars highlight regions showing both gain and loss. Some of the largest duplicated and CNV regions are enumerated and labeled based on gene content. Mouse chromosomes 7, 12, 14, and X show the greatest preponderance of large duplication blocks. On chromosome 7, the duplication blocks account for 32% of the first 50 Mb. (b) Mouse vs. human genome duplication pattern. Mouse and human intrachromosomal duplication patterns are compared for chromosomes 7, 17, and X. Note the interspersed pattern of recent duplications in human compared with the tandem clusters in mouse for the autosomes. A greater fraction of the X chromosome is duplicated in mouse than in human (12.8% vs. 7.8%). The X chromosome is syntenic between human and mouse. Human chr17 is syntenic to mouse chr11, and human chr7 is syntenic to mouse chr6 and chr5, based on the UCSC Genome Browser human net track.
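The flagging rule described for panel (a) can be sketched in a few lines. This is a minimal illustration only, assuming that per-strain calls relative to C57BL/6J are already available as 'gain', 'loss', or 'none'. The function name `flag_region`, the input format, and the reading of "two or more strains" as a threshold applied separately to gains and to losses are assumptions, not details taken from the paper.

```python
from collections import Counter


def flag_region(calls, min_strains=2):
    """Classify one duplicated region from per-strain copy-number calls.

    calls: mapping of strain name -> 'gain', 'loss', or 'none'
           (hypothetical input format; calls are relative to C57BL/6J).
    Returns 'gain', 'loss', 'both', or None when the region is not flagged.
    """
    counts = Counter(calls.values())
    # Assumed reading of the caption: at least `min_strains` strains must
    # agree on a direction before that direction is flagged.
    has_gain = counts["gain"] >= min_strains
    has_loss = counts["loss"] >= min_strains
    if has_gain and has_loss:
        return "both"   # brown bars in the figure
    if has_gain:
        return "gain"   # green bars
    if has_loss:
        return "loss"   # pink bars
    return None
```

A region with a gain in only one strain would not be flagged under this reading, which is why the threshold is exposed as a parameter.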
Figure 2. Distribution of mouse versus human duplication pairwise alignments
The distance between segmental duplications was computed for the mouse (Build36) and human (Build36) genomes. All pairwise alignments >10 kb in length were binned into various categories. Tandem duplications that map within 5 Mb of one another constitute the bulk of mouse segmental duplications.
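The binning described in this caption can be sketched as follows. The caption gives only the >10 kb length cutoff and the 5 Mb distance, so the three category labels, the tuple layout of each alignment pair, the use of the shorter copy as the alignment length, and the half-open coordinates are all assumptions made for illustration.

```python
from collections import Counter

MIN_ALIGN_BP = 10_000       # ">10 kb" from the caption
TANDEM_MAX_BP = 5_000_000   # "within 5 Mb" from the caption


def classify_pair(chrom_a, start_a, end_a, chrom_b, start_b, end_b):
    """Assign one pairwise alignment to a distance category (assumed labels)."""
    if chrom_a != chrom_b:
        return "interchromosomal"
    # Gap between the two copies on the same chromosome; 0 if they overlap.
    gap = max(0, max(start_a, start_b) - min(end_a, end_b))
    if gap <= TANDEM_MAX_BP:
        return "intrachromosomal, within 5 Mb"
    return "intrachromosomal, more than 5 Mb apart"


def bin_alignments(pairs):
    """Count alignments per category, keeping only those longer than 10 kb."""
    counts = Counter()
    for chrom_a, start_a, end_a, chrom_b, start_b, end_b in pairs:
        # Assumption: alignment length is taken as the shorter of the two copies.
        length = min(end_a - start_a, end_b - start_b)
        if length > MIN_ALIGN_BP:
            counts[classify_pair(chrom_a, start_a, end_a,
                                 chrom_b, start_b, end_b)] += 1
    return counts
```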
Figure 3. LINE and LTR enrichment within mouse segmental duplications
(a) We examined all large pairwise alignments (>20 kb) and computed the LINE and LTR content (in 500 bp windows; sliding increments of 100 bp) on either side of the alignment boundary, as determined by the whole-genome assembly comparison (WGAC) method. Segmental duplications are significantly enriched for both LINE and LTR repeats. We next examined all transition regions where at least 10 kb of unique sequence abutted a segmental duplication (n=5325 alignments) and computed the (b) LINE content and (c) LTR content on either side of the unique/duplication transition boundary. LTR repeat sequences show specific enrichment within segmental duplications when compared to unique transition regions, whereas both the flanking unique regions and the duplicated regions were enriched for LINE repeats.
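The windowed repeat-content calculation can be illustrated with a short sketch. Only the 500 bp window and 100 bp increment come from the caption. The function name `repeat_fraction_windows`, the representation of repeat annotations as (start, end) intervals in half-open coordinates, and the definition of content as the fraction of window bases covered by a repeat are assumptions.

```python
def repeat_fraction_windows(region_start, region_end, repeats,
                            window=500, step=100):
    """Return [(window_start, repeat_fraction), ...] across one region.

    repeats: iterable of (start, end) intervals for a single repeat class
             (for example LINE or LTR), half-open, same coordinate system.
    """
    length = region_end - region_start
    # Per-base mask so that overlapping annotations are not double counted.
    mask = [0] * length
    for r_start, r_end in repeats:
        lo = max(r_start, region_start) - region_start
        hi = min(r_end, region_end) - region_start
        for i in range(lo, hi):
            mask[i] = 1
    # Prefix sums give the covered bases of any window in constant time.
    prefix = [0]
    for m in mask:
        prefix.append(prefix[-1] + m)
    result = []
    for offset in range(0, length - window + 1, step):
        covered = prefix[offset + window] - prefix[offset]
        result.append((region_start + offset, covered / window))
    return result
```

Running this separately on the sequence on each side of a boundary, and averaging the per-window fractions across many boundaries, would give a profile of the kind the figure describes.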
Figure 4. Copy-number variable mouse segmental duplications
(a) Underlying array comparative genomic hybridization (arrayCGH) data are shown for four strains compared to C57BL/6J. Segmental duplications (SD) and flanking regions (159 Mb) were ordered and collapsed according to chromosomal position (color). (b) An ∼170 kb segmental duplication region on chromosome 10 is shown in more detail from the browser (http://mouseparalogy.gs.washington) for 15 different mouse strains. Significant (>1.5 standard deviations) decreases (red) and increases (green) are highlighted. At least six distinct regions (A-F) of copy-number variation can be discerned within the duplication block (WGAC=whole-genome assembly comparisons, WSSD=whole-genome shotgun sequence detection). Regions A and E represent high-identity duplications of the interleukin 22 gene; the arrayCGH signal therefore represents the average differential of both regions, and the two arrayCGH patterns mirror one another. (c-f) Other examples of copy-number variable regions of segmental duplication are depicted for nine strains, including: (c) the CCl and Il11ralpha duplication block (chr4:41,687,499-42,962,500), (d) a Spetex duplication block (chr14:2,966,668-6,566,667), (e) a vomeronasal receptor (V1r) duplication block (chr7:19,070,844-23,169,660), and (f) a Speer4d gene family duplicated region (chr5:14,842,936-15,240,435).
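The >1.5 standard deviation highlighting in panel (b) can be sketched as a simple threshold on probe values. The caption does not say which distribution the standard deviation was taken from, so this sketch assumes it is computed over the log2 ratios supplied for one strain. The function name `call_probes` and the input format are hypothetical.

```python
from statistics import mean, pstdev


def call_probes(log2_ratios, n_sd=1.5):
    """Label each probe as 'increase', 'decrease', or 'none'.

    log2_ratios: test-strain vs. C57BL/6J log2 ratios for one strain
                 (assumed input; the paper's exact normalization is not
                 given in the caption).
    """
    mu = mean(log2_ratios)
    # Assumption: population standard deviation of the supplied values.
    sd = pstdev(log2_ratios)
    calls = []
    for value in log2_ratios:
        if sd > 0 and value > mu + n_sd * sd:
            calls.append("increase")   # shown in green
        elif sd > 0 and value < mu - n_sd * sd:
            calls.append("decrease")   # shown in red
        else:
            calls.append("none")
    return calls
```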
