Genome Res. 2005 Feb;15(2):195-204.
doi: 10.1101/gr.3302705. Epub 2005 Jan 14.

Interchromosomal segmental duplications of the pericentromeric region on the human Y chromosome

Stefan Kirsch et al. Genome Res. 2005 Feb.

Abstract

Basic medical research critically depends on the finished human genome sequence. Two types of gaps are known to exist in the human genome: those associated with heterochromatic sequences and those embedded within euchromatin. We identified and analyzed a euchromatic island within the pericentromeric repeats of the human Y chromosome. This 450-kb island, although not recalcitrant to subcloning and present in 100 tested males from different ethnic origins, was not detected and is not contained within the published Y chromosomal sequence. The entire 450-kb interval is almost completely duplicated and consists predominantly of interchromosomal rather than intrachromosomal duplication events that are usually prevalent on the Y chromosome. We defined the modular structure of this interval and detected a total of 128 underlying pairwise alignments (≥90% and ≥1 kb in length) to various autosomal pericentromeric and ancestral pericentromeric regions. We also analyzed the putative gene content of this region by a combination of in silico gene prediction and paralogy analysis. We can show that even in this exceptionally duplicated region of the Y chromosome, eight putative genes with open reading frames reside, including fusion transcripts formed by the splicing of exons from two different duplication modules as well as members of the homeobox gene family DUX.
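The alignment cutoffs quoted in the abstract can be illustrated with a minimal sketch. It assumes the 90% figure refers to percent sequence identity and that each alignment is available as a simple record; the record type, field names, and example values are hypothetical and are not taken from the paper or from any alignment tool.

```python
from dataclasses import dataclass

# Cutoffs quoted in the abstract; reading the 90% figure as percent
# sequence identity is an assumption made for this sketch.
MIN_IDENTITY_PCT = 90.0
MIN_LENGTH_BP = 1_000


@dataclass(frozen=True)
class PairwiseAlignment:
    """Hypothetical record for one pairwise alignment."""
    target: str          # label of the matching region (illustrative)
    identity_pct: float  # percent identity over the aligned length
    length_bp: int       # aligned length in base pairs


def passes_thresholds(aln: PairwiseAlignment) -> bool:
    """Return True if the alignment meets both cutoffs."""
    return aln.identity_pct >= MIN_IDENTITY_PCT and aln.length_bp >= MIN_LENGTH_BP


def filter_alignments(alignments):
    """Keep only the alignments that satisfy both cutoffs."""
    return [aln for aln in alignments if passes_thresholds(aln)]


# Made-up example values, for illustration only.
examples = [
    PairwiseAlignment("chrA", 96.4, 12_500),  # passes both cutoffs
    PairwiseAlignment("chrB", 89.9, 40_000),  # identity too low
    PairwiseAlignment("chrC", 99.1, 800),     # too short
    PairwiseAlignment("chrD", 90.0, 1_000),   # exactly at both cutoffs
]
kept = filter_alignments(examples)
```

Both comparisons are inclusive, matching the "≥" in the abstract, so an alignment sitting exactly on either cutoff is retained.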

Figures

Figure 1.
Location of a euchromatic island flanked by centromeric satellite 3 repeats on the long arm of the Y chromosome. (A) Minimum tiling path for sequencing the Y chromosome as published by Tilford et al. (2001). (B) Enlarged view of the genomic region encompassing the centromere and satellite 3 repeats (Tilford et al. 2001). (C) Illustration of the P1-derived artificial chromosome (PAC) and bacterial artificial chromosome (BAC) clones assembled into the pericentromeric Yq11 contig. Blue lines indicate name and sequence length of respective clones. Clone names include library origin; accession nos. are in parentheses. PAC RP1-85D24 extends 2 kb into the satellite 3 sequence block forming a constant part of the human Y chromosome centromere (Tyler-Smith 1987). The overlap sizes of the clones are as follows: RP1-85D24 ↔ RP11-131M06 71,279 bp; RP11-131M06 ↔ RP11-886I11 33,248 bp; RP11-886I11 ↔ RP11-295P22 10,705 bp. RP11-295P22 overlaps by 10,417 bp with RP11-322K23, the most centromeric clone presented in Tilford et al. (2001). The distal half of RP11-295P22 consists exclusively of satellite 3 repeat sequence. Subtracting the satellite 3 segments from the entire 554-kb sequence reveals 450 kb of previously unknown euchromatic DNA sequence.
Figure 2.
Human Y chromosome pericentromeric segmental duplications. A simplified version of the 554-kb sequenced contig is shown in the middle. The two large boxes represent the genomic segments composed of interchromosomal duplications, the small box that of intrachromosomal duplications. Other chromosomes are represented as horizontal black lines, above and below. Centromeres and acrocentric p arms are indicated as tiny boxes. All diagonal lines represent pairwise sequence comparisons ≥10 kb of DNA. The majority of Y pericentromeric duplications localize to the pericentromeric regions of autosomes. On chromosomes 2 and 4, the ancestral pericentromeric regions also show significant pairwise alignments. The coordinates are based on the published NCBI human genome assembly (January 2004, Build 34, Vers. 2).
Figure 3.
Summary scheme depicting homologies between the pericentromeric region of Yq11 and other chromosomes. (A) The black horizontal bar shows the 554-kb sequenced region. Segments with interchromosomal (red boxes) and intrachromosomal (blue boxes) duplications are indicated. (B) All colored bars represent sequence homologies between the Y-chromosomal pericentromeric region and autosomes as determined by standard whole-genome assembly comparison (WGAC). Each color indicates a specific degree of homology: red, 100%-99%; orange, 99%-98%; yellow, 98%-97%; green, 97%-96%; blue, 96%-95%; indigo, 95%-94%; and violet, 93%. Each bar is preceded by the corresponding chromosome number. Bars that correspond to different chromosomes are indicated separately. Paralogies to sequenced genomic regions not assigned to a specific chromosome are summarized as chrUn_random. (C) Two-color FISH of human Y-chromosomal PAC (85D24) and BAC (131M06, 886I11, 295P22) clones (labeled in green) to human male metaphase spreads is shown below. Centromeres of chromosomes 4 and the constitutive heterochromatic region of the long arm of chromosomes 9 are labeled in red. Metaphases shown in a-d reflect the most proximal (a) to distal (d) order in the contig. Chromosomes with specific hybridization signals are tagged. The in silico identified paralogous segments and the chromosomal band localizations of the specific signals are listed in Table 1.
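The color code in panel B can be written as a small lookup, sketched below. The legend's ranges share their endpoints and give only "93%" for violet, so two assumptions are made here and are not stated in the source: each bin includes its lower bound, and violet covers values from 93% up to (but not including) 94%. The function name is hypothetical.

```python
# Lower bound of each color bin, taken from the Figure 3 legend.
# Assumptions (not stated in the legend): each bin includes its lower
# bound, and violet spans 93% up to, but not including, 94%.
COLOR_BINS = [
    (99.0, "red"),
    (98.0, "orange"),
    (97.0, "yellow"),
    (96.0, "green"),
    (95.0, "blue"),
    (94.0, "indigo"),
    (93.0, "violet"),
]


def identity_to_color(identity_pct: float):
    """Map a percent homology value to the legend color, or None if below 93%."""
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity_pct must lie between 0 and 100")
    # Bins are ordered from highest to lowest, so the first match wins.
    for lower_bound, color in COLOR_BINS:
        if identity_pct >= lower_bound:
            return color
    return None  # below the lowest bin shown in the legend
```

Under these assumptions, a value of exactly 98% falls in the orange bin rather than the yellow one; a different convention for the shared endpoints would shift such boundary values by one bin.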
Figure 4.
Putative gene content of the euchromatic island in the pericentromeric region of the human Y. (A) The structure of the pericentromeric region of Yq11 is presented as a horizontal line with boxes representing segmental duplications. (B) Genomic properties of the Yq11 region: from top to bottom—(G+C) content, CpG islands, interspersed repeats including Alu, LINE, and HERV, satellite sequences including 5-bp and 68-bp repeats. (C) Only Y-specific sequences corresponding to exons of known autosomal genes or EST clusters with exon/intron boundaries are shown. Exons of the identified genes or pseudogenes are drawn to scale. For ease of illustration, genic sequences were spread over four horizontal lines. (D) Large arrows indicate the predicted transcriptional direction. The GenBank accession nos. for sequences shown are Hs.252460, AF038169 (BC043584), FLJ42128 (AK124122), LOC339742 (BC045732), ASNS (NM_133436), FLJ39633 (AK096952), C21orf81 (AF426257), FLJ35140 (AK092459), FLJ00310 (AK090412), THC 1666755, DUX1 (NM_012146), PABPC1 (NM_002568), LOC150159 (NM_139173), ARP3β (NM_020445), FKSG74 (AY026352), CHEK2 (BC004207), TRIM43 (BC015353), and MGC32713 (BC034141). The state of each genic sequence is characterized as follows: gene with intact ORF (7, 11-14, 17a); EST cluster with unidentified ORF (1, 10); partial gene (2, 9, 16, 18, 20); degenerated processed pseudogene (3, 4, 5, 6, 8, 15, 17, 19).
Figure 5.
Sequential organization of the DUXY locus in the pericentromeric region of Yq11. The orientation of centromere and telomere is shown at the top. Four copies of the DUX gene family (DUXY1-Y4) are clustered as an imperfect tandem repeat within a genomic segment of 30,323 bp. The transcriptional orientation of each copy is indicated by an arrow. Each DUXY copy is enclosed by repeated elements of tandemly repeated simple sequences (68-bp satellite and LSAU repeat; see legend in figure). Whereas the two types of LSAU repeats are constant in size (120-122 bp and 494-497 bp), the 68-bp satellite sequence is highly variable (2004, 1921, 3707, and 7538 bp). At the distal end of the most telomeric 68-bp satellite block, an Alu repeat has been integrated. The centromeric boundary of the DUXY locus is defined by a block of 5-bp satellite sequence, whereas the telomeric boundary is defined by a MER7 repeat.
Figure 6.
Comparison of predicted amino-acid sequence of cDNAs and paralogous genomic copies of double homeobox DUX-like genes. DUX1, DUX2, DUX3, DUX4, DUX5, and DUX10 represent human double homeobox-containing genes from 4q35. All genes consist only of a single exon. The color code corresponds to the CLUSTALW default for amino-acid sequence comparison. The boxes indicate the 60-amino-acid conserved homeodomain. Analogous to the DUX gene family member DUX2, none of the Y-derived DUX family members contain a complete copy of the second homeodomain (Homeobox II). The location of a 1-bp deletion in DUXY1 relative to all other family members is indicated, resulting in a frameshift and a C-terminally altered amino acid sequence (purple). We resequenced the DUXY1 copy from a PCR product amplified from BAC RP11-886I11 (AC134882) and a normal male individual and confirmed the accuracy of the 1-bp deletion. Black stars indicate stop codons.

References

    Arfin, S.M., Cirullo, R.E., Arredondo-Vega, F.X., and Smith, M. 1983. Assignment of structural gene for asparagine synthetase to human chromosome 7. Somatic Cell Genet. 9: 517-531.
    Avarello, R., Pedicini, A., Caiulo, A., Zuffardi, O., and Fraccaro, M. 1992. Evidence for an ancestral alphoid domain on the long arm of human chromosome 2. Hum. Genet. 89: 24-49.
    Bailey, J.A., Yavor, A.M., Massa, H.F., Trask, B.J., and Eichler, E.E. 2001. Segmental duplications: Organization and impact within the current human genome project assembly. Genome Res. 11: 1005-1017.
    Bailey, J.A., Yavor, A.M., Viggiano, L., Misceo, D., Horvath, J.E., Archidiacono, N., Schwartz, S., Rocchi, M., and Eichler, E.E. 2002. Human-specific duplication and mosaic transcripts: The recent paralogous structure of chromosome 22. Am. J. Hum. Genet. 70: 83-100.
    Baldini, A., Ried, T., Shridhar, V., Ogura, K., D'Aiuto, L., Rocchi, M., and Ward, D.C. 1993. An alphoid DNA sequence conserved in all human and great ape chromosomes: Evidence for ancient centromeric sequences at human chromosomal regions 2q21 and 9q13. Hum. Genet. 90: 577-583.
