PLoS Genet. 2006 Jan;2(1):e9.
doi: 10.1371/journal.pgen.0020009. Epub 2006 Jan 27.

Genetic analysis of completely sequenced disease-associated MHC haplotypes identifies shuffling of segments in recent human history

James A Traherne et al. PLoS Genet. 2006 Jan.

Abstract

The major histocompatibility complex (MHC) is recognised as one of the most important genetic regions in relation to common human disease. Advancement in identification of MHC genes that confer susceptibility to disease requires greater knowledge of sequence variation across the complex. Highly duplicated and polymorphic regions of the human genome such as the MHC are, however, somewhat refractory to some whole-genome analysis methods. To address this issue, we are employing a bacterial artificial chromosome (BAC) cloning strategy to sequence entire MHC haplotypes from consanguineous cell lines as part of the MHC Haplotype Project. Here we present 4.25 Mb of the human haplotype QBL (HLA-A26-B18-Cw5-DR3-DQ2) and compare it with the MHC reference haplotype and with a second haplotype, COX (HLA-A1-B8-Cw7-DR3-DQ2), that shares the same HLA-DRB1, -DQA1, and -DQB1 alleles. We have defined the complete gene, splice variant, and sequence variation contents of all three haplotypes, comprising over 259 annotated loci and over 20,000 single nucleotide polymorphisms (SNPs). Certain coding sequences vary significantly between different haplotypes, making them candidates for functional and disease-association studies. Analysis of the two DR3 haplotypes allowed delineation of the shared sequence between two HLA class II-related haplotypes differing in disease associations and the identification of at least one of the sites that mediated the original recombination event. The levels of variation across the MHC were similar to those seen for other HLA-disparate haplotypes, except for a 158-kb segment that contained the HLA-DRB1, -DQA1, and -DQB1 genes and showed very limited polymorphism compatible with identity-by-descent and relatively recent common ancestry (<3,400 generations). 
These results indicate that the differential disease associations of these two DR3 haplotypes are due to sequence variation outside this central 158-kb segment, and that shuffling of ancestral blocks via recombination is a potential mechanism whereby certain DR-DQ allelic combinations, which presumably have favoured immunological functions, can spread across haplotypes and populations.

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Positional Distributions of Variations between PGF and QBL and COX and QBL
(A) shows the distribution for PGF and QBL, and (B) shows the distribution for COX and QBL. MHC sequences were divided into 10-kb bins, and variations were calculated in each bin. Results are expressed as variations per 1 kb. Red and blue plots relate to SNP and DIP variations, respectively. The sequence is interrupted by five gaps, shown as green vertical bars, where BACs encompassing these regions could not be identified from the clone library; by comparison with PGF, these gaps comprise a total of approximately 317 kb. The lengths and gene content of these gaps were as follows, from left to right: 159 kb including OR2U1P to OR12D2; 51 kb containing HCP5; 26 kb containing C6orf26, C6orf27, and the three exons at the 3′ end of MSH5; 53 kb containing CREBL1, FKBPL, and six exons at the 5′ end of TNXB; and 27 kb containing HLA-DOB. These gaps do not represent large genomic deletions within the QBL haplotype, since exonic sequences from selected genes within these regions were successfully amplified from QBL genomic DNA and sequenced to confirm their identity. The grey shaded area at the telomeric end of the map represents sequence for which overlap was not obtained and which was therefore outside the area that was compared. Boundaries of the class I, II, and III regions are shown. The positions of RFP and KIFC1, which define the ends of the MHC haplotype sequencing project, are indicated. Landmark genes are labelled in blue. Regions 1 and 2 are the RCCX module and the HLA-DRB region, respectively. The HLA-DRB3 and HLA-DQB3 region, which shows little variation between COX and QBL haplotypes, is shaded in orange.
Figure 2. Positional Distributions of Variations between COX and QBL in the HLA-DR Region
MHC sequences were divided into 10-kb bins, and variations were calculated in each bin. Results are expressed as variations per 1 kb. Red and blue plots relate to SNP and DIP variations, respectively. A stretch of approximately 160 kb between HLA-DRB3 and HLA-DQB3 contained only 14 SNPs and six small DIPs, the latter comprising 1 bp, 6 bp, 10 bp (five copies of a dinucleotide repeat), and 54 bp (two copies of a 27-mer). None of the variations mapped to coding sequence or to the defined promoter regions of the HLA class II genes [86]. Four 1-bp DIPs, labelled in grey, were identified between DRB1 and DQA1, where LR-PCR products were used to close a small gap resulting from clone deficit. These DIPs were located in polyA/T tracts, in which the probability of Taq slippage in PCR products is much higher than in plasmid DNA amplified in vivo; their authenticity was therefore questionable, and they were excluded from analyses (Figure S2 shows one alignment of sequence traces with differing polyT tracts).
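The binning described in the captions of Figures 1 and 2 (10-kb bins, results expressed as variations per 1 kb) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `variation_density`, the 0-based coordinate convention, and the handling of a shorter final bin are assumptions made only for illustration.

```python
from collections import Counter

BIN_SIZE = 10_000  # 10-kb bins, as stated in the figure captions


def variation_density(positions, region_length, bin_size=BIN_SIZE):
    """Return variations per 1 kb for each consecutive bin of a region.

    positions: 0-based coordinates of variant sites (hypothetical input).
    region_length: length of the compared sequence in bp.
    Positions at or beyond region_length are ignored.
    """
    n_bins = -(-region_length // bin_size)  # ceiling division
    counts = Counter(pos // bin_size for pos in positions)
    densities = []
    for i in range(n_bins):
        # Assumption: the final bin may be shorter than bin_size,
        # so its count is normalised by its actual width.
        width = min(bin_size, region_length - i * bin_size)
        densities.append(counts.get(i, 0) / (width / 1000))
    return densities
```

For example, `variation_density(snp_positions, region_length)` called separately on SNP and DIP coordinates would give the two series plotted in red and blue, assuming the variant positions have already been extracted from a pairwise haplotype alignment.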
Figure 3. Haplotype Alignment of the Region Presenting Differing Variation Rates
The alignment covers the centromeric side of the DR–DQ 158-kb DNA segment (left half, low variation) and the adjacent DNA segment (increased variation). Coordinates refer to Chromosome 6, build NCBI35. Rows represent the allelic state of 26 single chromosomes with the same DRB1*1501-DQA1*0102-DQB1*0602 (DR15–DQ6) haplotype at successive SNPs, which are represented by columns (A, red; C, blue; G, orange; and T, green). Identity is interrupted at a position that exactly matches a recombination hotspot coordinate [5,53], represented as hotspot number 2 in Figure 4.
Figure 4. LD Structure around the HLA-DR Region
High-resolution view of the HLA-DR region, as represented by a GOLDsurfer three-dimensional view of D′ values [81]. The position of the 158-kb segment shared identical by descent between COX and QBL is shown by a dashed white line. High LD areas (red blocks) are separated by LD breaks. The first LD break (1) corresponds to a recombination hotspot mapped between NOTCH4 and C6orf10 in the class II–III boundary region. Another LD break (2) is visualized at another recombination hotspot centromeric of HLA-DQB1, at the boundary of the SNP desert between COX and QBL. This is followed centromerically by a further four LD breaks corresponding to recombination hotspots mapped at the BRD2/HLA-DOA interval, within HLA-DMB, within TAP2, and at the HLA-DQB2/-DOB interval [5,51,53]. An asterisk (*) indicates a region of depleted SNP data, likely owing to substantial genotyping failure in an area with an extreme level of polymorphism.
Figure 5. Model of Haplotype Divergence over DR–DQ Region in Relation to Extended MHC Haplotypes
(A) Divergence of DR–DQ region over tens of millions of years [73]. (B) Transfer of divergent blocks into other haplotypes by recombination. This does not need a double crossover but could occur by single crossovers separated in time. (C) Relative expansion of ancestral DR–DQ haplotypic segments that may result from either positive selection or neutral processes. Black vertical stripes represent occasional SNP mutations occurring within the MHC including within the ancestral haplotypic segments. Small crosses represent crossovers occurring more frequently outside the conserved DR–DQ blocks relative to inside these blocks. However, allele or gene conversion may take place by closely spaced double crossovers, resulting in diversification of the peptide-binding groove without flanking recombination [87,88]. This model was predicted by Gaudieri et al. (1997) [63] based on incomplete sequence analysis. (D) Examples of contemporary MHC haplotypes containing ancestral DR–DQ segments (data provided by www.allelefrequencies.net).

References

    1. Trowsdale J. The gentle art of gene arrangement: The meaning of gene clusters. Genome Biol. 2002;3:comment2002.1–comment2002.5. doi: 10.1186/gb-2002-3-3-comment2002.
    2. Sachidanandam R, Weissman D, Schmidt SC, Kakol JM, Stein LD, et al. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature. 2001;409:928–933.
    3. The International HapMap Project. The International HapMap Project. Nature. 2003;426:789–796.
    4. Allcock RJ, Atrazhev AM, Beck S, de Jong PJ, Elliott JF, et al. The MHC haplotype project: A resource for HLA-linked association studies. Tissue Antigens. 2002;59:520–521.
    5. Miretti MM, Walsh EC, Ke X, Delgado M, Griffiths M, et al. A high-resolution linkage-disequilibrium map of the human major histocompatibility complex and first generation of tag single-nucleotide polymorphisms. Am J Hum Genet. 2005;76:634–646.
