Comparative Study
Genome Res. 2004 Jun;14(6):1176-87.
doi: 10.1101/gr.2188104. Epub 2004 May 12.

Complete MHC haplotype sequencing for common disease gene mapping

C Andrew Stewart et al. Genome Res. 2004 Jun.

Abstract

The future systematic mapping of variants that confer susceptibility to common diseases requires the construction of a fully informative polymorphism map. Ideally, every base pair of the genome would be sequenced in many individuals. Here, we report 4.75 Mb of contiguous sequence for each of two common haplotypes of the major histocompatibility complex (MHC), a region to which susceptibility to >100 diseases has been mapped. The autoimmune disease-associated haplotypes HLA-A3-B7-Cw7-DR15 and HLA-A1-B8-Cw7-DR3 were sequenced in their entirety through a bacterial artificial chromosome (BAC) cloning strategy using the consanguineous cell lines PGF and COX, respectively. The two sequences were annotated to encompass all described splice variants of expressed genes. We defined the complete variation content of the two haplotypes, revealing >18,000 variations between them. Average SNP densities ranged from less than one SNP per kilobase to >60 SNPs per kilobase. Acquisition of complete and accurate sequence data over polymorphic regions such as the MHC from large-insert cloned DNA provides a definitive resource for the construction of informative genetic maps and avoids the limitation of chromosome regions that are refractory to PCR amplification.

Figures

Figure 1
Representation of the PGF and COX haplotypes in the VEGA genome annotation browser. (A) Title screen showing the availability of annotated MHC haplotypes. (B,C) Manually curated gene structures in the RCCX region (see Fig. 3 and accompanying text). The PGF haplotype has two copies of the gene for complement component C4 (C4A and C4B), whereas COX has only one (C4B).
Figure 2
Positional distribution of variations between COX and PGF MHC sequences. MHC sequences were divided into 10-kb bins, and variations were counted in each bin. Results are expressed as variations per 1 kb. A locus is defined as all genomic DNA from the 5′ start of a gene to its 3′ end. Boundaries of the class I, II, and III regions are shown. The positions of genes RFP and KIFC1 that define the ends of the MHC haplotype sequencing project are indicated in black. Other genes with five or more SNPs between the haplotypes are labeled in blue. HLA-C is also labeled. Regions of interest are labeled above the graph. Regions 1 and 2 are the RCCX module and the HLA-DRB region, respectively. (*) The TAP to HLA-DMB region, which shows little variation between haplotypes. Blue bars B, D, and F label regions in which variation is thought to be classical MHC class I and class II gene-associated.
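The binning described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `variation_density`, the 0-based coordinate convention, and the example positions are all assumptions made for the sketch.

```python
from collections import Counter

BIN_SIZE_BP = 10_000  # 10-kb bins, as stated in the Figure 2 caption


def variation_density(positions, region_length_bp, bin_size_bp=BIN_SIZE_BP):
    """Return variations per 1 kb for each consecutive bin of a region.

    positions: 0-based coordinates of variations (assumed convention).
    region_length_bp: total length of the compared region in base pairs.
    """
    n_bins = -(-region_length_bp // bin_size_bp)  # ceiling division
    counts = Counter(pos // bin_size_bp for pos in positions)
    densities = []
    for i in range(n_bins):
        start = i * bin_size_bp
        # The last bin may be shorter than bin_size_bp.
        width = min(bin_size_bp, region_length_bp - start)
        densities.append(counts.get(i, 0) * 1000 / width)
    return densities


# Hypothetical variation coordinates, for illustration only.
example_positions = [120, 4_500, 9_999, 10_000, 25_000]
example_densities = variation_density(example_positions, region_length_bp=30_000)
```

Normalizing by the actual bin width rather than a fixed 10 kb keeps the density of a short final bin comparable with the others.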
Figure 3
Dot-matrix comparison of the PGF and COX sequences spanning the RCCX (A) and HLA-DRB (B) regions. The X-axes represent the PGF contig and encoded genes and the Y-axes display those of COX. Axis numbering is in kilobases. Coding genes are labeled in black and pseudogenes in gray. (A) No variations were calculated from comparison of the duplicated region (boxed region of homology) of PGF with COX. Striped boxes represent endogenous retrovirus sequences that identify long (L) C4 genes. The dot-matrix analysis was performed using sequences taken for PGF from contig position 26,001 to 180,559 of AL645922, and for COX from 12,001 of AL662849 to 55,541 of AL662828. (B) No PGF/COX variations were calculated between sequences bordered by open circles through which the gene content differs between haplotypes. A more detailed comparison of the HLA-DRB1 loci with exon locations is shown. The dot-matrix analysis was performed using sequences taken for PGF from contig position 128,102 of AL662796 to 27,557 of AL662789, and for COX from 130,001 of AL670296 to 108,819 of AL662842.
Figure 4
Dot-matrix comparisons of PGF and COX sequences spanning selected indels. (A) Complex inversion centromeric of C6orf205 and (B) multiple indels located between the HLA-DQA1 and HLA-DQB1 loci. PGF is represented on the X-axis and COX on the Y-axis. Scales are in kilobases. Repetitive elements identified by RepeatMasker are shown. Sequence is taken from (A) AL669830 134,100 to 138,400 and AL663093 82,300 to 89,300 and (B) AL662789 57,100 to 70,800 and AL731683 8,700 to 17,100.
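For readers unfamiliar with dot-matrix plots such as those in Figures 3 and 4, the idea can be sketched with exact word matching. The captions do not name the software or parameters the authors used, so the function `dot_matrix`, the window size, and the toy sequences below are assumptions for illustration only.

```python
def dot_matrix(seq_x, seq_y, window=3):
    """Return the set of (i, j) where seq_x[i:i+window] == seq_y[j:j+window].

    Each pair is one dot: i is the position on the X-axis sequence,
    j the position on the Y-axis sequence.
    """
    # Index every word of length `window` in seq_y by its start positions.
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)

    # Look up each word of seq_x and record all matching positions.
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], ()):
            dots.add((i, j))
    return dots


# Toy example: the X sequence carries a segment twice, the Y sequence once.
duplicated = dot_matrix("ACGTACGT", "ACGT", window=4)
```

Identical sequences give a single unbroken diagonal. A segment present twice on one axis and once on the other gives two parallel diagonals, which is the kind of pattern a duplication such as the extra C4 copy in PGF would be expected to produce.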

WEB SITE REFERENCES

    1. http://bacpac.chori.org/; BACPAC resources center home page.
    2. http://genome.wustl.edu/gsc/Overview/finrules/hgfinrules.html; Human Genome finishing rules at the Genome Sequencing Center, Washington University Medical School.
    3. http://genome.wustl.edu/tools/?overgo=1; Overgo Maker script at the Genome Sequencing Center, Washington University Medical School.
    4. http://repeatmasker.genome.washington.edu/; The RepeatMasker server, University of Washington.
    5. http://vega.sanger.ac.uk/; Vertebrate Genome Annotation (VEGA) database browser.
