BMC Genomics. 2013 May 28;14:355.
doi: 10.1186/1471-2164-14-355.

Phase-defined complete sequencing of the HLA genes by next-generation sequencing

Kazuyoshi Hosomichi et al. BMC Genomics. 2013.

Abstract

Background: The human leukocyte antigen (HLA) region, a 3.8-Mb segment of the human genome at 6p21, has been associated with more than 100 different diseases, most of them autoimmune. Because HLA genes are complex, conventional sequencing methods have difficulty resolving complete HLA gene sequences, and HLA gene haplotype structures in particular. We propose a novel, accurate, and cost-effective method for phase-defined complete sequencing of HLA genes using indexed multiplex next-generation sequencing.

Results: A total of 33 HLA homozygous samples, 11 HLA heterozygous samples, and 3 parent-child families were subjected to phase-defined HLA gene sequencing. We applied long-range PCR to amplify six HLA genes (HLA-A, -C, -B, -DRB1, -DQB1, and -DPB1), followed by transposase-based library construction and multiplex sequencing on the MiSeq sequencer. Paired-end reads (2 × 250 bp) from the sequencer were aligned to the six HLA gene segments of UCSC hg19, allowing at most 80 mismatched bases. For HLA homozygous samples, the six amplicons of an individual were pooled, then sequenced and mapped together (the individual-tagging method). The paired-end reads were aligned to the corresponding genes of UCSC hg19, and unambiguous, continuous sequences were obtained. For HLA heterozygous samples, each amplicon was sequenced and mapped separately (the gene-tagging method). After alignment, we detected informative paired-end reads harboring SNVs on both the forward and reverse reads; these reads were used to separate the two chromosomes and to generate two phase-defined sequences per individual. Consequently, we were able to determine phase-defined HLA gene sequences from the promoter to the 3'-UTR and to assign up to 8-digit HLA allele numbers, regardless of whether the alleles were rare or novel. Parent-child trio-based sequencing validated our sequencing and phasing methods.

Conclusions: Our protocol generated phase-defined sequences of entire HLA genes, resulting in high-resolution HLA typing and new allele detection.
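The read-pair phasing idea in the Results can be illustrated with a toy example. The Python sketch below is not the authors' script. It assumes that each informative read pair has already been reduced to a mapping from heterozygous SNV position to the base observed on that pair, and it greedily extends two haplotypes from the first site by following pairs that link an already phased site to a new one. The function name `phase_read_pairs`, the input layout, and the example data are all hypothetical.

```python
from typing import Dict, List, Tuple

Observation = Dict[int, str]  # SNV position -> base seen on one read pair


def phase_read_pairs(
    het_sites: Dict[int, Tuple[str, str]],
    pairs: List[Observation],
) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Greedy toy phasing of heterozygous SNVs using linking read pairs.

    het_sites: position -> the two alleles called at that site.
    pairs: one dict per informative read pair (covering >= 2 het sites).
    Returns two haplotypes as position -> base. Sites that no read pair
    connects to the first site remain unphased and are left out.
    """
    positions = sorted(het_sites)
    if not positions:
        return {}, {}

    # Seed: arbitrarily put the first allele of the first site on haplotype A.
    first = positions[0]
    hap_a = {first: het_sites[first][0]}
    hap_b = {first: het_sites[first][1]}

    changed = True
    while changed:
        changed = False
        for obs in pairs:
            known = [p for p in obs if p in hap_a]
            new = [p for p in obs if p in het_sites and p not in hap_a]
            if not known or not new:
                continue
            # Decide which haplotype this read pair belongs to.
            votes_a = sum(obs[p] == hap_a[p] for p in known)
            votes_b = sum(obs[p] == hap_b[p] for p in known)
            if votes_a == votes_b:
                continue  # ambiguous pair, skip it
            same, other = (hap_a, hap_b) if votes_a > votes_b else (hap_b, hap_a)
            for p in new:
                alleles = het_sites[p]
                if obs[p] not in alleles:
                    continue  # base matches neither allele, treat as an error
                same[p] = obs[p]
                other[p] = alleles[1] if obs[p] == alleles[0] else alleles[0]
                changed = True
    return hap_a, hap_b


# Hypothetical input: three heterozygous sites linked by three read pairs.
example_sites = {100: ("A", "G"), 350: ("C", "T"), 900: ("G", "T")}
example_pairs = [{100: "A", 350: "T"}, {350: "C", 900: "G"}, {100: "G", 350: "C"}]
hap1, hap2 = phase_read_pairs(example_sites, example_pairs)
```

A production phaser would also weigh base qualities and resolve conflicting pairs by majority across all reads; this sketch only shows why a read pair carrying SNVs on both mates is informative.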

Figures

Figure 1
Size selection of the Nextera DNA libraries on an agarose gel. (A) Electropherogram of the DNA library analyzed with the 2100 Bioanalyzer. Libraries prepared with the Nextera DNA Sample Prep Kit ranged from 150 bp to more than 10 kb (mean size: 902 bp). (B) Bioanalyzer electropherogram of the DNA library after excision from the agarose gel. We selected large fragments of 500 to 2,000 bp, removing short DNA fragments, for effective HLA gene haplotype phasing. Size selection also determines the actual molar concentration available for bridge PCR, which generates clusters in the flow cell, because DNA fragments longer than 1.5 kb are not efficiently amplified. The mean size of the selected fragments was 1,561 bp.
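The link between mean fragment size and molar concentration can be made concrete with the usual mass-to-molarity conversion for double-stranded DNA. The Python sketch below assumes an average of 660 g/mol per base pair, a common approximation that is not stated in the caption. The function name `library_molarity_nM` and the 2 ng/µL input are hypothetical; only the two mean sizes (902 bp and 1,561 bp) come from the caption.

```python
DS_DNA_G_PER_MOL_PER_BP = 660.0  # assumed average mass of one base pair


def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM).

    nM = (ng/uL * 1e6) / (660 g/mol/bp * mean fragment length in bp)
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    return conc_ng_per_ul * 1e6 / (DS_DNA_G_PER_MOL_PER_BP * mean_fragment_bp)


# Same hypothetical mass concentration, before and after size selection.
before = library_molarity_nM(2.0, 902)    # unselected library, mean 902 bp
after = library_molarity_nM(2.0, 1561)    # size-selected library, mean 1,561 bp
print(f"before: {before:.2f} nM, after: {after:.2f} nM")
```

At a fixed mass concentration, a longer mean fragment size means fewer molecules per unit volume, so the mean size used in the calculation changes how much library is loaded.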
Figure 2
Schematic workflow of the phase-defined HLA gene sequencing. (A) The individual-tagging method for HLA homozygous samples. The 2 × 250 bp paired-end reads of the pooled amplicons were aligned to six HLA gene sequences from hg19, and the consensus sequences were determined. Most of the analytical tools shown here are in standard use for genome sequence alignment and variant detection. To generate the consensus sequence, we used original Perl scripts to list variants and to construct HLA gene sequences. (B) The gene-tagging method for HLA heterozygous samples. The 2 × 250 bp paired-end reads of each amplicon were aligned to the corresponding gene, and the six genes were analyzed separately to avoid mismapping. In the alignment step, the paired-end reads were aligned to the reference sequence using BWA and SAMtools. SNVs were detected with UnifiedGenotyper in GATK. Paired-end reads harboring SNVs in both the forward and reverse reads were extracted to construct two phased HLA gene haplotype sequences using our original Perl script. Finally, two HLA gene haplotype sequences per individual were generated, with phase-defined SNVs and indels.
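The consensus step in panel (A), listing variants and constructing a gene sequence from them, can be sketched as follows. This is an illustrative Python stand-in, not the authors' Perl script. It assumes the variants are non-overlapping and given as (1-based position, reference allele, alternate allele) tuples, and it applies them from right to left so that an indel does not shift the coordinates of variants still to be applied. The name `build_consensus` and the toy reference are hypothetical.

```python
from typing import List, Tuple

Variant = Tuple[int, str, str]  # (1-based position, ref allele, alt allele)


def build_consensus(reference: str, variants: List[Variant]) -> str:
    """Apply non-overlapping SNVs and indels to a reference sequence.

    Variants are applied from the highest position to the lowest, so earlier
    coordinates stay valid even after insertions or deletions.
    """
    seq = reference
    for pos, ref, alt in sorted(variants, reverse=True):
        start = pos - 1
        if seq[start:start + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at position {pos}")
        seq = seq[:start] + alt + seq[start + len(ref):]
    return seq


# Toy reference with one SNV, one 1-bp deletion, and one 2-bp insertion.
toy_reference = "ACGTACGTAC"
toy_variants = [(2, "C", "T"), (5, "AC", "A"), (9, "A", "AGG")]
print(build_consensus(toy_reference, toy_variants))  # prints ATGTAGTAGGC
```

Checking the reference allele before each substitution catches coordinate errors early, which matters when variant lists from several genes are handled together.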
Figure 3
The HLA alleles and HLA haplotypes in two trio families (A and C) and one quartet family (B). Each individual in the parent-child families was sequenced as described. Each HLA gene call was consistent with the expected inheritance pattern. HLA alleles were inferred using the IMGT/HLA database and were shared between parents and child(ren) in a consistent pattern, without recombination.
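The consistency check described in this caption amounts to a Mendelian inheritance test at each gene: one of the child's two alleles must be present in the father and the other in the mother. The Python sketch below shows that test under simplifying assumptions (exact string matching of allele names, no recombination within a gene). The function name `is_mendelian_consistent` and the allele names in the example are illustrative and are not taken from the paper.

```python
from typing import Tuple

Genotype = Tuple[str, str]  # the two alleles called at one HLA gene


def is_mendelian_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """Return True if the child's alleles can be split as one from each parent."""
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


# Illustrative genotypes for a single gene in one trio.
father = ("A*24:02:01:01", "A*02:01:01:01")
mother = ("A*11:01:01", "A*33:03:01")
child = ("A*24:02:01:01", "A*33:03:01")
print(is_mendelian_consistent(child, father, mother))  # prints True
```

For a quartet family, the same function would be called once per child against the same pair of parental genotypes.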
