HLA. 2017 Aug;90(2):79-87.
doi: 10.1111/tan.13057. Epub 2017 May 25.

Dual redundant sequencing strategy: Full-length gene characterisation of 1056 novel and confirmatory HLA alleles

V Albrecht et al. HLA. 2017 Aug.

Abstract

The high-throughput department of DKMS Life Science Lab encounters novel human leukocyte antigen (HLA) alleles on a daily basis. To characterise these alleles, we have developed a system to sequence the whole gene from 5'- to 3'-UTR for the HLA loci A, B, C, DQB1 and DPB1 for submission to the European Molecular Biology Laboratory - European Nucleotide Archive (EMBL-ENA) and the IPD-IMGT/HLA Database. Our workflow is based on a dual redundant sequencing strategy. Using shotgun sequencing on an Illumina MiSeq instrument and single molecule real-time (SMRT) sequencing on a PacBio RS II instrument, we are able to achieve highly accurate HLA full-length consensus sequences. Remaining conflicts are resolved using the R package DR2S (Dual Redundant Reference Sequencing). Given the relatively high throughput of this strategy, we have developed the semi-automated web service TypeLoader to aid in the submission of sequences to the EMBL-ENA and the IPD-IMGT/HLA Database. In the IPD-IMGT/HLA Database release 3.24.0 (April 2016; prior to the submission of the sequences described here), only 5.2% of all known HLA alleles had been fully characterised together with intronic and UTR sequences. So far, we have applied our strategy to characterise and submit 1056 HLA alleles, thereby more than doubling the number of fully characterised alleles. Given the increasing application of next generation sequencing (NGS) for full gene characterisation in clinical practice, extending the HLA database concomitantly is highly desirable. Therefore, we propose this dual redundant sequencing strategy as a workflow for submission of novel full-length alleles and characterisation of sequences that are as yet incomplete. This would help to mitigate the predominance of partially known alleles in the database.
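The core idea of a dual redundant strategy, that two independently derived consensus sequences are compared and any disagreement is flagged for resolution, can be illustrated with a minimal Python sketch. This is not the DR2S algorithm, which is an R package whose internals are not described in the abstract. The function name `find_conflicts` is hypothetical, and the sketch assumes the two consensus sequences have already been aligned to the same length, with `-` marking gaps.

```python
from typing import List, Tuple


def find_conflicts(short_read_consensus: str,
                   long_read_consensus: str) -> List[Tuple[int, str, str]]:
    """Return (position, short_read_base, long_read_base) for each mismatch.

    Assumption: both inputs are pre-aligned consensus strings of equal
    length, with '-' marking an alignment gap. Positions are 0-based.
    """
    if len(short_read_consensus) != len(long_read_consensus):
        raise ValueError("consensus sequences must be aligned to the same length")

    conflicts = []
    pairs = zip(short_read_consensus.upper(), long_read_consensus.upper())
    for position, (short_base, long_base) in enumerate(pairs):
        if short_base != long_base:
            conflicts.append((position, short_base, long_base))
    return conflicts


if __name__ == "__main__":
    # Toy sequences, not real HLA data: the long-read consensus is missing
    # one T in a homopolymer run.
    for conflict in find_conflicts("ACGTTTTTAC", "ACGTTTT-AC"):
        print(conflict)
```

An empty result means the two platforms agree at every aligned position; any reported position would need further inspection before a sequence is submitted.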

Keywords: HLA typing; NGS; PacBio; full-length gene sequencing; novel HLA alleles.

Conflict of interest statement

The authors have declared no conflicting interests.

Figures

Figure 1
Genomic organisation of the human leukocyte antigen (HLA) loci. The class II alleles are about 2 to 3 times the length of the class I alleles. All primers applied during this project are located outside the UTR regions
Figure 2
Workflow for full‐length HLA gene characterisation showing the dual redundant sequencing strategy using the MiSeq and PacBio RS II platforms. A, MiSeq requires a fragmentation step owing to its inability to completely sequence molecule fragments longer than 600 bp; barcodes are attached during library preparation. B, Barcoding is carried out as a part of the polymerase chain reaction (PCR)
Figure 3
Limitations of each sequencing method. A, Illustration of the inability of accurate phasing using short sequencing‐by‐synthesis (SBS) reads (DPB1*02:new represents a yet unnamed novel allele). B, Inability to call homopolymer consensus sequences accurately due to a high insertion/deletion sequencing error rate with long single molecule real‐time (SMRT) reads
Figure 4
Effects on the IPD‐IMGT/HLA Database. The number of fully characterised human leukocyte antigen (HLA) alleles including the 5′‐ and 3′‐UTR in the IPD‐IMGT/HLA Database release 3.27.0 was more than doubled after the submission of 898 unique full‐length sequences (submitted novel alleles [red], genomic extension of extant allele sequences [green], extant fully characterised HLA alleles in IPD‐IMGT/HLA Database release 3.27.0 [blue])
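The caption of Figure 3B notes that long SMRT reads have difficulty calling homopolymer consensus sequences. As a rough illustration of how such regions might be located so they can be checked against the short-read data, the following Python sketch lists homopolymer runs in a sequence. The function name `homopolymer_runs` and the default threshold of five bases are assumptions made for this example and are not taken from the paper.

```python
import re
from typing import List, Tuple


def homopolymer_runs(sequence: str, min_len: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for each run of one base >= min_len long.

    Assumption: min_len = 5 is an arbitrary illustrative threshold.
    Positions are 0-based; only A, C, G and T are considered.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")

    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    runs = []
    for match in pattern.finditer(sequence.upper()):
        runs.append((match.start(), match.group(1), match.end() - match.start()))
    return runs


if __name__ == "__main__":
    # Toy sequence, not real HLA data.
    print(homopolymer_runs("ACGGGGGGTAAAT"))
```

Positions reported by such a scan are candidates for cross-checking, since the two platforms have different error profiles.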
