Genome Res. 2023 Mar;33(3):448-462.
doi: 10.1101/gr.277429.122. Epub 2023 Feb 28.

Complete sequencing of a cynomolgus macaque major histocompatibility complex haplotype

Julie A Karl et al. Genome Res. 2023 Mar.

Abstract

Macaques provide the most widely used nonhuman primate models for studying the immunology and pathogenesis of human diseases. Although the macaque major histocompatibility complex (MHC) region shares most features with the human leukocyte antigen (HLA) region, macaques have an expanded repertoire of MHC class I genes. Although a chimera of two rhesus macaque MHC haplotypes was first published in 2004, the structural diversity of MHC genomic organization in macaques remains poorly understood owing to a lack of adequate genomic reference sequences. We used ultralong Oxford Nanopore and high-accuracy Pacific Biosciences (PacBio) HiFi sequences to fully assemble the ∼5.2-Mb M3 haplotype of an MHC-homozygous, Mauritian-origin cynomolgus macaque (Macaca fascicularis). The MHC homozygosity allowed us to assemble a single MHC haplotype unambiguously and avoid chimeric assemblies that hampered previous efforts to characterize this exceptionally complex genomic region in macaques. The high quality of this new assembly is exemplified by the identification of an extended cluster of six Mafa-AG genes that contains a recent duplication with a highly similar ∼48.5-kb block of sequence. The MHC class II region of this M3 haplotype is similar to the previously sequenced rhesus macaque haplotype and HLA class II haplotypes. The MHC class I region, in contrast, contains 13 MHC-B genes, four MHC-A genes, and three MHC-E genes (vs. 19 MHC-B, two MHC-A, and one MHC-E in the previously sequenced haplotype). These results provide an unambiguously assembled single contiguous cynomolgus macaque MHC haplotype with fully curated gene annotations that will inform infectious disease and transplantation research.

Figures

Figure 1.
Gene content of the cy0333 MHC M3 haplotype of MCM full genomic region. The MHC full genomic region is defined as everything from the telomeric gene GABBR1 to the centromeric gene KIFC1, as per the method of Shiina et al. (2017). Gene content is separated into putatively expressed genes on the left line and noncoding RNAs and pseudogenes on the right line. Gene sequences associated with the telomeric to centromeric sense strand are displayed to the right of each line, and those associated with the antisense strand are displayed to the left. The class I A region is highlighted in yellow; the class I B region is highlighted in orange; the class III region is highlighted in blue; and the class II region is highlighted in green. The position in megabases is shown on the left.
Figure 2.
Comparison of the representative gene content of the human and the cy0333 MCM full genomic MHC regions. All protein-coding genes plus a subset of nonclassical MHC-I and MHC-II pseudogenes (MHC-V, MHC-W, MHC-K, etc.) and noncoding alleles of otherwise coding genes (Mafa-B19Ps*01:01:01:01, etc.) are shown for the human genome reference GRCh38 (left) and cy0333 (right). Protein-coding genes are shown in black font, and all pseudogenes and nonfunctioning alleles are shown in gray font. The class I A region is highlighted in yellow; the class I B region is highlighted in orange; the class III region is highlighted in blue; and the class II region is highlighted in green. Both sequences are equivalently scaled and presented from telomeric genes at the top to centromeric genes at the bottom. Gene sequences associated with the telomeric to centromeric sense strand are displayed to the right of each line, and those associated with the antisense strand are displayed to the left. The relative position in megabases for both sequences is shown on the left.
Figure 3.
Continuity of the cy0333 MCM full genomic MHC region versus three publicly available macaque reference genomes. The cy0333 MHC region (OP204634) is shown in the top panel; a cynomolgus macaque reference genome Macaca_fascicularis_5.0 (GCF_000364345) assembled from Illumina short-read data is shown in the second panel down; the rhesus macaque reference genome Mmul_10 (GCF_003339765) assembled from ∼13.5-kb PacBio RSII data is shown in the third panel down; and a cynomolgus macaque reference genome MFA1912RKSv2 (GCF_012559485) assembled from ∼12-kb PacBio Sequel II data is shown in the bottom panel. Annotations for all protein-coding genes plus a subset of nonclassical MHC-I and MHC-II pseudogenes are shown for each sequence. Protein-coding genes are shown in black font, and pseudogenes are shown in gray font. All sequences are uniformly scaled, and the position in megabases for all sequences is shown across the top. Gray tracks represent the sequence for each reference, and white gaps within the gray tracks represent gaps in the assembly. The class I A region is highlighted in yellow; the class I B region is highlighted in orange; the class III region is highlighted in blue; and the class II region is highlighted in green. Gene annotations for Macaca_fascicularis_5.0, Mmul_10, and MFA1912RKSv2 are sourced from NCBI and named consistently with it, and gene sizes correspond to the longest predicted translation per gene; some of these predicted translations (particularly the exceptionally long black bars present on some of the references) require further validation and refinement.
Figure 4.
Gene content of the cy0333 MHC-I AG/G region. Representative genes and pseudogenes are shown on the gray bar. The blue boxes highlight a virtually nucleotide-identical ∼47-kb block of sequence located at two distinct positions within the class I AG/G region; this block contains highly similar Mafa-V*01:02 and Mafa-G_pseudo1 genes along with three additional pseudogenes not shown. The pink and green boxes highlight closely related (98.4%) ∼48.5-kb blocks of sequence located just centromeric to each of the virtually nucleotide-identical Mafa-G/Mafa-V blocks. The pink and green boxes contain closely related Mafa-W genes (Mafa-W*01:11 and Mafa-W*01:12, differing at 20 positions across 2196 nt) and closely related Mafa-AG6 genes (Mafa-AG6*04:05:03:03 and Mafa-AG6*04:08:01:01, differing at 36 positions across 2534 nt), as well as three additional closely related duplicated pseudogenes not shown. The position in kilobases relative to the start of the full genomic MHC region is shown across the top. The bottom portion shows the spans of individual ONT reads located within the class I AG/G region. The three ONT reads shown in blue span both copies of the essentially nucleotide-identical Mafa-G/Mafa-V genes.
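The pairwise differences quoted in the Figure 4 caption can be converted to percent identity with simple arithmetic. The sketch below is illustrative only and is not the authors' method: the helper name `percent_identity` is hypothetical, and it assumes every difference is a single-position mismatch over the stated length, ignoring how any indels were counted. The 98.4% figure in the caption refers to the whole ∼48.5-kb blocks and is not reproduced here.

```python
def percent_identity(differences: int, length: int) -> float:
    """Percent of compared positions that are identical.

    Assumption (not from the source): each difference counts as one
    mismatched position out of `length` compared positions.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= differences <= length:
        raise ValueError("differences must be between 0 and length")
    return 100.0 * (length - differences) / length


# Difference counts and lengths as stated in the Figure 4 caption.
gene_pairs = {
    "Mafa-W*01:11 vs Mafa-W*01:12": (20, 2196),
    "Mafa-AG6*04:05:03:03 vs Mafa-AG6*04:08:01:01": (36, 2534),
}

for name, (diffs, length) in gene_pairs.items():
    print(f"{name}: {percent_identity(diffs, length):.2f}% identical")
```

Under these assumptions, the Mafa-W pair comes out at roughly 99.1% identity and the Mafa-AG6 pair at roughly 98.6%, both slightly above the 98.4% reported for the blocks as a whole.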
Figure 5.
Gene content of the cy0333 MHC class I B region. Blocks of duplicated genes and pseudogenes are shaded. Orange shading indicates blocks containing essentially all six pseudogenes surrounding the classical class I Mafa-B gene, consistent with the pseudogenes surrounding the human HLA-B gene. Brown shading indicates blocks containing only two or three of the surrounding pseudogenes plus the Mafa-B gene. Pink shading indicates blocks containing only the Mafa-B gene and a microRNA located within intron 4 of the Mafa-B gene. Mafa-B allele names are displayed in bold. Position in megabases relative to the start of the full genomic MHC region is shown across the top.
