Cell Genom. 2024 Apr 10;4(4):100527.
doi: 10.1016/j.xgen.2024.100527. Epub 2024 Mar 26.

A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats

Tristan V de Jong et al. Cell Genom.

Abstract

The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor, Rnor_6.0. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomic datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ∼20.0 million sequence variations, of which 18,700 are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome and provides researchers with an expanded resource for studies involving rats.

Keywords: Rnor_6.0; genetic map; heterogeneous stock; hybrid rat diversity panel; inbred strains; mRatBN7.2; phylogenetic tree; rat; recombinant inbred; reference genome.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
mRatBN7.2 corrects structural errors in Rnor_6.0. (A) Genome-wide comparison between Rnor_6.0 and mRatBN7.2 showed many structural differences, such as a large inversion at proximal Chr 6 and many translocations between chromosomes. Image generated using the NCBI Comparative Genome Viewer. Numbers indicate chromosomes. Green lines indicate sequences in the forward alignment. Blue lines indicate reverse alignment. (B) The large inversion on proximal Chr 6 is shown in a dot plot between Rnor_6.0 and mRatBN7.2. (C) A rat genetic map generated using 150,835 binned markers from 1,893 heterogeneous stock rats showed an inversion at proximal Chr 6 between genetic distance and physical distance based on Rnor_6.0, indicating that the inversion is caused by assembly errors in Rnor_6.0. (D) Marker order and genetic distance from the genetic map on Chr 6 are in agreement with physical distance based on mRatBN7.2, indicating that the misassembly is fixed. (E–G) The genetic map confirms that many assembly errors on Chr 19 in Rnor_6.0 are fixed in mRatBN7.2.
Figure 2
mRatBN7.2 improved mapping statistics of whole-genome sequencing data. Summary statistics from mapping 36 HXB/BXH WGS samples against Rnor_6.0 and mRatBN7.2 were compared. Using mRatBN7.2 increased the percentage of reads mapped (A) and reduced the regions of the reference genome with zero coverage (B), the total number of SNPs (C), and the total number of indels (D). The presence of a large number of SNPs (E) and indels (F) that are shared by all samples (arrows), including BN/NHsdMcwi, indicates that they are base-level errors in the reference genome.
Figure 3
mRatBN7.2 improves eQTL and proteomic analysis. Genome misassembly is associated with increased rates of calling spurious trans-eQTLs. (A) Each column represents a gene for which at least one trans-eQTL was found at p < 1 × 10⁻⁸ using Rnor_6.0. The colors of the bars indicate the number of trans-eQTL SNP-gene pairs in which the SNP and/or gene transcription start site (TSS) relocated to a different chromosome in mRatBN7.2 and whether the relocation would result in a reclassification to cis-eQTL (TSS distance < 1 Mb) or ambiguous (TSS distance between 1 and 5 Mb). (B) Genomic location of one relocated trans-eQTL SNP from (A). The SNP is in a segment of Chr 13 in Rnor_6.0 that was relocated to Chr 3 in mRatBN7.2 (red stars), reclassifying the eQTL from trans-eQTL to cis-eQTL for both the Ly75 and Itgb6 genes (red bars). (C) Histogram showing the distance between cis-pQTLs and the TSS of the corresponding proteins. The distances of pQTLs in mRatBN7.2 tend to be shorter than those in Rnor_6.0. (D) An example of a trans-pQTL in Rnor_6.0 that was detected as a cis-pQTL in mRatBN7.2. (E) Correlation of expression of the protein (the example in B) in Rnor_6.0 and mRatBN7.2. (F) Different annotations of the exemplar gene in Rnor_6.0 and mRatBN7.2.
Figure 4
Using WGS data to assess the quality of the Liftover from Rnor_6.0 to mRatBN7.2. (A) Overview of the workflow using a real WGS sample from a WKY rat. A higher proportion of variants passed the quality filter for mRatBN7.2. Among them, 97.93% of the variants were liftable from Rnor_6.0 to mRatBN7.2. (B) The overlap between variants lifted from Rnor_6.0 and variants obtained by directly mapping sequence data to mRatBN7.2. Approximately 11.9% of the variants that were found by direct mapping were missing from the Liftover.
Figure 5
Phylogenetic relationship and genetic diversity of 120 strains/substrains of laboratory rats. (A) The phylogenetic tree was constructed using 11.6 million biallelic SNPs from 163 samples. Strains/substrains with duplicated samples were condensed. Strains highlighted in bold font are parental strains for RI panels. Green, HXB/BXH RI panel; blue, FXLE/LEXF RI panel; orange, progenitors of the HS outbred population. (B) The HS progenitors contain 16,438,302 variants (i.e., 82.2% of the variants in our collection of 120 strains/substrains) based on analysis using mRatBN7.2. Among these, 10,895 are shared by all eight progenitor strains. The number of variants that are unique to each specific founder is noted. (C) The total number of variants per strain, with the total number unique to each strain marked. (D) The number of variants shared across strains per chromosome.
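
The reclassification rule in the Figure 3 caption (cis-eQTL if the SNP lies within 1 Mb of the TSS, ambiguous if between 1 and 5 Mb) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Locus` type, the `classify_eqtl` function, the choice to call anything on a different chromosome or beyond 5 Mb "trans", and every coordinate in the example are assumptions made only for illustration.

```python
from dataclasses import dataclass

MB = 1_000_000  # base pairs per megabase


@dataclass(frozen=True)
class Locus:
    """A genomic position (hypothetical helper type, not from the paper)."""
    chrom: str
    pos: int  # base-pair coordinate on the chromosome


def classify_eqtl(snp: Locus, tss: Locus,
                  cis_max: int = 1 * MB,
                  ambiguous_max: int = 5 * MB) -> str:
    """Classify a SNP-gene pair using the thresholds from the Figure 3 caption.

    Assumption: pairs on different chromosomes, or more than 5 Mb apart,
    are treated as trans.
    """
    if snp.chrom != tss.chrom:
        return "trans"
    distance = abs(snp.pos - tss.pos)
    if distance < cis_max:
        return "cis"
    if distance <= ambiguous_max:
        return "ambiguous"
    return "trans"


# Hypothetical coordinates that mimic the situation in Figure 3B: a SNP that
# sits on Chr 13 in Rnor_6.0 but is placed on Chr 3 in mRatBN7.2, near the
# TSS of a gene on Chr 3.
gene_tss = Locus("3", 46_000_000)
snp_rnor6 = Locus("13", 46_200_000)
snp_mratbn7 = Locus("3", 46_200_000)

label_rnor6 = classify_eqtl(snp_rnor6, gene_tss)
label_mratbn7 = classify_eqtl(snp_mratbn7, gene_tss)
print(label_rnor6, "->", label_mratbn7)
```

With these made-up positions, the same SNP-gene pair is labeled trans under the old placement and cis under the new one, which is the kind of reclassification the caption describes.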

Update of

  • A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats.
    de Jong TV, Pan Y, Rastas P, Munro D, Tutaj M, Akil H, Benner C, Chen D, Chitre AS, Chow W, Colonna V, Dalgard CL, Demos WM, Doris PA, Garrison E, Geurts AM, Gunturkun HM, Guryev V, Hourlier T, Howe K, Huang J, Kalbfleisch T, Kim P, Li L, Mahaffey S, Martin FJ, Mohammadi P, Ozel AB, Polesskaya O, Pravenec M, Prins P, Sebat J, Smith JR, Solberg Woods LC, Tabakoff B, Tracey A, Uliano-Silva M, Villani F, Wang H, Sharp BM, Telese F, Jiang Z, Saba L, Wang X, Murphy TD, Palmer AA, Kwitek AE, Dwinell MR, Williams RW, Li JZ, Chen H. bioRxiv [Preprint]. 2023 Sep 28:2023.04.13.536694. doi: 10.1101/2023.04.13.536694. Update in: Cell Genom. 2024 Apr 10;4(4):100527. doi: 10.1016/j.xgen.2024.100527. PMID: 37214860.
