A new rhesus macaque assembly and annotation for next-generation sequencing analyses
- PMID: 25319552
- PMCID: PMC4214606
- DOI: 10.1186/1745-6150-9-20
Abstract
Background: The rhesus macaque (Macaca mulatta) is a key species for advancing biomedical research. Like all draft mammalian genomes, the draft rhesus assembly (rheMac2) has gaps, sequencing errors and misassemblies that have prevented automated annotation pipelines from functioning correctly. Another rhesus macaque assembly, CR_1.0, is also available but is substantially more fragmented than rheMac2, with smaller contigs and scaffolds. Annotations for these two assemblies are limited in completeness and accuracy. High-quality assembly and annotation files are required for a wide range of studies, including expression, genetic and evolutionary analyses.
Results: We report a new de novo assembly of the rhesus macaque genome (MacaM) that incorporates both the original Sanger sequences used to assemble rheMac2 and new Illumina sequences from the same animal. MacaM has a weighted average (N50) contig size of 64 kilobases, more than twice that of the rheMac2 assembly and almost five times that of the CR_1.0 assembly. The MacaM chromosome assembly incorporates information from previously unused mapping data and preliminary annotation of scaffolds. Independent assessment of the assemblies using Ion Torrent read alignments indicates that MacaM is more complete and accurate than rheMac2 and CR_1.0. We assembled messenger RNA sequences from several rhesus tissues into transcripts, which allowed us to identify a total of 11,712 complete proteins representing 9,524 distinct genes. Using a combination of our assembled rhesus macaque transcripts and human transcripts, we annotated 18,757 transcripts and 16,050 genes with complete coding sequences in the MacaM assembly. Further, we demonstrate that the new annotations provide greatly improved accuracy compared with the current annotations of rheMac2. Finally, we show that the MacaM genome provides an accurate resource for alignment of reads produced by RNA sequencing expression studies.
Conclusions: The MacaM assembly and annotation files provide a substantially more complete and accurate representation of the rhesus macaque genome than rheMac2 or CR_1.0 and will serve as an important resource for investigators conducting next-generation sequencing studies with nonhuman primates.
Reviewers: This article was reviewed by Dr. Lutz Walter, Dr. Soojin Yi and Dr. Kateryna Makova.
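Note on the N50 statistic: the contig N50 cited in the Results is, under its conventional definition, the contig length at which contigs of that length or longer contain at least half of all assembled bases. The following Python sketch illustrates that conventional definition only. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the MacaM, rheMac2 or CR_1.0 assemblies, and the authors' own tooling is not described here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in kilobases), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)  # total is 290; 80 + 70 = 150 >= 145, so N50 is 70
```

Because the statistic is weighted by length, a few long contigs raise N50 far more than many short ones, which is why it is commonly used to compare assembly contiguity.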