Genome Res. 2011 Oct;21(10):1746-56.
doi: 10.1101/gr.123117.111. Epub 2011 Aug 23.

Genome-based analysis of the nonhuman primate Macaca fascicularis as a model for drug safety assessment

Martin Ebeling et al. Genome Res. 2011 Oct.

Abstract

The long-tailed macaque, also referred to as cynomolgus monkey (Macaca fascicularis), is one of the most important nonhuman primate animal models in basic and applied biomedical research. To improve the predictive power of primate experiments for humans, we determined the genome sequence of a Macaca fascicularis female of Mauritian origin using a whole-genome shotgun sequencing approach. We applied a template switch strategy that uses either the rhesus or the human genome to assemble sequence reads. The sixfold sequence coverage of the draft genome sequence enabled discovery of about 2.1 million potential single-nucleotide polymorphisms based on occurrence of a dimorphic nucleotide at a given position in the genome sequence. Homology-based annotation allowed us to identify 17,387 orthologs of human protein-coding genes in the M. fascicularis draft genome, and the predicted transcripts enabled the design of an M. fascicularis-specific gene expression microarray. Using liver samples from 36 individuals of different geographic origins, we identified 718 genes with highly variable expression in liver, whereas the majority of the transcriptome shows relatively stable and comparable expression. Knowledge of the M. fascicularis draft genome is an important contribution to both the use of this animal in disease models and the safety assessment of drugs and their metabolites. In particular, this information allows high-resolution genotyping and microarray-based gene-expression profiling for animal stratification, thereby allowing the use of well-characterized animals for safety testing. Finally, the genome sequence presented here is a significant contribution to the global "3R" animal welfare initiative, which aims to reduce, refine, and replace animal experiments.

Figures

Figure 1.
Coverage histogram of the M. fascicularis genome draft. The histogram shows the coverage distribution from the combined 454 and SOLiD reads of the M. fascicularis genome mapped to the reference genome (RheMac2). The frequency of nucleotide positions was plotted against the sequence coverage. Coverage exhibited a Poisson-like distribution with a mean of sixfold sequence coverage. The lowest bin in the histogram represents reference positions with zero aligned reads. About 80% of these noncovered positions are annotated as being repetitive in the RheMac2 draft.
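As a rough check on the Poisson-like shape described in this caption, the sketch below computes the coverage distribution expected under an idealized Poisson model with a mean of sixfold coverage. The pure Poisson model and the names `poisson_pmf` and `MEAN_COVERAGE` are assumptions made for illustration; this is not the authors' analysis.

```python
import math

# Assumption: coverage per position follows an ideal Poisson distribution.
MEAN_COVERAGE = 6.0  # mean sequence coverage stated in the caption


def poisson_pmf(k, mean):
    """Probability that a position is covered by exactly k reads."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# Expected fraction of positions at each coverage depth from 0 to 20.
expected = {k: poisson_pmf(k, MEAN_COVERAGE) for k in range(21)}

# Fraction of positions expected to have no aligned reads at all.
zero_fraction = expected[0]

for depth, fraction in expected.items():
    print(f"{depth:2d}x  {fraction:.4f}")
```

Under this idealized model only about 0.25% of positions (e^-6) would have zero coverage. A noticeably larger zero bin would therefore point to effects outside random sampling, which fits the caption's note that about 80% of noncovered positions are annotated as repetitive.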
Figure 2.
Representative examples of mapping efficiencies of M. fascicularis WGS reads to rhesus or human template genomes with different repeat and gap content. (A) Graphical comparison of a 50-kb fragment on chromosome 1 with low repeat and gap content. Unique sequences in rhesus in general have better coverage in M. fascicularis than repetitive segments, and gaps cannot be closed as an inherent feature of the WGS approach. For rhesus (top), unique stretches are shown in red, repetitive DNA in blue, and gaps in green. For the corresponding M. fascicularis section (middle) the local coverage is indicated from onefold (light gray) to ≥ sixfold (black), and gaps are shown in green. For human (bottom), only exons are shown as reference. (B) Display of a chromosome 1 region with increased gap and repeat content, where average coverage in M. fascicularis is significantly reduced due to ambiguous mapping. In some cases, the long 454 reads reduce gap size relative to rhesus. (C) Human genome-based identification of exons in a 20-kb rhesus genome gap based on homology and conservation of intron/exon boundaries in primates.
Figure 3.
Sequence identities between orthologous transcripts of M. fascicularis, M. mulatta, and H. sapiens. The 5′ UTR (left), CDS (middle), and 3′ UTR (right) of 10,919 orthologous mRNAs were considered separately for the calculation of pairwise sequence identities. The relative number of 1:1 orthologous sequences was plotted against the sequence identities. Frequency plots of sequence identities <100% between M. fascicularis and M. mulatta (Mf Mm), H. sapiens and M. fascicularis (Hs Mf), and H. sapiens and M. mulatta (Hs Mm) transcripts are displayed. Note that the peak sequence identities for the UTRs are significantly lower between humans and macaques than for the coding regions.
Figure 4.
SLCO solute carrier gene family evolution in primates (H. sapiens, M. mulatta, and M. fascicularis). (A) Phylogenetic tree of SLCO divergence based on DNAML maximum likelihood analysis from the PHYLIP software package. Human orthologs of SLCO-encoding genes were identified in the draft genomes of M. fascicularis and M. mulatta, and the calculated sequence relationship is shown in “substitution events per residue” units. Thin black lines denote long-distance relationships, and bold lines are used to display closer relationships, shown at 10-fold magnification for better resolution. Gray lines and red lines mark separate evolution in macaques and humans, respectively. The drug transporters SLCO1B3 and LST (RefSeq NM_001009562) show the highest degree of sequence diversity within macaques (marked by yellow circles). Differences below 0.35% within the macaques are marked by blue and green symbols. For simplicity, “SLCO” was omitted for labeling of individual family members. (B) Independent diversification of SLCO genes in primates. SLCO sequence divergence between M. fascicularis and human orthologs (Mf and Hs; x-axis) and between M. fascicularis and M. mulatta orthologs (Mf and Mm; y-axis) is displayed. SLCO6A1 shows the highest divergence between M. fascicularis and humans, and SLCO1B3 is the most diverse gene of the SLCO family within macaques. The black line indicates the average value and the hatched lines indicate ±SD. Note the different scales of the x- and y-axes.
Figure 5.
Microarray-based gene expression profiling in M. fascicularis. (A) Global variability of liver gene expression in 36 naive animals. Low-variance genes (LVGs; black dots) and high-variance genes (HVGs; red dots) were identified based on calculation of quantile differences of probe intensities. The 90% quantile of the log2 probe signal intensities is plotted on the x-axis, and the difference between the 90% and the 10% quantiles is plotted on the y-axis. The detection limit is marked by a dotted line. The metalloproteinase MT1B and the uridine phosphorylase 2 (UPP2) data points are denoted by a hatched green circle. (B) HVG-based clustering of animals according to geographical origin. Principal component analysis based on all HVG gene expression signals discriminates between animals from the Philippines (red dots) and animals from Mauritius (green dots) or a Chinese breeder (yellow dots). (C) Variability in baseline gene expression of cytochrome P450 isoforms and a panel of cytokines and response-related genes routinely used for drug safety assessment in humans. Scatter plots of expression levels (log2 values) of unambiguously annotated M. fascicularis cytochrome P450 genes (left) as well as key cytokines and the chemokine CCL2 (right). Data are sorted according to expression levels in ascending order from left to right. Black circles indicate the expression signals of individual animals. The mean expression signal per gene is depicted by red bars for M. fascicularis and blue bars for H. sapiens. The detection limit of the microarray platforms is indicated by dotted lines (red: M. fascicularis NimbleGen array; blue: Affymetrix human array). Green arrows denote differences in baseline expression levels in the two species.
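The quantile-difference measure in panel A can be sketched as follows: for each probe, take the 90% quantile of the log2 signals across animals (x-axis) and subtract the 10% quantile (y-axis). This is a minimal illustration only. The function names, the interpolation rule, and the `detection_limit` and `spread_cutoff` values are hypothetical, since the caption does not state the cutoffs the authors used.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def quantile_spread(log2_signals):
    """Return (90% quantile, 90% quantile minus 10% quantile) for one probe."""
    q90 = quantile(log2_signals, 0.90)
    q10 = quantile(log2_signals, 0.10)
    return q90, q90 - q10


def classify(log2_signals, detection_limit, spread_cutoff):
    """Label one probe; both thresholds are hypothetical placeholders."""
    q90, spread = quantile_spread(log2_signals)
    if q90 < detection_limit:
        return "undetected"
    return "HVG" if spread > spread_cutoff else "LVG"


# Made-up log2 signals for three probes across 11 animals (not real data).
stable_probe = [10.0] * 11
variable_probe = [float(v) for v in range(6, 17)]
silent_probe = [3.0] * 11

for name, probe in [("stable", stable_probe),
                    ("variable", variable_probe),
                    ("silent", silent_probe)]:
    print(name, classify(probe, detection_limit=7.0, spread_cutoff=2.0))
```

Using quantiles rather than the full range makes the spread less sensitive to a single outlying animal, which is one plausible reason to prefer this measure for a cohort of 36 individuals.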
