BMC Genomics. 2011 Nov 8;12:552.
doi: 10.1186/1471-2164-12-552.

Exploring the gonad transcriptome of two extreme male pigs with RNA-seq


Anna Esteve-Codina et al. BMC Genomics.

Abstract

Background: Although RNA-seq greatly advances our understanding of complex transcriptome landscapes, such as those found in mammals, complete RNA-seq studies in livestock and in particular in the pig are still lacking. Here, we used high-throughput RNA sequencing to gain insight into the characterization of the poly-A RNA fraction expressed in pig male gonads. An expression analysis comparing different mapping approaches and detection of allele specific expression is also discussed in this study.

Results: By sequencing testicle mRNA of two phenotypically extreme pigs, one Iberian and one Large White, we identified hundreds of unannotated protein-coding genes (PcGs) in intergenic regions, some of them presenting orthology with closely related species. Interestingly, we also detected 2047 putative long non-coding RNAs (lncRNAs), including 469 with human homologues. Two methods, DEGseq and Cufflinks, were used for analyzing expression. DEGseq identified 15% fewer expressed genes than Cufflinks because DEGseq uses only unambiguously mapped reads. Moreover, a large fraction of the transcriptome is made up of transposable elements (14500 elements encountered), as has been reported in previous studies. Gene expression results between microarray and RNA-seq technologies were relatively well correlated (r = 0.71 across individuals). Differentially expressed genes between Large White and Iberian showed a significant overrepresentation of gamete production and lipid metabolism gene ontology categories. Finally, allelic imbalance was detected in ~4% of heterozygous sites.

Conclusions: RNA-seq is a powerful tool to gain insight into complex transcriptomes. In addition to uncovering many unannotated genes, our study allowed us to determine that a considerable fraction of the transcriptome is made up of long non-coding transcripts and transposable elements. Their biological roles remain to be determined in future studies. Differences in expression between Large White and Iberian pigs were largest for genes involved in spermatogenesis and lipid metabolism, which is consistent with the extreme phenotypic differences in prolificacy and fat deposition between these two breeds.
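The comparison between microarray and RNA-seq measurements rests on two simple quantities: a log2 fold change between breeds and a Pearson correlation across genes. The sketch below illustrates both in plain Python. It is not the authors' pipeline: the function names are hypothetical, and the choice to drop genes whose fold change is infinite or undefined (zero expression in one or both samples) before correlating is an assumption made for illustration.

```python
import math


def log2_fold_change(expr_a, expr_b):
    """Return log2(expr_a / expr_b).

    Zero expression in one sample gives an infinite value; zero in both
    gives NaN. Both cases are filtered out before correlating.
    """
    if expr_a == 0 and expr_b == 0:
        return math.nan
    if expr_b == 0:
        return math.inf
    if expr_a == 0:
        return -math.inf
    return math.log2(expr_a / expr_b)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def correlate_finite(xs, ys):
    """Correlate only the pairs in which both values are finite."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if math.isfinite(x) and math.isfinite(y)]
    if len(pairs) < 3:
        raise ValueError("need at least three finite pairs")
    kept_x, kept_y = zip(*pairs)
    return pearson_r(kept_x, kept_y)
```

For example, `correlate_finite([log2_fold_change(a, b) for a, b in rnaseq_pairs], microarray_differences)` would compare the two platforms gene by gene, where `rnaseq_pairs` and `microarray_differences` are placeholder inputs supplied by the reader.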


Figures

Figure 1
LncRNA conservation in mammals. The heatmap summarizes the screening of the 2047 newly discovered pig lncRNAs against eighteen mammal genomes. The columns represent the mammal genomes and the rows the query lncRNAs. Each spot shows the result of searching one pig lncRNA against one genome. Green spots represent hits with high similarity scores. Black spots indicate low similarity scores. Red spots indicate that no homolog was detected.
Figure 2
Venn diagrams of the predicted homologues in human and cow. a) 469 pig lncRNAs presented homology with human. 15 pig lncRNAs overlap with human lncRNAs, 316 overlap with human PcG annotations and 131 lncRNAs presented homology with unannotated human DNA regions. b) Comparison of lncRNAs having a homolog in human and in cow.
Figure 3
Measuring gene expression. a) DEGseq vs. Cufflinks estimates of log2 fold changes between Large White and Iberian expressed genes. Blue and red points correspond to genes not expressed in microarrays and Cufflinks, respectively. Light blue and light red points correspond to microarray and Cufflinks infinite values. b) Microarray vs. RNA-Seq individual measurements. The microarray data correspond to the signal intensity difference between Large White and Iberian, whereas the RNA-Seq measurement is the log2 fold change as obtained from Cufflinks. c) Microarray breed z-score values vs. RNA-Seq log2 fold change. The Pearson's correlations (r) were significant in each case (P-value < 2.2 × 10^-16) and were calculated considering only expressed genes and no infinite values.
Figure 4
Overlapping of differentially expressed genes. Top: Differentially expressed genes identified by DEGseq and Cufflinks. Bottom: Differentially expressed genes identified by microarrays (breed z-scores) and RNA-Seq (Cufflinks).
Figure 5
Expression levels according to annotation. a) Boxplots of expression level (log10 FPKM) for annotated coding genes, novel coding genes, lincRNA and transcripts with TE. The black line represents the median. b) Boxplots of the transcript unit length in base pairs (log10). c) Boxplots of the GC content (log10) using the reference annotation for transcriptome assembly. d) Boxplots of the GC content (log10) without using the reference annotation.
Figure 6
Allele specific expression. a) Coverage versus posterior mean of allele transcription rate (p); each point represents a SNP; red points are SNPs showing significant ASE and black points are SNPs with no significant ASE. b) Barplot of coverage versus absolute value of p. There was no consistent relation between ASE and coverage.
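To make the idea of a posterior mean allele transcription rate concrete, the following sketch treats the reads at one heterozygous SNP as binomial draws and places a Beta prior on the reference-allele fraction. The model used in the paper is not described on this page, so everything here is an assumption: the uniform Beta(1, 1) prior, the function names, and the exact two-sided binomial test used as a simple stand-in for the significance call.

```python
from math import comb


def posterior_mean_ref_fraction(ref_reads, alt_reads, prior_a=1.0, prior_b=1.0):
    """Posterior mean of the reference-allele fraction.

    With a Beta(prior_a, prior_b) prior and binomial read counts, the
    posterior is Beta(ref_reads + prior_a, alt_reads + prior_b).
    """
    total = ref_reads + alt_reads + prior_a + prior_b
    return (ref_reads + prior_a) / total


def binomial_two_sided_p(ref_reads, alt_reads):
    """Exact two-sided binomial test against balanced expression (p = 0.5)."""
    n = ref_reads + alt_reads
    k = min(ref_reads, alt_reads)
    # Probability of a count at least as extreme in one tail, under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # The null distribution is symmetric, so double one tail and cap at 1.
    return min(1.0, 2 * tail)
```

A SNP covered by 10 reference reads and no alternative reads gives a posterior mean of 11/12 and a small p-value, whereas a 5/5 split gives a posterior mean of 0.5 and a p-value of 1. Because the test is exact, low-coverage SNPs simply have less power rather than an inflated signal.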
