BMC Plant Biol. 2024 Jun 26;24(1):601.
doi: 10.1186/s12870-024-05171-9.

The cacao gene atlas: a transcriptome developmental atlas reveals highly tissue-specific and dynamically-regulated gene networks in Theobroma cacao L

Evelyn Kulesza et al. BMC Plant Biol.

Abstract

Background: Theobroma cacao, the cocoa tree, is a tropical crop grown for its highly valuable cocoa solids and fat, which are the basis of a 200-billion-dollar annual chocolate industry. However, the long generation time and the difficulties of breeding a tropical tree crop have limited breeders' progress in developing high-yielding, disease-resistant varieties. Developing marker-assisted breeding methods for cacao requires the discovery of genomic regions and specific alleles of genes encoding important traits of interest. To accelerate gene discovery, we developed a gene atlas composed of a large dataset of replicated transcriptomes, with the long-term goal of advancing breeding towards high-yielding elite varieties of cacao.

Results: We describe the creation of the Cacao Transcriptome Atlas and its global characterization, and define sets of genes co-regulated in highly organ- and temporally-specific manners. RNAs were extracted and transcriptomes sequenced from 123 different tissues and stages of development representing major organs and developmental stages of the cacao lifecycle. In addition, several experimental treatments and time courses were performed to measure gene expression in tissues responding to biotic and abiotic stressors. Samples were collected in replicates (3-5) to enable statistical analysis of gene expression levels, for a total of 390 transcriptomes. To promote wide use of these data, all raw sequencing data, expression read mapping matrices, scripts, and other information used to create the resource are freely available online. We verified our atlas by analyzing the expression of genes with known functions and expression patterns in Arabidopsis (ACT7, LEA19, AGL16, TIP13, LHY, MYB2) and found their expression profiles to be generally similar between the two species. We also successfully identified tissue-specific genes at two thresholds in many of the tissue types represented, as well as a set of genes highly conserved across all tissues.

Conclusion: The Cacao Gene Atlas consists of a gene expression browser with a graphical user interface and open access to raw sequencing data files, as well as the unnormalized and CPM-normalized read count data mapped to several cacao genomes. The gene atlas is a publicly available resource that allows rapid mining of cacao gene expression profiles. We hope this resource will be used to help accelerate the discovery of important genes for key cacao traits such as disease resistance and contribute to the breeding of elite varieties to help farmers increase yields.

Keywords: Cacao genomics; Gene expression; Tissue-specificity; Transcriptome atlas.
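
The Conclusion refers to unnormalized and CPM-normalized read counts, and the figure captions color tissues by the mean CPM of replicate samples. As a rough illustration of what those quantities mean, the sketch below applies the plain textbook definition of counts per million (raw count divided by the library total, times one million) and averages replicates per tissue. The function names and the toy numbers are hypothetical, and the authors' actual pipeline may compute library sizes differently.

```python
import numpy as np


def counts_to_cpm(counts):
    """Convert a genes x samples matrix of raw read counts to counts per million.

    Assumes every sample (column) has at least one mapped read.
    """
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0)  # total mapped reads per sample
    return counts / library_sizes * 1e6


def mean_cpm_by_tissue(cpm, tissue_labels):
    """Average the replicate columns that share a tissue label."""
    labels = np.asarray(tissue_labels)
    tissues = sorted(set(tissue_labels))
    means = np.column_stack(
        [cpm[:, labels == t].mean(axis=1) for t in tissues]
    )
    return tissues, means


# Toy data (made-up numbers): 3 genes, 2 tissues with 2 replicates each.
raw = np.array([
    [10, 20, 0, 0],
    [90, 180, 50, 100],
    [0, 0, 50, 100],
])
cpm = counts_to_cpm(raw)
tissues, mean_cpm = mean_cpm_by_tissue(cpm, ["leaf", "leaf", "root", "root"])
```

Because each column is scaled by its own library total, every sample sums to one million after normalization, which is what makes replicates of different sequencing depth comparable.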

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Botanical illustration of orthotropic and plagiotropic cacao plants. Illustrations showing (A) a 6-month-old orthotropic plant undergoing a new leaf flush and (B) the trunk and jorquette of a mature plagiotropic plant growing fruits and flowers. Orthotropic plants grow vertically and produce leaves with spiral phyllotaxy, while plagiotropic plants form five branches growing at a fixed angle from vertical growth and producing leaves with alternate phyllotaxy (most leaves in image removed to highlight jorquette structure). Clusters of inflorescences (commonly called floral cushions) form on main trunks or branches. A mature fruit is depicted; a cross section and its sub-tissues are shown in Fig. 2A. Scale bars are indicated in cm
Fig. 2
Gene expression profile of cacao actin gene. Sub-atlases represented in the cacao gene atlas. Tissues represented across the six sub-atlases that compose the Cacao Gene Atlas, including (A) the developmental atlas, (B) drought-diurnal atlas, (C) leaf infection atlas, (D) leaf development atlas, (E) meristem atlas, and (F) seed atlas. The expression of Actin (ACT7), generally considered constitutively expressed in all tissues, is depicted. Tissues are colored according to the mean number of mapped reads per million (CPM) of replicate samples. Color scale is depicted in each sub-atlas
Fig. 3
Atlas validation gene expression via eFP browser. Gene expression as viewed in the BAR eFP browser of (A) Late embryogenesis abundant 19 (LEA19), expressed in late seed development and in response to drought, (B) Agamous-like 16 (AGL16), a transcription factor involved in specification of floral organ identity, (C) Gamma-tonoplast intrinsic protein 3 (TIP13), a root-specific aquaporin transporter, (D) Late elongated hypocotyl (LHY), a transcription factor involved in regulating circadian rhythm, and (E) a MYB transcription factor (MYB2), which regulates dehydration response in plants. A-C are represented by the Developmental Atlas, while D-E are represented by the Drought and Diurnal Atlas. Tissues are colored according to the mean number of mapped reads per million (CPM) of replicate samples. Color scale is depicted in each sub-atlas
Fig. 4
Validation of gene expression patterns with genes of known and highly conserved expression profiles. (A) Log2 of mean number of mapped reads per million for the developmental atlas. Genes represented include Actin (ACT7), generally considered constitutively expressed in all tissues, Late embryogenesis abundant 19 (LEA19), expressed in late seed development and in response to drought, Agamous-like 16 (AGL16), a transcription factor involved in specification of floral organ identity, and Gamma-tonoplast intrinsic protein 3 (TIP13), a root-specific aquaporin transporter. (B) Log2 of mean number of mapped reads per million for the drought/diurnal atlas. Genes represented include a MYB transcription factor (MYB2), which regulates dehydration response in plants, and Late elongated hypocotyl (LHY), a transcription factor involved in regulating circadian rhythm
Fig. 5
“Extremely” Tissue-specific genes in the T. cacao gene atlas. Bar plot displaying the number of “extremely” tissue-specific genes expressed above 30 CPM for each tissue type in the T. cacao gene atlas. Genes were identified as expressed if they reached a read count threshold greater than 30 CPM in most replicates as described in the Methods. Genes were defined as “extremely” tissue-specific if they met the 30 CPM threshold and were not identified as expressed in another tissue type at that threshold. Red dashed line represents the mean number of tissue-specific genes per library in the T. cacao gene atlas
Fig. 6
“Functionally” Tissue-specific genes in the T. cacao gene atlas. Bar plot displaying the number of “functionally” tissue-specific genes expressed above 30 CPM for each tissue type in the T. cacao gene atlas. Genes were identified as expressed if they reached a read count threshold greater than 30 CPM in most replicates as described in the Methods. Genes were defined as “functionally” tissue-specific if they met the 30 CPM threshold and their mean expression in said tissue was twice the global expression of the same gene excluding said tissue. Red dashed line represents the mean number of tissue-specific genes per library in the T. cacao gene atlas
Fig. 7
Heatmap of Most and Least Conserved T. cacao Genes by Coefficient of Variation (CV). Heatmap of 10 most and least conserved genes by CV across all replicates in the T. cacao Gene Atlas. Genes are plotted across the x-axis and tissue types are plotted across the y-axis. Expression values are plotted as log-transformed means for a respective tissue. Read counts were normalized by adding one read count to each value. Mean expression and CV for a respective gene across the atlas are plotted in bar plots underneath each heatmap. Heatmaps and bar plots were assembled using the R package ComplexHeatmap
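
The captions of Figs. 5-7 describe three per-gene calculations: an expression call at 30 CPM, two definitions of tissue specificity, and a coefficient of variation (CV). The sketch below is one possible reading of those captions, not the authors' scripts. Several points are assumptions: "most replicates" is taken to mean strictly more than half, "global expression excluding said tissue" is taken to be the mean of the other tissues' mean CPM, "twice" is applied as at least twice, and CV is the sample standard deviation divided by the mean across all replicate columns. All function names and the toy matrix are hypothetical.

```python
import numpy as np

THRESHOLD_CPM = 30.0  # threshold stated in the captions of Figs. 5 and 6


def expressed_calls(cpm, labels, tissues):
    """Genes x tissues booleans: above threshold in more than half of the
    replicates of a tissue (assumed meaning of "most replicates")."""
    labels = np.asarray(labels)
    cols = []
    for t in tissues:
        reps = cpm[:, labels == t]
        cols.append((reps > THRESHOLD_CPM).sum(axis=1) > reps.shape[1] / 2)
    return np.column_stack(cols)


def tissue_means(cpm, labels, tissues):
    """Genes x tissues matrix of mean CPM over replicates."""
    labels = np.asarray(labels)
    return np.column_stack([cpm[:, labels == t].mean(axis=1) for t in tissues])


def extremely_specific(expressed):
    """Expressed in this tissue and in no other tissue at the threshold."""
    only_one = expressed.sum(axis=1, keepdims=True) == 1
    return expressed & only_one


def functionally_specific(expressed, means, fold=2.0):
    """Expressed, and mean CPM at least `fold` times the mean of the other
    tissues (assumed reading of the Fig. 6 caption)."""
    n_tissues = means.shape[1]
    others = (means.sum(axis=1, keepdims=True) - means) / (n_tissues - 1)
    return expressed & (means >= fold * others)


def coefficient_of_variation(cpm):
    """Per-gene CV across all replicate columns; NaN where the mean is zero."""
    cpm = np.asarray(cpm, dtype=float)
    mean = cpm.mean(axis=1)
    sd = cpm.std(axis=1, ddof=1)
    return np.divide(sd, mean, out=np.full_like(mean, np.nan), where=mean > 0)


# Toy CPM matrix (made-up numbers): 4 genes, 3 tissues x 2 replicates.
labels = ["leaf", "leaf", "root", "root", "seed", "seed"]
tissues = ["leaf", "root", "seed"]
cpm = np.array([
    [100.0, 120.0, 0.0, 5.0, 2.0, 0.0],      # expressed in leaf only
    [200.0, 220.0, 40.0, 50.0, 35.0, 45.0],  # expressed everywhere, leaf-enriched
    [50.0, 50.0, 50.0, 50.0, 50.0, 50.0],    # uniform expression
    [40.0, 10.0, 0.0, 0.0, 0.0, 0.0],        # above threshold in one replicate only
])

expressed = expressed_calls(cpm, labels, tissues)
means = tissue_means(cpm, labels, tissues)
extreme = extremely_specific(expressed)
functional = functionally_specific(expressed, means)
cv = coefficient_of_variation(cpm)
```

In this toy matrix the second gene is expressed in every tissue, so it fails the "extremely" definition but passes the "functionally" one for leaf, which illustrates why the two definitions can yield different gene counts per tissue. The uniformly expressed gene has a CV of zero, the kind of profile the Fig. 7 caption calls most conserved.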
