J Exp Bot. 2025 Jul 2;76(10):2743-2762.
doi: 10.1093/jxb/eraf005.

Chronology of transcriptome and proteome expression during early Arabidopsis flower development

Raquel Álvarez-Urdiola et al. J Exp Bot.

Abstract

The complex gene regulatory landscape underlying early flower development in Arabidopsis has been extensively studied through transcriptome profiling, and gene networks controlling floral organ development have been derived from the analyses of genome-wide binding of key transcription factors. In contrast, the dynamic nature of the proteome during the flower development process is much less understood. In this study, we characterized the floral proteome at different stages during early flower development and correlated it with unbiased transcript expression data. Shotgun proteomics and transcript profiling were conducted using an APETALA1 (AP1)-based floral induction system. A specific analysis pipeline to process the time-course proteomics data was developed. In total, 8924 proteins and 23 069 transcripts were identified. Co-expression analysis revealed that RNA-protein pairs clustered in various expression pattern modules. An overall positive correlation between RNA and protein level changes was observed, but subgroups of RNA-protein pairs with anti-correlated gene expression changes were also identified and found to be enriched in hormone-responsive pathways. In addition, the RNA-seq dataset reported here further expanded the identification of genes whose expression changes during early flower development, and its combination with previously published AP1 ChIP-seq datasets allowed the identification of additional direct and high-confidence targets of AP1.

Keywords: APETALA1; Arabidopsis; co-expression analysis; flower development; proteome; target genes; transcriptome.

Conflict of interest statement

The authors declare that they have no conflicts of interest.

Figures

Fig. 1.
Overview of experimental design and the reliability analysis pipeline for addressing missing values in proteomics. (A) Experimental setup for the temporal transcriptomic and proteomic analysis of flower development. Inflorescence samples of four biological replicates of 40–80 pAP1:AP1-GR ap1 cal plants each were collected immediately after DEX application (D0), and at 1, 2, 3, 4, and 5 d (D1–D5) after treatment. Imputation of missing values considering their biological context: time-series abundances (log2TOP3) of AP3 (B) and TFL1 (C) before and after the reliability analysis (RA), and after k-nearest neighbor (kNN) imputation. (D) Proportion of proteins considered as reliably detected (RD), unreliably detected (UD), unreliably undetected (UU), and reliably undetected (RU) for each time point. (E) Distribution of peptide-based sequence coverage of proteins that were reliably or unreliably detected in at least one time point—day—(quantified), and of those that were reliably or unreliably undetected at every time point (discarded). (F) Pie charts displaying percentage and number of proteins identified by <3, 3–10, or >10 peptidic fragments before and after the reliability analysis (the peptide-based coverage of 1685 of the 1891 discarded proteins was lower than three peptides per protein).
Fig. 2.
Expression of the ‘supermarker’ proteins throughout the time-course and effects of data imputation. Protein abundance is shown as log2(TOP3), before and after the reliability analysis (RA), and after kNN imputation. mRNA abundance as normalized counts is also shown.
Fig. 3.
Expression behavior of stage-variant proteins and genes (SVPs and SVGs). (A) Heatmap displaying the abundance levels of the 2037 SVPs throughout the time-course. Color scale represents Z-scored TOP3 values. (B) Gene expression patterns of the 8125 SVGs. Color scale represents Z-scored normalized RNA counts.
Fig. 4.
Differentially expressed genes during early flower development. Comparison of the RNA-seq data from this study with previous microarray and RNA-seq data (Table 3). (A) Venn diagram showing the number of RNA-seq DEGs and SVGs identified in this study and of DEGs from previous reports, and the overlap between the datasets. (B) Number of up- and down-regulated DEGs identified in this study (RNA-seq) compared with those in Wellmer et al. (2006) (W) and in Ryan et al. (2015) (R) (microarray data) at each day-to-previous-day comparison (adj. P-value <0.05 for our data and Wellmer et al., and adj. P-value <0.01 for Ryan et al.).
Fig. 5.
Identification of AP1 target genes. (A) Overlap between AP1 target genes identified in different studies. AP1 high-confidence target genes were as defined in Kaufmann et al. (2010), or in this study if they appeared on the list of putative targets from Kaufmann et al. (2010) and were detected as robustly differentially expressed in the D1 versus D0 RNA-seq data comparison. Genes were defined as AP1 confidence targets if they appeared on the target list from Chen et al. (2018) and were also detected as differentially expressed in that study, or also detected as robustly differentially expressed in at least one day-to-previous day comparison of the RNA-seq dataset reported here. See the Materials and methods. Data are from Supplementary Table S6. (B) Gene Ontology (GO) enrichment analysis of the novel AP1 targets identified in this study. The set of 262 novel AP1 targets includes 105 high-confidence targets and 157 confidence targets. Dot plot representing the most enriched biological process GO categories (adjusted P-value <0.05) for the novel AP1 targets, colored by their adjusted P-value. Dot size represents the number of genes identified in each category. Gene ratio is the proportion between the number of identified genes for each category and the total number of genes registered for that category. (C) Enriched biological process GO categories for the AP1 high-confidence targets that code for transcription factors, colored by their log fold change (LFC) in the D1 versus D0 comparison. (D) Enriched biological process GO categories for the AP1 confidence targets that code for transcription factors, colored by their gene expression pattern (as described in Fig. 3).
Fig. 6.
Gene and protein classification according to abundance through time. (A) Histogram of RNA expression range. Gray, all detected protein-coding transcripts; orange, protein-coding transcripts not detected as a protein by LC-MS/MS; dark purple, protein-coding transcripts quantified as a protein; light purple, transcripts corresponding to a protein identified by LC-MS/MS but that was discarded because it was classified as undetected (UU or RU) at every time point (i.e. not quantified). Dashed line indicates TPM=1. (B) Schema illustrating the number of expressed genes, stage-variant genes (SVGs), quantified proteins, and stage-variant proteins (SVPs). The sum of gene–protein pairs differs from the number of genes and proteins identified separately as there are cases of a single gene ID (AGI) associated with more than one Uniprot code, and vice versa. (C) Scatter plot of protein abundance and RNA expression level for all RNA–protein pairs at every time point. Colored by RNA–protein correlation (Spearman’s rank coefficient, ρ). Positive if ρ≥0.4. Negative if ρ≤ –0.4. Significant if P-value <0.05. The coefficient of determination R² was calculated for the data points represented in the graph.
Fig. 7.
Correlation and trajectory patterns for gene–protein pairs. (A) Spearman’s rank correlation coefficient (ρ) between RNA and protein levels of each pair depending on the SV–NV classification. Red lines, median ρ of each subset. Dashed lines indicate the limits considered for positive and negative correlation. Circles represent ρ values of ‘supermarkers’ (black), markers (pink), and AP1 targets (gray). Squares represent the ρ for the markers and AP1 targets depicted in (B). (B) Z-scored RNA and protein abundances (Z-scored separately) of selected genes: the seven ‘supermarkers’ (ρ≥0.4) (SVG–SVP: AP2, AP3, PI, TFL1, CRC, FIL-YAB1; SVG–NVP: LFY), two markers with ρ≤ –0.8 (SVG–SVP: WOX13, AT4G27980), one marker and AP1 target with ρ≥0.8 (SVG–SVP: SOC1), and two AP1 targets with non-significant ρ (NVG–NVP: LUT1, PYL1).
Fig. 8.
Trajectory patterns for gene–protein pairs. Trajectory clustering (WGCNA) for SVG–SVP (18 modules) (A), NVG–SVP (18 modules) (B), and SVG–NVP (25 modules) (C) gene–protein pairs. Black bar graphs on the right side of each heatmap indicate the number of gene–protein pairs included in each module. The average ρ value for gene–protein pairs included in each SVG–SVP (A) module is shown; for the NVG–SVP (B) and SVG–NVP (C) modules, this value was between –0.4 and 0.4 in all cases (no-correlation, ‘gray’).
Fig. 9.
Protein–protein interaction clusters. Network depicting physical interactions and co-expression in interaction clusters between proteins included in the LC-MS/MS dataset and other proteins in Arabidopsis (IntAct, STRING). The five largest interaction clusters (Clusters 1–5) are shown, as well as two interaction clusters that only contain proteins from a specific metabolic pathway (Clusters 13 and 16). The main KEGG pathways for each cluster are shown (see also Supplementary Table S10). Clusters 1, 3, and 5 contain proteins involved in developmental processes and stress responses, whereas clusters 2, 4, 13, and 16 are enriched in proteins related to metabolic pathways. Node border represents RNA levels, and node inner color represents protein levels. Blue, decreasing trajectories; red, increasing trajectories; salmon, trajectories with a maximum peak (increase–decrease); light blue, trajectories with a minimum (decrease–increase); gray, non-variant. Proteins not included in the LC-MS/MS dataset are depicted with rectangles.
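
The captions of Figs 6 and 7 classify each RNA–protein pair by Spearman's rank coefficient across the time course (positive if ρ≥0.4, negative if ρ≤ –0.4). The following is a minimal sketch of that classification rule only, not the authors' pipeline. The function names and the six-point D0–D5 example values are hypothetical, and the P-value <0.05 significance filter mentioned in the caption is not computed here.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant series has no defined rank correlation
    return cov / (var_x * var_y) ** 0.5


def classify(rho, threshold=0.4):
    """Apply the caption's cut-offs; NaN falls through to 'no correlation'."""
    if rho >= threshold:
        return "positive"
    if rho <= -threshold:
        return "negative"
    return "no correlation"


# Hypothetical D0-D5 values for one gene-protein pair (not data from the study).
rna_counts = [10, 20, 35, 50, 80, 120]                 # normalized RNA counts
protein_log2 = [18.1, 18.4, 18.9, 19.5, 19.4, 20.2]    # log2(TOP3) abundance

rho = spearman_rho(rna_counts, protein_log2)
print(round(rho, 3), classify(rho))
```

Because the coefficient depends only on ranks, the RNA counts and the log2 protein abundances can be compared directly without putting them on a common scale.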
