G3 (Bethesda). 2021 Aug 7;11(8):jkab177.
doi: 10.1093/g3journal/jkab177.

New genomic resources and comparative analyses reveal differences in floral gene expression in selfing and outcrossing Collinsia sister species

Lauren J Frazee et al. G3 (Bethesda). 2021.

Abstract

The evolutionary transition from outcross- to self-fertilization is one of the most common in angiosperms and is often associated with a parallel shift in floral morphological and developmental traits, such as reduced flower size and pollen to ovule ratios, known as the "selfing syndrome." How these convergent phenotypes arise, the extent to which they are shaped by selection, and the nature of their underlying genetic basis are unsettled questions in evolutionary biology. The genus Collinsia (Plantaginaceae) includes seven independent transitions from outcrossing or mixed mating to high selfing rates accompanied by selfing syndrome traits. Accordingly, Collinsia represents an ideal system for investigating this parallelism, but requires genomic resource development. We present a high quality de novo genome assembly for the highly selfing species Collinsia rattanii. To begin addressing the basis of selfing syndrome developmental shifts, we evaluate and contrast patterns of gene expression from floral transcriptomes across three stages of bud development for C. rattanii and its outcrossing sister species Collinsia linearis. Relative to C. linearis, total gene expression is less variable among individuals and bud stages in C. rattanii. In addition, there is a common pattern among differentially expressed genes: lower expression levels that are more constant across bud development in C. rattanii relative to C. linearis. Transcriptional regulation of enzymes involved in pollen formation specifically in early bud development may influence floral traits that distinguish selfing and outcrossing Collinsia species through pleiotropic functions. Future work will include additional Collinsia outcrossing-selfing species pairs to identify genomic signatures of parallel evolution.

Keywords: Collinsia; DESeq2; Hi-C scaffolding; RNA-seq; dichogamy; differential gene expression; evolutionary genomics; floral development; parallel evolution; pollen; selfing syndrome.

Figures

Figure 1
Phylogeny of Collinsia and its sister taxon Tonella based on Bayesian phylogenetic analysis. Figure adapted from Baldwin et al. (2011). Branch lengths are scaled to time in millions of years (black scale bar). Maximal basal calibration is at 15 Ma. Photos of flowers from Tonella and Collinsia sister species illustrate the repeated transition from large-flowered outcrossing species to small-flowered selfing species within both genera. Photos are to scale (purple scale bar). Focal species C. rattanii and its sister species C. linearis are highlighted in purple. Photos by Thomas Harper.
Figure 2
Distribution of C. rattanii genes across 7 chromosomes. The 7 major chromosomes are represented in each sub-figure by 7 horizontal gray lines. (A) Density of all genes, calculated in 1 Mb windows. The distribution of 45,542 genes is shown; the remaining 1668 genes are located on un-scaffolded contigs. (B, C) Physical maps of functionally annotated C. rattanii genes, using GO (gene ontology) terms for biological processes. Chromosome numbers are shown to the left of each chromosome (shown as gray lines). (B) 10,708 genes (shown as vertical black lines) with one or more GO term annotations for biological processes; another 156 such genes are located on un-scaffolded contigs. (C) Genes annotated with general GO terms for pollination (15 genes, in yellow), reproduction (57, in pink), development (31, in green), and growth (20, in black). Some lines may visually overlap one another.
Figure 3
Overall gene expression by species and bud stage. Principal component analysis (PCA) plot of regularized log-transformed RNA-seq read counts for 47,210 (38,589 expressed) genes in the C. rattanii genome. N = 3 for each species-stage combination. Principal component 1 (PC1) represents 50% of the variance in the read count dataset. (A) PC2 represents 14% of the variance in the read count dataset. Data points are jittered to reduce overlapping. (B) PC3 represents 7% of the variance in the read count dataset.
Figure 4
Overall gene expression by species and coordinates. Scatterplots with linear trend lines showing relationships between principal component coordinates and bud stages for C. rattanii and C. linearis. Each circle represents one sample (replicate). Bold red trend lines are statistically significant (P < 0.01); dotted trend lines are not significant.
Figure 5
Differential expression of interaction effect genes. Heatmap and hierarchical clustering of regularized log-transformed RNA-seq read counts for the 855 interaction effect genes in the C. rattanii genome. Expression levels were further standardized within each row (i.e., within each gene) by subtracting the row mean from each value in a row and dividing those values by the row standard deviation, resulting in a mean of 0 and a standard deviation of 1 for each row. Each row represents one gene’s full expression profile. Red = overexpression; blue = underexpression. N = 3 for each species-stage combination.
Figure 6
Interaction effect gene expression patterns by biological process. Heatmap of regularized log-transformed RNA-seq read counts (averaged across biological replicates) for the 91-gene subset of interaction effect genes that are also indirectly or directly annotated with the 11 (but see note below) significant GO terms as presented in Table 5. Expression levels were further standardized within each row (i.e., within each gene) by subtracting the row mean from each value in a row and dividing those values by the row standard deviation, resulting in a mean of 0 and a standard deviation of 1 for each row. Each row represents one gene’s full expression profile. Red = overexpression; blue = underexpression. *Note: These genes are annotated with 2 different significant GO terms, but only appear here in association with one. Additional annotations include: “sporopollenin biosynthetic pathway” for RCORA09826 and RCORA186672; “lipid metabolic process” for RCORA14962; “callose deposition in cell wall” for RCORA36422 and RCORA205649; and “regulation of transcription, DNA-templated” for RCORA141603, RCORA61416, and RCORA41810.
Figure 7
Magnitude and direction of expression changes in early and late development. Heatmaps representing magnitude and direction of significant expression change between bud stages in 164 early and 418 late development contrast effect genes for each species and hierarchical clustering of regularized log-transformed RNA-seq read counts for each gene. Red = positive expression change; blue = negative expression change.
Figure 8
Contrast effect gene expression patterns by biological process. Expression profiles for 17 genes showing significantly different patterns between the species in either early or late development and belonging to the 6 enriched GO categories corresponding to Table 6. RNA-seq read counts (y-axis) for each replicate are presented as either triangles (C. linearis) or circles (C. rattanii); lines connect species-specific count means. On the x-axis, stages in red (bold) indicate whether significant differences between species in gene expression profiles occur early (B1–B2) or late (B2–B3) in development. Note: Since GO term “sporopollenin biosynthetic process” is a subprocess of “pollen exine formation,” we categorize gene profiles according to 5 GO terms. *These two genes are also annotated by the significantly enriched GO term “sporopollenin biosynthetic process.”
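
The row standardization described in the captions for Figures 5 and 6 (subtract each gene's mean, then divide by that gene's standard deviation) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `standardize_rows` and the toy values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the captions do not say which estimator was used.

```python
from statistics import mean, stdev


def standardize_rows(matrix):
    """Z-score each row (gene) of a genes x samples expression matrix.

    Each row needs at least two values, because the sample standard
    deviation is undefined for a single observation.
    """
    standardized = []
    for row in matrix:
        row_mean = mean(row)
        # Sample SD (n - 1 denominator). This is an assumption; the figure
        # captions do not specify the estimator.
        row_sd = stdev(row)
        if row_sd == 0:
            # A gene with constant expression has no spread to scale by,
            # so its standardized profile is all zeros.
            standardized.append([0.0 for _ in row])
        else:
            standardized.append([(value - row_mean) / row_sd for value in row])
    return standardized


# Hypothetical transformed expression values: 2 genes x 4 samples.
# These numbers are illustrative only and are not data from the paper.
toy_matrix = [
    [5.0, 6.0, 7.0, 8.0],
    [2.0, 2.0, 4.0, 4.0],
]

z_scores = standardize_rows(toy_matrix)
for gene_profile in z_scores:
    print([round(value, 3) for value in gene_profile])
```

After this step every non-constant row has a mean of 0 and a standard deviation of 1, so heatmap colors reflect each gene's pattern across samples rather than its absolute expression level.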

References

    Alexa A, Rahnenführer J. 2019. topGO: Enrichment Analysis for Gene Ontology. R package version 2.44.0.
    Alexa A, Rahnenführer J, Lengauer T. 2006. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics. 22:1600–1607.
    Andrews S. 2010. FastQC: a quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc.
    Armbruster WS, Mulder CP, Baldwin BG, Kalisz S, Wessa B, et al. 2002. Comparative analysis of late floral development and mating-system evolution in tribe Collinsieae (Scrophulariaceae s.l.). Am J Bot. 89:37–49.
    Baldwin BG, Kalisz S, Armbruster WS. 2011. Phylogenetic perspectives on diversification, biogeography, and floral evolution of Collinsia and Tonella (Plantaginaceae). Am J Bot. 98:731–753.
