Nat Genet. 2025 Mar;57(3):729-740.
doi: 10.1038/s41588-024-02071-4. Epub 2025 Feb 17.

The Marchantia polymorpha pangenome reveals ancient mechanisms of plant adaptation to the environment

Chloé Beaulieu et al. Nat Genet. 2025 Mar.

Abstract

Plant adaptation to terrestrial life started 450 million years ago and has played a major role in the evolution of life on Earth. The genetic mechanisms allowing this adaptation to a diversity of terrestrial constraints have been mostly studied by focusing on flowering plants. Here, we gathered a collection of 133 accessions of the model bryophyte Marchantia polymorpha and studied its intraspecific diversity using selection signature analyses, a genome-environment association study and a pangenome. We identified adaptive features, such as peroxidases or nucleotide-binding and leucine-rich repeats (NLRs), also observed in flowering plants, likely inherited from the first land plants. The M. polymorpha pangenome also harbors lineage-specific accessory genes absent from seed plants. We conclude that different land plant lineages still share many elements from the genetic toolkit evolved by their most recent common ancestor to adapt to the terrestrial habitat, refined by lineage-specific polymorphisms and gene family evolution.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. M. polymorpha collection, phylogenetic tree and population structure.
a, Pictures of representative M. polymorpha accessions. Tak-1, CA, BoGa, Gen-B, RES49 and Nor-E belong to M. polymorpha ssp. ruderalis. RES52 belongs to M. polymorpha ssp. montivagans. b, Phylogenetic tree of the M. polymorpha subspecies complex, comprising 104 M. polymorpha ssp. ruderalis, 16 M. polymorpha ssp. montivagans and 13 M. polymorpha ssp. polymorpha accessions. The phylogenetic tree was computed using a dataset of 107,744 pruned SNPs. Dots on the world map indicate the sampling location, and colors represent country or geographic regions of origin. Some sampling locations are linked to their tree branch to give examples of the discrepancy between geographical origin and phylogenetic position. The colored branches on the tree also correspond to the geographic origin. Bar plots on the top of the tree indicate the assignment probability of each M. polymorpha accession to each of the five main genetic groups (the three subspecies and three populations in the ruderalis subspecies, colored in shades of blue and green). These clusters were identified by the Bayesian clustering algorithm implemented in the fastSTRUCTURE software. Map data from CIA World DataBank II (https://www.evl.uic.edu/pape/data/WDB/).
Fig. 2. Genome-wide distribution and allele frequency spectrum of the main functional variant categories and functional enrichment analysis of genes putatively under background selection or balancing selection in a collection of 104 accessions of M. polymorpha ssp. ruderalis.
In a, a total of 5,439,674 SNPs (4,363,099 intergenic, 509,964 in introns, 115,218 missense, 88,163 synonymous, 183,858 3′ UTR, 154,821 5′ UTR and 24,551 5′ UTR premature start codon gain, that is, a variant in a 5′ UTR region producing a three-base sequence that can be a start codon) were used to construct the allele frequency spectrum. In b, a total of 7,823 SNPs (4,010 stop gained, 1,948 stop lost, 608 stop retained, 606 splice acceptor, 385 splice donor and 266 start lost) were used to construct the allele frequency spectrum. All SNPs in a,b are distributed on the eight autosomes and sexual chromosome U or V. In c, functional enrichment analysis of a list of 777 genes putatively under background selection (top 10% of genes by Zeng’s E negative value). In d, functional enrichment analysis of a list of 1,814 genes putatively under balancing selection (top 10% of genes by Tajima’s D positive values). In c,d, IPR terms are ordered from top to bottom according to decreasing significance (false discovery rate q value < 0.05), together with some associated known gene names.
Fig. 3. Identification of loci associated with climatic conditions by gene–environment association analyses in M. polymorpha ssp. ruderalis.
a, Principal-component (PC) plot of 96 M. polymorpha ssp. ruderalis individuals based on temperature-linked bioclimatic variables (BIO1 to BIO11). Var., variance. b, Contributions of temperature-linked bioclimatic variables (BIO1 to BIO11) to the two first principal components. c, Miami plots (mirror comparison of two Manhattan plots) of the GEA results on the first components of distinct PCA for two categories of bioclimatic variables from the WorldClim 2 data: temperature-linked (BIO1 to BIO11) and precipitation-linked (BIO12 to BIO19) variables. These Miami plots result from a classical genome-wide association analysis performed with GEMMA, followed by use of the local score technique on SNP P values (bilateral Wald test) to amplify the signal between SNPs in LD. The result of this process is a Lindley value that has been used instead of a P value to plot the y values of the Manhattan plots. The dashed lines represent significance thresholds for each chromosome (resampling thresholds from the local score method). TF, transcription factor.
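The local score step mentioned in the Fig. 3 caption can be illustrated with a minimal sketch. It assumes the usual form of the Lindley process, in which each SNP contributes a score of -log10(P) minus a tuning constant, and the running sum is reset to zero whenever it would become negative. The function name `lindley_process`, the tuning constant `xi = 2.0` and the example P values are illustrative assumptions, not values taken from the study.

```python
import math

def lindley_process(p_values, xi=2.0):
    """Return the Lindley process for P values ordered along one chromosome.

    xi is a hypothetical tuning constant: only SNPs with -log10(P) > xi
    increase the running score.
    """
    lindley = []
    running = 0.0
    for p in p_values:
        score = -math.log10(p) - xi          # per-SNP score
        running = max(0.0, running + score)  # accumulate, floor at zero
        lindley.append(running)
    return lindley

# Made-up P values for five consecutive SNPs (illustration only).
example_p = [0.5, 1e-3, 1e-4, 0.2, 1e-5]
print(lindley_process(example_p))
```

Because neighbouring SNPs with moderately small P values add up, a cluster of linked signals yields a higher peak than any single SNP alone, which is the amplification effect the caption describes.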
Fig. 4. An atypical kinase linked to the response of M. polymorpha to temperature and precipitation variation.
a, Haplotype block illustration of the genomic region on chromosome 8 associated with both temperature-linked and precipitation-linked variables in M. polymorpha ssp. ruderalis and containing two genes: one coding for an ABC1 atypical kinase (Mp8g04680) and one for a cytochrome C1 protein (Mp8g04690). The block on the left side of the figure represents bioclimatic variable values at the sampling site of each accession that led to the identification of the ABC1 cytochrome C1 region via GEA. The main matrix represents the allelic status of the SNPs in this region for each accession. The clustered matrix was generated with the ComplexHeatmap R package. Precip, precipitation; temp, temperature; srad, solar radiation; prec, precipitation. b, Close-up view of Tajima’s D sliding window analysis in the genomic region surrounding the ABC1K gene of M. polymorpha ssp. ruderalis (Mp8g04680) on the genome browser (https://marchantia.info/variants/). This view highlights the specificity of balancing selection (high Tajima’s D values) observed in the coding sequence of Mp8g04680 and in its promoter region compared to the local genomic region (50- and 200-kb windows at the top and bottom, respectively). The gold boxes represent the genomic location of Mp8g04680. M, million. c, Phylogenetic tree of the orthologs of the ABC1K gene of M. polymorpha ssp. ruderalis illustrating direct orthology with the ABC1K7 gene in A. thaliana. The names of the different species from which the ABC1K genes originate are specified with a six-letter code (Ulvmut, Ulva mutabilis; Chlrei, Chlamydomonas reinhardtii; TrespOTU1, Trebouxia spOTU1; Klenit, Klebsormidium nitens; Chabra, Chara braunii; Mesend, Mesotaenium endlicherianum; Selmoe, Selaginella moellendorffii; AntagrOXF, Anthoceros agrestis cv. oxford; Marpal, M. paleacea; Marpolrud, M. polymorpha ssp. ruderalis; Phypat, Physcomitrium patens; Cerpur, Ceratodon purpureus; Adicap, Adiantum capillus-veneris; Nymcol, Nymphaea colorata; Ambtri, Amborella trichopoda; Phaequ, Phalaenopsis equestris; Bradis, Brachypodium distachyon; Spipol, Spirodela polyrhiza; Zosmar, Zostera marina; Procyn, Protea cynaroides; Aqucoe, Aquilegia coerulea; Helann, Helianthus annuus; Daucar, Daucus carota; Camsin, Camellia sinensis; Budalt, Buddleja alternifolia; Sollyc, Solanum lycopersicum; Jugreg, Juglans regia; Drydru, Dryas drummondii; Aratha, A. thaliana; Medtru, M. truncatula; Cepfol, Cephalotus follicularis; Thecac, Theobroma cacao; Betpat, Beta patula; Cucsat, Cucumis sativus; Poptri, Populus trichocarpa; Eucgra, Eucalyptus grandis).
Fig. 5. Landscape of gene presence–absence variation in the Marchantia genus pangenome.
a, HOGs are sorted by their occurrence, with the genes shared by all the Marchantia species on the left (separated from the rest by a dashed line) and the very rare genes on the right part of the matrix. The accessions (in rows) are sorted according to their position in the phylogenetic tree of the accessions generated by the software OrthoFinder to infer these HOGs. Colors represent the geographic origin of each accession. b, Genetic features and traits shared by all species in the Marchantia genus.
Fig. 6. The pangenome of M. polymorpha ssp. ruderalis.
a, Increase in pangenome size and decrease in core genome size when adding accessions to the M. polymorpha ssp. ruderalis pangenome (500 random samplings for the box plot, for which the center is the median of values, the bounds of the boxes are the first and third quartiles, that is, the 25th and 75th percentiles, and the whiskers extend 1.5 times the interquartile range beyond the quartiles). This saturation plot indicates a closed pangenome in M. polymorpha ssp. ruderalis. b, Composition of the pangenome regarding the HOG categories (core, accessory and rare). The histogram shows the number of HOGs with different frequencies of presence in accessions, and the pie chart shows the proportion of HOGs in each category. c, Bar plot representing IPR terms significantly enriched in the accessory genome. Redundant IPR terms and IPR terms linked with transposable elements and viral proteins were discarded to improve readability. d, Phylogeny of the orthologs of fungal lectin genes present in M. polymorpha ssp. ruderalis.
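The saturation analysis summarized in Fig. 6a can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each accession is represented by the set of HOGs it carries, draws random subsets of accessions at each sample size, and records the pangenome size (HOGs present in at least one sampled accession) and the core size (HOGs present in all of them). The function name `saturation_curve`, the toy accession data and the choice to report medians instead of full box plots are all assumptions made for the example.

```python
import random
from statistics import median

def saturation_curve(presence, n_samplings=500, seed=0):
    """Median pangenome and core sizes for each number of sampled accessions.

    presence: list of sets, one set of HOG identifiers per accession.
    Returns a dict {k: (median_pan_size, median_core_size)}.
    """
    rng = random.Random(seed)
    curve = {}
    for k in range(1, len(presence) + 1):
        pan_sizes, core_sizes = [], []
        for _ in range(n_samplings):
            subset = rng.sample(presence, k)               # k accessions, no replacement
            pan_sizes.append(len(set.union(*subset)))         # present in at least one
            core_sizes.append(len(set.intersection(*subset))) # present in all
        curve[k] = (median(pan_sizes), median(core_sizes))
    return curve

# Hypothetical toy data: three accessions and four HOGs.
toy_presence = [
    {"HOG1", "HOG2", "HOG3"},
    {"HOG1", "HOG2"},
    {"HOG1", "HOG3", "HOG4"},
]
for k, (pan, core) in saturation_curve(toy_presence, n_samplings=50).items():
    print(k, pan, core)
```

When the pangenome size stops growing as more accessions are added, the curve has reached a plateau, which is the pattern the caption interprets as a closed pangenome.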
