Plant Cell. 2025 Jul 1;37(7):koae277.
doi: 10.1093/plcell/koae277.

Expansion of the MutS gene family in plants

Daniel B Sloan et al. Plant Cell.

Abstract

The widely distributed MutS gene family functions in recombination, DNA repair, and protein translation. Multiple evolutionary processes have expanded this gene family in plants relative to other eukaryotes. Here, we investigate the origins and functions of these plant-specific genes. Cyanobacterial-like MutS1 and MutS2 genes were ancestrally gained via plastid endosymbiotic gene transfer. MutS1 was subsequently lost in seed plants, whereas MutS2 was duplicated in Viridiplantae (i.e. land plants and green algae). Viridiplantae also have 2 anciently duplicated copies of the eukaryotic MSH6 gene and acquired MSH1 via horizontal gene transfer, potentially from a nucleocytovirus. Despite sharing a name, "plant MSH1" is not directly related to the MSH1 gene in some fungi and animals, which may be an ancestral eukaryotic gene acquired via mitochondrial endosymbiosis and subsequently lost in most eukaryotes. There has been substantial progress in understanding the functions of plant MSH1 and MSH6 genes, but the cyanobacterial-like MutS1 and MutS2 genes remain uncharacterized. Known functions of bacterial homologs and predicted protein structures, including fusions to diverse nuclease domains, provide hypotheses about potential molecular mechanisms. Because most plant-specific MutS proteins are mitochondrial and/or plastid-targeted, the expansion of this family has played a large role in shaping plant organelle genetics.

Conflict of interest statement

None declared.

Figures

Figure 1.
The plant MutS protein family. A) Summary of domain architecture for representatives of each of the 8 MutS subfamilies that are found in plants, highlighting fusions with different predicted nuclease domains. Arabidopsis proteins are shown as representatives for each MutS type except MutS1 for which the moss Physcomitrium is used because Arabidopsis and other seed plants appear to have lost the cyanobacterial-like MutS1 that was likely acquired from plastids. Domain annotation was based on InterProScan except that Domains II-IV were not detected in Arabidopsis MSH1 and were added manually based on sequence alignment and structural homology and that the Mrr_cat domain was predicted by CD-Search Tool. Despite the identified similarity to an Mrr_cat domain, actual nuclease activity may be unlikely (see main text). B) Structures of bacterial MutS1 (E. coli PDB: 7AI5) and bacterial MutS2 (Thermus thermophilus PDB: 7VUF) dimers, showing examples of MutS enzymes both with (MutS1) and without (MutS2) the N-terminal mismatch recognition and connector domains. Color coding reflects domain architecture shown in panel A. Bacterial MutS1 is divided into 5 defined domains: mismatch binding domain (Domain I, residues 1 to 115, deep purple), connector domain (Domain II, residues 116 to 266, blue), core domain (Domain III, 267 to 443 and 504 to 567, teal), clamp and levers domain (Domain IV, 444 to 503, green), and ATPase domain plus helix-turn-helix domain (Domain V, 568 to 765 and 765 to 800, yellow) (Bhairosing-Kok 2021; Fernandez-Leiro et al. 2021). Bacterial MutS2 is divided into 3 defined domains: core domain (Domain III, residues 1 to 131 and 248 to 277, teal), clamp and levers domain (Domain IV, residues 132 to 247, green), and ATPase domain plus helix-turn-helix domain (Domain V, residues 248 to 486, yellow) (Fukui et al. 2022). 
The absence of Domains I and II in bacterial MutS2 creates a DNA binding site of more than 70 Å that can accommodate Holliday junctions or D-loops (Fukui et al. 2022). C) Unrooted maximum-likelihood phylogenetic tree showing each of the 8 MutS subfamilies in which plant representatives have been identified. Plant lineages are highlighted in green. Clades with bootstrap support >80% are indicated with circles. Branch lengths here and elsewhere represent amino acid substitutions per site. See Methods for information on tree reconstruction, domain prediction, and visualization of protein structures.
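As a reading aid for panel B, the E. coli MutS1 domain boundaries quoted in the caption can be written as a small lookup table. This is only a sketch: the residue ranges are copied as printed in the caption (including the shared residue 765 in the two Domain V ranges), while the names `ECOLI_MUTS1_DOMAINS` and `domain_of` are hypothetical helpers introduced here, not part of the paper's methods.

```python
# Residue ranges for E. coli MutS1, copied from the Figure 1 caption.
# The dictionary and function names are illustrative assumptions.
ECOLI_MUTS1_DOMAINS = {
    "Domain I (mismatch binding)": [(1, 115)],
    "Domain II (connector)": [(116, 266)],
    "Domain III (core)": [(267, 443), (504, 567)],
    "Domain IV (clamp and levers)": [(444, 503)],
    "Domain V (ATPase plus helix-turn-helix)": [(568, 765), (765, 800)],
}


def domain_of(residue, domains=ECOLI_MUTS1_DOMAINS):
    """Return the first domain whose inclusive ranges contain `residue`, else None."""
    for name, ranges in domains.items():
        for start, end in ranges:
            if start <= residue <= end:
                return name
    return None


if __name__ == "__main__":
    # Residues 36 and 38 are the mismatch-recognition positions named in Figure 5.
    for position in (36, 38, 450, 700):
        print(position, domain_of(position))
```

With these ranges, positions 36 and 38 fall in Domain I, which is the domain that bacterial MutS2 lacks.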
Figure 2.
Viridiplantae maintain ancient duplicates of the MSH6 and MutS2 genes. Maximum-likelihood phylogenetic trees based on sequences of A) MSH6 and B) MutS2 proteins. In both cases, 2 ancient gene copies are conserved across all sampled green algae and land plants (highlighted clades). Bootstrap values are reported for clades with >50% support. MutS1 and MSH3 sequences were used as outgroups to root the MSH6 tree, and MutS1 and MSH5 were used as outgroups to root the MutS2 tree. See Methods for information on tree reconstruction.
Figure 3.
Origins and diversity of MutS proteins with mitochondrial and/or plastid function in eukaryotes. A) Unrooted maximum-likelihood phylogeny with representatives of the MutS subfamilies found in eukaryotes. Sampling emphasizes bacterial-like MutS1 proteins that include the mitochondrial-targeted MSH1 protein that has been described in yeast and other eukaryotes. Representatives of other MutS subfamilies are collapsed into clades (triangles). The indicated grouping of alphaproteobacterial-like MutS1/MSH1 proteins includes representatives from multiple eukaryotic supergroups: Amorphea (Amoebozoa, animals, choanoflagellates, and fungi), TSAR (stramenopiles), Discoba (Andalucia and Naegleria), and Archaeplastida (with the genus Galdieria being the only identified representative). The bacteria within this encircled group are all Alphaproteobacteria. This analysis did not recover a single MSH1 clade that was nested within the Alphaproteobacteria or that was monophyletic and sister to the Alphaproteobacteria, but that may reflect distortions from long branches and other artefacts. The eukaryotes within this group generally exhibit the strongest similarity to Alphaproteobacteria when searched against all bacteria in the NCBI nr database with BLASTP, and previous phylogenetic analyses with more extensive sampling of bacterial diversity have also recovered affinities for Alphaproteobacteria (Hofstatter and Lahr 2021). An independent plastid-derived origin of MutS1-like proteins in eukaryotes is also indicated, as the encircled group shows representatives from Cyanobacteria and Archaeplastida (see Fig. 4 for details). The tree also shows that these MutS1-like proteins are not directly related to the organelle-targeted “plant” MSH1 protein or to MutS7 proteins, which are found in mitochondria of octocorals, as well as some nucleocytoviruses and Epsilonproteobacteria. Clades with bootstrap support >80% are indicated with circles. 
B) Maximum-likelihood phylogeny of the “plant” MSH1 clade shows that these proteins are found in lineages outside of Viridiplantae, including various protists, a small clade of Gammaproteobacteria, and some nucleocytoviruses. This disjunct phylogenetic distribution is clear evidence of HGT, but reconstruction of the specific ancient transfer events is challenging given the level of resolution for deep nodes in the tree (numerical values indicate bootstrap support) and the potential for long-branch artefacts. For a more extensive sampling of “plant” MSH1 diversity within the Viridiplantae, see Bai and Guo (2023). MutS1 and MSH2 representatives were used to root the tree. See Methods for information on tree reconstruction, domain prediction, and visualization of protein structures.
Figure 4.
A MutS1-like gene in plants. The major Archaeplastida lineages (glaucophytes, red algae, and Viridiplantae) all have representatives with a gene that shows similarities to cyanobacterial MutS1, although this gene appears to be absent from seed plants. The maximum-likelihood phylogenetic tree (left) is based on MutS1 protein sequences, using MSH2 for rooting. Bootstrap values are reported for clades with >50% support. The summary of domain architectures (center) reports the classic configuration of MutS1 domains, as well as fusion with an N-terminal DPD1 protein in many representatives of the Viridiplantae lineage or a C-terminal domain with similarities to an Mrr_cat restriction endonuclease domain in Cyanophora (but see main text for reasons to be skeptical that this domain functions as a nuclease in Cyanophora). Organelle targeting values (right) reflect the probability of subcellular localization to the mitochondria or plastids based on computational predictions by TargetP v2.0 (Armenteros et al. 2019). Many (but not all) MutS1 proteins in the Archaeplastida lineage have some evidence of mitochondrial and/or plastid localization. As expected, none of the bacterial or MSH2 sequences have any predicted organelle targeting. See Methods for information on tree reconstruction, domain prediction, and protein targeting analysis.
Figure 5.
Structural features of cyanobacterial-like MutS1 proteins in plants. A) Aligned motif within MutS1 Domain I, highlighting the Phe36 and Glu38 residues that are required for mismatch recognition. Only variants relative to the consensus sequence are shown. The substitutions (shown in red) at these positions in the streptophytic alga Zygnema and the ferns Adiantum and Ceratopteris suggest that the MutS1 proteins in these species may have compromised MMR activity. B) Structure of E. coli MutS1 MMR domain (light blue) in complex with dsDNA (orange) with a mismatched G-T base-pair (highlighted in black) from PDB accession 7AI6. The Phe36 and Glu38 residues (highlighted in red) directly interact with this mismatch. The highlighted nucleotides and amino acid residues are also shown in ball-stick representation. C) Structural model of the N-terminal DPD1 domain (light blue) from the Physcomitrium patens MutS1 protein superimposed on the dsDNA substrate (orange) from the crystal structure of TREX2 (PDB 6A47). Predicted active-site residues (Asp276, Glu278, Asp370, His431, and Asp436) are highlighted in red and shown in ball-stick representation. The identity of these residues is conserved in all sampled plant DPD1-MutS1 proteins.
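The panel A comparison amounts to asking whether each aligned sequence keeps Phe at position 36 and Glu at position 38 (E. coli numbering). A minimal sketch of that check follows, assuming an ungapped alignment window whose first column has a known E. coli residue number. The function name `motif_status` and the two input strings are made-up toy values for illustration only; they are not sequences from the paper.

```python
# Positions required for mismatch recognition, as stated in the Figure 5 caption
# (E. coli MutS1 numbering): Phe36 and Glu38.
KEY_RESIDUES = {36: "F", 38: "E"}


def motif_status(window, first_position, key_residues=KEY_RESIDUES):
    """Check an ungapped alignment window against the expected key residues.

    window: one-letter amino acid string aligned to E. coli numbering.
    first_position: E. coli residue number of window[0].
    Returns {position: (observed_residue_or_None, matches_expected)}.
    """
    status = {}
    for position, expected in key_residues.items():
        index = position - first_position
        observed = window[index] if 0 <= index < len(window) else None
        status[position] = (observed, observed == expected)
    return status


if __name__ == "__main__":
    # Toy windows starting at position 34; NOT real sequences from any species.
    toy_windows = {"toy_conserved": "GDFYELF", "toy_substituted": "GDLYDLF"}
    for label, window in toy_windows.items():
        print(label, motif_status(window, first_position=34))
```

A sequence failing either check would be flagged in the same way the caption flags the substitutions in Zygnema, Adiantum, and Ceratopteris, although a real analysis would also have to handle alignment gaps.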
Figure 6.
History of MutS gene family expansion in plants with the inferred timing of gene gains, losses, and duplications indicated on the tree. Gene presence or absence in a lineage is indicated by boxes and dots, respectively. In general, phylogenetic reconstructions of deep splits in MutS gene trees are not well supported or are prone to long-branch artefacts, so alternative scenarios are possible. For example, the MSH6 and MutS2 duplications could have occurred earlier and been followed by losses in lineages outside of Viridiplantae. A MutS1 loss is indicated for chlorophytes because it appears to be absent from most sampled species in this lineage. However, the presence of MutS1 in Cymbomonas tetramitiformis indicates that it has not been lost from all chlorophytes. This figure was generated with BioRender.

References

    1. Abdelnoor R. Cloning and characterization of MSH1 in higher plants and its involvement in regulation of substoichiometric shifting. Lincoln, NE: University of Nebraska; 2004.
    2. Abdelnoor RV, Christensen AC, Mohammed S, Munoz-Castillo B, Moriyama H, Mackenzie SA. Mitochondrial genome dynamics in plants and animals: convergent gene fusions of a MutS homologue. J Mol Evol. 2006:63(2):165–173. doi: 10.1007/s00239-005-0226-9
    3. Abdelnoor RV, Yule R, Elo A, Christensen AC, Meyer-Gauen G, Mackenzie SA. Substoichiometric shifting in the plant mitochondrial genome is influenced by a gene homologous to MutS. Proc Natl Acad Sci. 2003:100(10):5968–5973. doi: 10.1073/pnas.1037651100
    4. Adé J, Belzile F, Philippe H, Doutriaux MP. Four mismatch repair paralogues coexist in Arabidopsis thaliana: AtMSH2, AtMSH3, AtMSH6-1 and AtMSH6-2. Mol Gen Genet. 1999:262(2):239–249. doi: 10.1007/pl00008640
    5. Aravind L, Makarova KS, Koonin EV. Survey and summary: Holliday junction resolvases and related nucleases: identification of new families, phyletic distribution and evolutionary trajectories. Nucleic Acids Res. 2000:28(18):3417–3432. doi: 10.1093/nar/28.18.3417