Mol Biol Evol. 2022 Oct 7;39(10):msac206.
doi: 10.1093/molbev/msac206.

Recurrent but Short-Lived Duplications of Centromeric Proteins in Holocentric Caenorhabditis Species


Lews Caro et al. Mol Biol Evol. 2022.

Abstract

Centromeric histones (CenH3s) are essential for chromosome inheritance during cell division in most eukaryotes. CenH3 genes have rapidly evolved and undergone repeated gene duplications and diversification in many plant and animal species. In Caenorhabditis species, two independent duplications of CenH3 (named hcp-3 for HoloCentric chromosome-binding Protein 3) were previously identified in C. elegans and C. remanei. Using phylogenomic analyses in 32 Caenorhabditis species, we find strict retention of the ancestral hcp-3 gene and 10 independent duplications. Most hcp-3L (hcp-3-like) paralogs are only found in 1–2 species, are expressed in both males and females/hermaphrodites, and encode histone fold domains with 69–100% identity to ancestral hcp-3. We identified novel N-terminal protein motifs, including putative kinetochore protein-interacting motifs and a potential separase cleavage site, which are well conserved across Caenorhabditis HCP-3 proteins. Other N-terminal motifs vary in their retention across paralogs or species, revealing potential subfunctionalization or functional loss following duplication. An N-terminal extension in the hcp-3L gene of C. afra revealed an unprecedented protein fusion, where hcp-3L fused to duplicated segments from hcp-4 (nematode CENP-C). By extending our analyses beyond CenH3, we found gene duplications of six inner and outer kinetochore genes in Caenorhabditis, which appear to have been retained independent of hcp-3 duplications. Our findings suggest that centromeric protein duplications occur frequently in Caenorhabditis nematodes, are selectively retained for short evolutionary periods, then degenerate or are lost entirely. We hypothesize that unique challenges associated with holocentricity in Caenorhabditis may lead to this rapid "revolving door" of kinetochore protein paralogs.

Keywords: centromeric histone; gene duplication; kinetochore; protein motifs.


Figures

Fig. 1.
Ten independent hcp-3 duplications in Caenorhabditis species. A schematic representation of ancestral centromeric histone genes (hcp-3, black) and their duplicates (hcp-3L, blue) is shown alongside a Caenorhabditis species tree (adapted from http://caenorhabditis.org). hcp-3 duplication events are represented on the species tree with blue dots and numbered L1 through L10, with paralogs arising from independent duplications assigned different numbers. Genes in the syntenic neighborhood near hcp-3 and hcp-3L are represented in gray and labeled with their orthologous gene names in C. elegans. In some cases, 1–3 genes were inserted between hlh-11 and F58A4.6 within the syntenic neighborhood of hcp-3. The white arrow with a question mark represents a possible loss of hcp-3L2 in C. zanzibari. Ends of genomic scaffolds are denoted with two slashes. On the right, we show percent amino acid identities between the paralog and ancestral hcp-3 of each species (in the N-terminal tail or HFD).
Fig. 2.
Phylogenetic analysis of hcp-3 and hcp-3L genes from Caenorhabditis species. A maximum likelihood tree of a DNA, codon-based alignment of the HFD of ancestral hcp-3 (black) and hcp-3 paralogs (blue) is shown. Bootstrap values of 40 and above are indicated. Bootstrap values in parentheses are from corresponding nodes from a maximum likelihood tree based on an amino acid alignment of the HFD (see supplementary fig. S2, Supplementary Material online). In all except a few instances, the nucleotide and amino acid trees are in agreement, with higher bootstrap support observed in the nucleotide tree. For the exceptions (nodes representing hcp-3 or hcp-3L2 in C. tribulationis, C. sinica, C. sp41, and C. zanzibari, and the node representing hcp-3 in C. nouraguensis, C. becei, and C. macrosperma), bootstrap values were not included here since they were lower in the amino acid tree and because they do not alter conclusions from the nucleotide tree. A scale bar (branch lengths, substitutions per site) is shown at the bottom-right. On the right, thick lines show hcp-3 paralogs from the same species; the dashed line shows the second duplicate found in C. species 48.
Fig. 3.
hcp-3L genes are expressed in both sexes in Caenorhabditis species. RT-PCR of ancestral hcp-3 (top), hcp-3L (middle), or tbb-2 (bottom; loading control) in species with hcp-3 duplicates. RNA from a mixed worm population of various larval stages, from L4 or young adult females/hermaphrodites, or from L4 or young adult males was used.
Fig. 4.
Differential retention of N-terminal tail motifs across HCP-3 and HCP-3L proteins encoded by Caenorhabditis species. (A) Logo plots of 11 protein motifs within HCP-3 N-terminal tails discovered from an analysis of Caenorhabditis species without duplications. Motifs 12 and 13 are C-terminal motifs (not shown, see supplementary fig. S4, Supplementary Material) that reside within the HFD. The e-values of all motifs were below 10⁻⁵. Asterisks above logo plots for motifs 1, 3, and 4 indicate residues that are highly conserved within the motif. The proportions of all 32 ancestral HCP-3 proteins (black) and 14 HCP-3L duplicates (blue) that have retained each motif are shown. (B) A Caenorhabditis species tree is shown with schematics of the protein motifs that are present (numbered boxes) in ancestral HCP-3 (black) or HCP-3L (blue) in each species. The presence of motif 1 in C. elegans and motif 4 in C. sp54 was not detected by unsupervised MAST searches but was subsequently ascertained through manual alignments (see supplementary data S4, Supplementary Material). All proteins contained a conserved, C-terminal HFD (not shown). Filled black boxes represent three motifs that show the highest retention in Caenorhabditis HCP-3 proteins. A structure of the N-terminal tail of HCP-3 in the last common ancestor of Caenorhabditis was inferred based on the retention and loss of motifs in the N-terminal tail. L1–L10 on the species tree indicate hcp-3 duplication events as in Figure 1. A scale bar (number of residues) is shown on the bottom-right.
Fig. 5.
Two unusual Caenorhabditis hcp-3L paralogs arose by internal duplication or gene fusion. (A) Schematic of the exon structure (left) and protein motif structure (right) of C. sp54 hcp-3 (top) and hcp-3L3 (bottom). Portions of hcp-3 exon 3 (light blue), exon 4 (dark blue), and exon 5 (orange) are duplicated within the N-terminal tail of hcp-3L3 (dashed arrow). Similarly, motifs 5–10 are duplicated within the N-terminal tail of HCP-3L3. Motif 13 resides within the HFD and is missing in HCP-3L3. The HFD is not within the duplicated region. (B) Schematic of the exon structure of C. afra hcp-3L8 (middle) with homology to C. afra hcp-4 (top) and C. afra hcp-3 (bottom). The first five exons of hcp-3L8 are homologous to C. afra hcp-4 exons 1 and 2 (light red) as well as a portion of exon 3 (dark red). The last five exons of hcp-3L8 are homologous to C. afra hcp-3 (black). The HFD and the N-terminal tail of hcp-3 are denoted. Percent amino acid identities between protein-coding exons are shown. (C) Primers were designed to span exons that are homologous to hcp-3 and hcp-4 within hcp-3L8. A schematic of the gene (top) shows the primers (blue) used to amplify the hcp-4-hcp-3 fusion region. RT-PCR of C. afra hcp-3L8 and tbb-2 in males and females (bottom) confirms expression of a chimeric transcript. +RT and −RT indicate cDNA preparation with or without reverse transcriptase enzyme, respectively.
Fig. 6.
Duplication of kinetochore proteins in Caenorhabditis species. A schematic representation of ancestral (black) and duplicate (gray) copies of seven kinetochore genes (hcp-4, knl-2, knl-1, zwl-1, spdl-1, ndc-80, and him-10) is shown alongside a Caenorhabditis species tree. hcp-3 duplication events are denoted as blue dots on the species tree, as in Figure 1. The unique fusion between C. afra hcp-4 and hcp-3 duplicates is shown in gray and blue. Incomplete sequence information in genomic scaffolds is denoted with i, and apparent pseudogenes are denoted as unfilled arrows. The double slash in the C. brenneri knl-2 duplicate indicates that the sequence was split between two scaffolds. # indicates two potential pseudogenization events in zwl-1 that are likely to represent sequencing errors.
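
Several captions above report percent amino acid identity between a paralog and the ancestral gene. As a minimal sketch of how such a value can be computed from two already-aligned protein sequences, the Python function below counts identical residues over ungapped alignment columns. The function name, the choice of denominator (ungapped columns only), and the toy sequences are assumptions made for illustration; the captions do not state which convention the authors used, and the sequences are not real HCP-3 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise or multiple alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): 7 ungapped columns, 6 identical.
identity = percent_identity("MKTAYIAK", "MKSAY-AK")
print(f"{identity:.1f}% identity")
```

Other denominators (full alignment length, or the length of the shorter sequence) give different values for the same alignment, so the convention should be stated whenever identities are compared across regions such as the N-terminal tail and the HFD.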
