Development. 2023 Jan 15;150(2):dev201155.
doi: 10.1242/dev.201155. Epub 2023 Jan 18.

NANOGP1, a tandem duplicate of NANOG, exhibits partial functional conservation in human naïve pluripotent stem cells

Katsiaryna Maskalenka et al. Development.

Abstract

Gene duplication events can drive evolution by providing genetic material for new gene functions, and they create opportunities for diverse developmental strategies to emerge between species. To study the contribution of duplicated genes to human early development, we examined the evolution and function of NANOGP1, a tandem duplicate of the transcription factor NANOG. We found that NANOGP1 and NANOG have overlapping but distinct expression profiles, with high NANOGP1 expression restricted to early epiblast cells and naïve-state pluripotent stem cells. Sequence analysis and epitope-tagging revealed that NANOGP1 is protein coding with an intact homeobox domain. The duplication that created NANOGP1 occurred earlier in primate evolution than previously thought and has been retained only in great apes, whereas Old World monkeys have disabled the gene in different ways, including homeodomain point mutations. NANOGP1 is a strong inducer of naïve pluripotency; however, unlike NANOG, it is not required to maintain the undifferentiated status of human naïve pluripotent cells. By retaining expression, sequence and partial functional conservation with its ancestral copy, NANOGP1 exemplifies how gene duplication and subfunctionalisation can contribute to transcription factor activity in human pluripotency and development.

Keywords: Evolution; Gene duplication; Pluripotency; Pseudogene; Reprogramming; Transcription factor.

Conflict of interest statement

Competing interests: The authors declare no competing or financial interests.

Figures

Fig. 1.
NANOGP1 is highly expressed in human naïve pluripotent stem cells and epiblast cells. (A) Ranked expression of 6922 pseudogenes in naïve hPSCs. Analysis was performed using a custom annotation of pseudogenes. The y-axis has been cut off at −4 log2 RPM. (B) Examples of highly expressed pseudogenes in naïve hPSCs. Pseudogenes of pluripotency factors are in dark purple. Analysis performed using a custom annotation of pseudogenes. Data show mean and data points from three independent samples. (C) RNA-seq data for NANOG, SLC2A14 and NANOGP1 in naïve and primed hPSCs (Collier et al., 2017). (D) NANOGP1 expression in naïve (blue) and primed (red) hPSC lines (Guo et al., 2016; Pastor et al., 2016; Takashima et al., 2014; Theunissen et al., 2016). Data show mean and data points from three independent samples (except for the WIBR2 and WIBR3 lines, which have one data point each). (E) NANOG and NANOGP1 expression in human pre-implantation embryos (Petropoulos et al., 2016). 8 cell, eight-cell stage (n=78); Morula (n=185); early ICM, early inner cell mass (n=66); early TE, early trophectoderm (n=227); EPI, epiblast (n=45); PE, primitive endoderm (n=30); TE, trophectoderm (n=715). Horizontal lines indicate the median. (F) NANOG and NANOGP1 expression in epiblast cells from human peri-implantation and early post-implantation cultured embryos (Xiang et al., 2020). Day 6 (n=60); day 7 (n=33); day 8 (n=11); day 9 (n=12); day 10 (n=14); day 12 (n=22); day 14 (n=26). Horizontal lines indicate the median.
Fig. 2.
Predicted open reading frame structure of NANOGP1. (A) Splicing analysis of NANOGP1 in naïve hPSCs (Takashima et al., 2014). The numbers in between the RNA-seq peaks indicate the number of times a splicing event was measured. The three predicted patterns of transcript splicing are underneath. (B) Predicted transcript isoforms of NANOGP1, including the size of exons and introns (in bp), and translation start and stop codons. The transcript structure of NANOG is shown for comparison. (C) Predicted NANOGP1 open reading frame (ORF) variants and domain structures. The ORF of NANOG is shown for comparison. Differences in the NANOGP1 ORFs versus the NANOG ORF are indicated. Amino acid substitutions caused by missense DNA changes are labelled by red vertical lines; silent changes are labelled by grey vertical lines. 8×W, tryptophan-rich subdomain/region containing eight tryptophan (W) residues; Δ2×W, deletion of two tryptophan residues from the tryptophan-rich subdomain; HD, DNA-binding homeodomain.
Fig. 3.
NANOGP1 duplication in human evolution. (A) Top: NANOG/NANOGP1 tandem duplication locus [distance (bp) between the genes/pseudogene]. Bottom: self-alignment of a 250 kb region containing NANOGNB, NANOG, NANOGP1 and another duplicated gene pair, SLC2A14 and SLC2A3 (genes indicated by boxes along x- and y-axes). Individual dots represent matching base pairs between the two aligned sequences. Circles indicate three areas of high sequence conservation between the ancestral and duplicated regions. (B) Sequence similarity and locations of the three regions identified in A (left) and between the exons and upstream regions of NANOG and NANOGP1 (right). (C) Conservation of the NANOG/NANOGP1 tandem duplication locus across species. Predicted duplication dates are indicated with two red vertical lines; predicted NANOGP1 deletion events are indicated with red triangles. (D) Amino acid alignment compares the homeodomain sequences of NANOGP1 orthologs. Colour indicates different types of amino acids, according to their biochemical properties. Asterisks indicate that the amino acid is the same for all aligned sequences. (E) ATAC-seq (Pastor et al., 2016) and ChIP-seq (Chovanec et al., 2021) profiles across the NANOG and NANOGP1 loci in naïve and primed hPSCs. The sequences labelled ‘a-d’ indicate two duplicated pairs of regulatory regions. (F) Comparison of the regulatory regions a-d. Left: individual dots represent matching base pairs between the two aligned sequences. Right: GC content ratio graphs in which the x-axis represents the length of a putative regulatory region in bp, and the y-axis shows GC content within 30 bp sliding windows. The average GC content ratios over the indicated regions are shown.
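The sliding-window GC content described for panel F of Fig. 3 can be illustrated with a short Python sketch. This is not the authors' code: the function names, the step size of 1 bp, and the definition of GC content as the fraction of G and C bases per window are assumptions made for illustration. Only the 30 bp window size comes from the legend.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence (case-insensitive)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def sliding_gc(seq: str, window: int = 30, step: int = 1) -> List[float]:
    """GC fraction for each window of `window` bp, moved along by `step` bp.

    The 30 bp default mirrors the legend; the 1 bp step is an assumption.
    A sequence shorter than one window yields an empty list.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence, not taken from the NANOG or NANOGP1 loci.
    toy = "GGCCAATT"
    print(sliding_gc(toy, window=4, step=2))
```

Averaging the returned values over a region would give a single summary figure per region, comparable in spirit to the averages mentioned in the legend, although the exact ratio the authors plotted is not specified here.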
Fig. 4.
NANOGP1 encodes a protein that is expressed in human pluripotent cells. (A) CRISPR/Cas12a strategy to target NANOGP1 and insert in-frame V5 or 3xFLAG epitope tags. (B) Left: genotyping strategy with primer positions (arrows). Right: integration of the tags into the NANOGP1 locus in naïve hPSCs. WT, untransfected naïve hPSCs; V5 lane 1 and V5 lane 2, two independent lines with V5 integrated at the NANOGP1 locus; FLAG lane 1 and FLAG lane 2, two independent lines with 3xFLAG integrated at the NANOGP1 locus. (C) Nuclear localisation of V5-NANOGP1 in small colonies of polyclonal transgenic naïve hPSCs, and overlap with OCT4 and DAPI signal. White arrows indicate the V5-positive colony. The other visible colonies are V5 negative and presumably not successfully targeted. Scale bars: 100 µm. (D) Western blot of co-immunoprecipitation experiments. Protein samples from transgenic polyclonal naïve hPSCs were immunoprecipitated with either V5 (upper) or FLAG (lower) antibodies. The immunoprecipitated material was examined by western blot using antibodies against the epitope tag (left), the NANOG C terminus that also detects NANOGP1 (centre), and the NANOG N terminus that does not detect NANOGP1 due to an N-terminal deletion (right). The grey asterisks indicate that, due to the low number of NANOGP1-epitope tagged cells in the polyclonal population, the proteins were detected only in the immunoprecipitated samples and not in the input samples.
Fig. 5.
NANOGP1 has gene autorepressive activity. (A) Induction of NANOG-GFP and NANOGP1-GFP transgenes in naïve hPSCs, as monitored by GFP expression. RT-qPCR values are relative to HMBS expression and normalised to the 72 h+DOX samples. Mean and data points from three independent samples are shown. Unpaired t-test (two-tailed; ***P=0.0003, ****P<0.0001). (B) Western blot showing DOX-induced overexpression of NANOG and NANOGP1 in naïve hPSCs. (C,D) Endogenous NANOG and NANOGP1 expression levels in naïve hPSCs with DOX-inducible NANOG (C) and NANOGP1 (D) transgenes. Primers target the 5′UTR of either NANOG or NANOGP1. RT-qPCR values are relative to HMBS expression and normalised to the 18 h samples. Mean and data points from three independent samples are shown. Unpaired t-test (two-tailed; **P<0.01, ***P<0.001, ****P<0.0001). (E) Endogenous NANOG and NANOGP1 expression levels in primed hPSCs with DOX-inducible NANOGP1 transgene. Mean and data points from two independent samples are shown. Unpaired t-test (two-tailed; ***P<0.001). O/E, overexpression.
Fig. 6.
NANOGP1 is a strong inducer of naïve pluripotency. (A) Experimental design for transgene-induced primed to naïve hPSC reprogramming. (B) Expression of pluripotency markers in established naïve and primed hPSCs (left), and in cultures after 12 days of DOX-induced reprogramming (right). RT-qPCR values are relative to HMBS expression and normalised to naïve hPSCs (left) and to the NANOG+KLF2 sample (right). All three NANOGP1 isoforms were tested. Mean and data points from three independent experiments are shown. Right: one-way ANOVA with Dunnett's multiple comparisons test compared all samples with the KLF2-only sample (*P<0.05, **P<0.005, ***P<0.0005, ****P<0.00005). Left: unpaired t-test (two-tailed) compared the primed sample to the naïve samples (ns, not significant; ****P<0.00005). (C) Number of alkaline phosphatase-positive colonies after 12 days of DOX-induced reprogramming. Mean and data points from three reprogramming experiments are shown. (D) Flow cytometry of cell-surface markers in established naïve and primed hPSCs, and in cultures after 12 days of DOX-induced reprogramming. Naïve hPSCs (CD24 negative; CD75 positive; SUSD2 positive) are in the upper right quadrant of the final gate. (E) Summary of flow cytometry data from D for two independent reprogramming experiments. (F) Stable cell-surface marker expression in established NANOGP1+KLF2 (isoform 1) cell lines propagated in the absence of DOX in naïve hPSC medium for seven passages. (G) Experimental design for naïve to primed hPSC capacitation with enforced NANOGP1 expression. (H) Cell-surface marker expression in cultures after 1 and 6 days of capacitation in the absence and presence of DOX. Primed hPSCs (CD24 positive; SSEA4 positive) are in the upper right quadrants. (I) Expression of marker genes in cultures at days 0, 1 and 6 of capacitation in the absence and presence of DOX. RT-qPCR values are relative to HMBS expression and normalised to day 0. Lower right: percentage of dead cells as measured using Trypan Blue staining. Mean and data points from three independent experiments are shown. An unpaired, two-tailed t-test compared the No DOX with +DOX samples at each timepoint (**P<0.005, ****P<0.00005; all other data are not significant).
Fig. 7.
NANOG is required to maintain naïve pluripotency, but NANOGP1 is dispensable. (A) DOX-inducible dCas9-KRAB CRISPRi to suppress NANOG and NANOGP1 transcription in naïve hPSCs. (B) CRISPRi of NANOG (left) and NANOGP1 (right) in naïve hPSCs. RT-qPCR values are relative to HMBS expression and normalised to day 4 samples. Mean and data points from three independent samples are shown. An unpaired t-test (two-tailed) for each ±DOX pair was performed (ns, not significant; ****P<0.00005). (C) Reduced NANOG levels after DOX-induced NANOG CRISPRi in naïve hPSCs. (D) Bright-field images of NANOG and NANOGP1 CRISPRi naïve hPSCs on day 0 and after 9 days of DOX treatment. Insets show representative colonies. Scale bars: 100 µm. (E) Expression of undifferentiated (left) and trophectoderm markers (right) in NANOG and NANOGP1 CRISPRi naïve hPSCs. Expression levels measured by RNA-seq are normalised to day 0 samples. Data are mean±s.d. from three independent samples. An unpaired t-test (two-tailed) with multiple testing correction was performed between each timepoint and the corresponding day 0 sample (ns, not significant; *P<0.05; **P<0.005; ***P<0.0005). (F) Expression in NANOG (upper) and NANOGP1 (lower) CRISPRi naïve hPSCs after DOX induction. Differentially expressed (DE) genes in blue [defined by a Wald test with Benjamini-Hochberg correction with a false discovery rate (FDR) of <0.05]. (G) RNA-seq data of NANOG CRISPRi naïve hPSCs with and without DOX over a 9-day timecourse (left) and also with NANOGP1 CRISPRi naïve hPSCs (right). Each data point is the average of three independent samples. (H) Left: transcriptomes of annotated human embryo lineages (Xiang et al., 2020; Rostovskaya et al., 2022). On these maps, the transcriptomes of NANOG (centre) and NANOGP1 (right) CRISPRi naïve hPSCs over a 9-day timecourse of DOX induction have been added. ICM, inner cell mass; TE, trophectoderm; CTB, cytotrophoblast; EVT, extravillous trophoblast; STB, syncytiotrophoblast; PreEPI, preimplantation epiblast; PostEPI, post-implantation epiblast; PostEPI-Gast, gastrulating stage; PostEPI-AME, post-implantation amniotic sac; AME, amniotic sac.
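The Benjamini-Hochberg correction named in panel F of Fig. 7 can be sketched in a few lines of Python. This is an illustration of the general procedure only, not the authors' analysis pipeline: the function names are hypothetical, the input is assumed to be a list of raw P values (for example, from per-gene Wald tests), and the 0.05 threshold is the FDR cutoff quoted in the legend.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, scaling each by n / rank and
    # enforcing that adjusted values never increase as rank decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def call_differential(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Flag tests whose adjusted P value falls below the FDR threshold."""
    return [q < fdr for q in benjamini_hochberg(p_values)]


if __name__ == "__main__":
    # Made-up P values, not data from the study.
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
    print(call_differential(raw))
```

In practice an established statistics package would normally be used for this step; the sketch only shows what the adjustment does to a list of P values.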

References

Aken, B. L., Achuthan, P., Akanni, W., Amode, M. R., Bernsdorff, F., Bhai, J., Billis, K., Carvalho-Silva, D., Cummins, C., Clapham, P. et al. (2016). Ensembl 2017. Nucleic Acids Res. 45, D635-D642. doi:10.1093/nar/gkw1104
Bailey, J. A., Gu, Z., Clark, R. A., Reinert, K., Samonte, R. V., Schwartz, S., Adams, M. D., Myers, E. W., Li, P. W. and Eichler, E. E. (2002). Recent segmental duplications in the human genome. Science 297, 1003-1007. doi:10.1126/science.1072047
Barson, G. and Griffiths, E. (2016). SeqTools: visual tools for manual analysis of sequence alignments. BMC Res. Notes 9, 39. doi:10.1186/s13104-016-1847-3
Booth, H. A. F. and Holland, P. W. H. (2004). Eleven daughters of NANOG. Genomics 84, 229-238. doi:10.1016/j.ygeno.2004.02.014
Bredenkamp, N., Stirparo, G. G., Nichols, J., Smith, A. and Guo, G. (2019a). The cell-surface marker sushi containing domain 2 facilitates establishment of human naive pluripotent stem cells. Stem Cell Rep. 12, 1212-1222. doi:10.1016/j.stemcr.2019.03.014