J Clin Invest. 2023 Mar 1;133(5):e159940.
doi: 10.1172/JCI159940.

CRISPR/Cas9 screen uncovers functional translation of cryptic lncRNA-encoded open reading frames in human cancer

Caishang Zheng et al. J Clin Invest. 2023.

Abstract

Emerging evidence suggests that cryptic translation within long noncoding RNAs (lncRNAs) may produce novel proteins with important developmental/physiological functions. However, the role of this cryptic translation in complex diseases (e.g., cancer) remains elusive. Here, we applied an integrative strategy combining ribosome profiling and CRISPR/Cas9 screening with large-scale analysis of molecular/clinical data for breast cancer (BC) and identified estrogen receptor α-positive (ER+) BC dependency on the cryptic ORFs encoded by lncRNA genes that were upregulated in luminal tumors. We confirmed the in vivo tumor-promoting function of an unannotated protein, GATA3-interacting cryptic protein (GT3-INCP) encoded by LINC00992, the expression of which was associated with poor prognosis in luminal tumors. GT3-INCP was upregulated by estrogen/ER and regulated estrogen-dependent cell growth. Mechanistically, GT3-INCP interacted with GATA3, a master transcription factor key to mammary gland development/BC cell proliferation, and coregulated a gene expression program that involved many BC susceptibility/risk genes and impacted estrogen response/cell proliferation. GT3-INCP/GATA3 bound to common cis regulatory elements and upregulated the expression of the tumor-promoting and estrogen-regulated BC susceptibility/risk genes MYB and PDZK1. Our study indicates that cryptic lncRNA-encoded proteins can be an important integrated component of the master transcriptional regulatory network driving aberrant transcription in cancer, and suggests that the "hidden" lncRNA-encoded proteome might be a new space for therapeutic target discovery.

Keywords: Breast cancer; Genetics; Noncoding RNAs; Oncology; Translation.

Figures

Figure 1
Figure 1. An integrative functional genomic strategy for identifying ER+ breast cancer dependency on cryptic lncRNA-encoded ORFs.
Workflow diagram depicting the integrative strategy for (A) predicting cryptic lncRNA-encoded ORFs from ribo-seq data and (B) identifying ER+ BC dependency on these predicted cryptic ORFs by using the CRISPR/Cas9 screen. (C) Scatter plot showing the statistical significance [–log10(P value)] and the magnitude of change [log2(fold change)] between day 21 and day 0, for the representative negatively selected sgRNAs of the corresponding ORFs. The blue dots correspond to the cryptic lncRNA-encoded ORFs with at least 1 significantly and negatively selected targeting sgRNA in the CRISPR/Cas9 screen and the red dots correspond to the ORFs meeting the criterion of the blue dots, whose host lncRNA expression was significantly upregulated in luminal A (LumA) BC in comparison with the corresponding normal breast tissues, based on TCGA RNA-seq data. Puro, puromycin.
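As an illustration of the two axes plotted in C, the following minimal Python sketch computes a log2(fold change) between day 21 and day 0 and a –log10(P value) for a single sgRNA. The function names, the pseudocount, and the 0.05 cutoff are assumptions made for this example; the legend does not state how read counts were normalized or which significance threshold was applied.

```python
import math


def log2_fold_change(count_day21, count_day0, pseudocount=1.0):
    """log2 ratio of day-21 to day-0 sgRNA abundance.

    The pseudocount is a hypothetical choice that avoids division by zero;
    it is not taken from the paper.
    """
    return math.log2((count_day21 + pseudocount) / (count_day0 + pseudocount))


def neg_log10_p(p_value):
    """Transform a P value to the -log10 scale used on the plot's axis."""
    return -math.log10(p_value)


def is_negatively_selected(lfc, p_value, alpha=0.05):
    """Depleted (negative fold change) and significant at the assumed alpha."""
    return lfc < 0 and p_value < alpha


# Example: an sgRNA whose abundance drops from 99 to 24 reads.
lfc = log2_fold_change(24, 99)   # log2(25 / 100)
score = neg_log10_p(0.001)
```

Under these assumptions, an sgRNA depleted between day 0 and day 21 has a negative log2(fold change), and a smaller P value places it higher on the significance axis.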
Figure 2
Figure 2. RNA-seq–based expression of the lncRNA genes encoding the screen hits in tumors and normal tissues from TCGA and validation of the hits encoded by LINC00992/GATA3-AS1.
(A) Box-and-whisker plot showing the expression of the corresponding lncRNA genes that encode the 28 screen hits of cryptic ORFs and were upregulated in luminal BC tumors with respect to the normal breast tissues, based on TCGA data. The bottom and top edges of the box represent the lower and upper quartiles. The median marks the midpoint of the data and is shown by the line dividing the box into 2 parts. The whiskers represent the values between the bottom 5% and 25% or between the top 25% and 5%. The outliers are shown as points. The growth of MCF7 cells transduced with negative control sgRNA (sgNC) or gene-specific sgRNAs targeting (B) ORF-LINC00992 or (C) ORF-GATA3-AS1 was monitored via CCK-8 assay. The OD450 for the water-soluble tetrazolium 8 (WST-8) product formazan was measured each day for 4 days via CCK-8 assay. Data in B and C are shown as mean ± SD (n = 3). **P < 0.01 by 1-way ANOVA with Dunnett’s multiple-comparison test. NS, not significant (P > 0.05).
Figure 3
Figure 3. LINC00992 encodes an unannotated protein.
(A) Ribo-seq count profile of 3 replicates across the LINC00992-encoded ORF. The predicted ORF based on GENCODE v22 annotation (ENST00000504107.1) is labeled “original predicted ORF” and the ORF with the extended region identified by 5′ RACE is labeled “extended new ORF.” (B) Schematic of LINC00992 gene and transcript (ENST00000504107.2, GENCODE v39) structure, and the information about its encoded protein GT3-INCP. (C) In the presence/absence of the native 5′-UTR, the wild-type FLAG-tagged GT3-INCP or the mutant one (AGG mutation in start codon) was stably expressed in MCF7, T47D, and ZR75-1 cells and protein expression was determined by Western blot with anti-FLAG and anti–GT3-INCP antibodies, where β-actin was used as a loading control. (D) Endogenous GT3-INCP protein expression was determined by Western blot in the indicated ER+ BC cell lines that were transduced with the negative control sgRNA (sgNC) or gene-specific sgRNAs, where β-actin served as a loading control. (E) The regions of GT3-INCP with the MS-identified peptides from IP of both ectopic FLAG-tagged and endogenous GT3-INCP in ER+ BC cells are shown in green and the corresponding sequences are shown in red. (F–I) The MS2 spectra of the GT3-INCP–derived tryptic peptides QERFPIILLSR and TDSFAGHLFSTAR detected by PRM-MS in the proteins coimmunoprecipitated with the anti-FLAG antibody from MCF7 (F and G) and T47D (H and I) cell lysates. Data in C and D are representative of 3 independent experiments.
Figure 4
Figure 4. GT3-INCP is upregulated in ER+ tumors and exerts a tumor-promoting function.
(A) qRT-PCR analysis of LINC00992 RNA expression (n = 3) and (B) Western blot analysis of GT3-INCP expression in the indicated breast epithelial cells and BC cell lines. For qRT-PCR analysis, GAPDH served as an internal control and all expression was relative to that in MCF10A cells. For Western blot analysis, β-actin served as an internal control. (C) Western blot analysis of GT3-INCP expression in ER+ luminal tumors (T) and the matched normal (N) breast tissue (n = 12). (D) The GT3-INCP protein level relative to that of β-actin was quantified by densitometry and plotted. (E) MCF7 and (F) ZR75-1 cells stably transduced with GT3-INCP that has a wild-type (ATG) or mutant (AGG) start codon or the empty vector (EV) control were transfected with the negative control siRNA (siNC) or LINC00992-targeting siRNAs. Cell growth was monitored for 4 days via CCK-8 assay. (G) Representative pictures of clonogenic growth and a bar graph quantifying the colonies formed by the MCF7 cells that were transduced with wild-type or mutant (AGG start codon) GT3-INCP or the EV control and were transfected with siNC or siRNAs targeting LINC00992. (H) Volume of the orthotopic tumors formed by the ZR75-1 cells that were stably transduced with 3 different combinations (n = 6 per combination): EV and shNC, EV and shLINC00992, or GT3-INCP and shLINC00992, was monitored as indicated in the Methods. Data are shown as mean ± SD; n = 3 (E–G) or n = 6 (H). **P < 0.01 by 2-tailed, paired Student’s t test (D) or 1-way ANOVA with Tukey’s multiple-comparison test (E–H). NS, not significant (P > 0.05). Data in B and C are representative of 3 independent experiments.
Figure 5
Figure 5. GT3-INCP interacts with GATA3.
(A) Silver staining showing the proteins enriched by co-IP of FLAG-tagged GT3-INCP (IP-GT3-INCP-Flag) compared with the negative control FLAG-tagged GFP (IP-GFP-Flag) in MCF7 cells. Whole-cell lysates of (B) HEK293FT cells transfected with HA-tagged GATA3 (HA-GATA3) and FLAG-tagged GT3-INCP (GT3-INCP-Flag), (C) MCF7 and T47D cells stably expressing GT3-INCP-Flag, or the chromatin-bound extracts of (D) MCF7/T47D cells or (E) cells stably expressing GT3-INCP-Flag were immunoprecipitated with the indicated antibodies, followed by immunoblot analysis. Rabbit or mouse IgG was used as a negative control. (F) Diagram illustrating different domains of the full-length GATA3 and 3 truncation mutants (S1–S3). (G) Lysates of HEK293FT cells cotransfected with HA-tagged wild-type or mutant GATA3 and GT3-INCP-Flag were immunoprecipitated with an anti-FLAG antibody or IgG and then analyzed by immunoblotting. (H) Diagram illustrating the deletion mutants generated from the full-length GT3-INCP (M1–M12). (I) Lysates of HEK293FT cells cotransfected with HA-tagged GATA3 and FLAG-tagged wild-type or mutant GT3-INCP were immunoprecipitated with an anti-FLAG antibody or IgG and then analyzed by immunoblotting. (J) MCF7 cells stably transduced with the empty vector control (EV) or the indicated ORFs were transfected with siNC or a LINC00992-targeting siRNA. Cell growth was monitored by CCK-8 assay. (K) MCF7 cells stably transduced with EV or the indicated ORFs were transfected with siNC or a LINC00992-targeting siRNA, and were then assessed for colony formation. Representative pictures of clonogenic growth and a bar graph quantifying the colonies formed by these cells are shown. Data in A–E, G, and I are representative of 3 independent experiments. Data in J and K are shown as mean ± SD (n = 3). **P < 0.01 by 1-way ANOVA with Tukey’s multiple-comparison test. NS, not significant (P > 0.05).
Figure 6
Figure 6. GT3-INCP and GATA3 coregulate a common gene expression program.
Gene set enrichment analysis (GSEA) with the Hallmark gene sets showing the top enriched gene sets downregulated following (A) GT3-INCP knockout or (B) GATA3 knockdown. (C) Bar plot showing the top enriched Gene Ontology biological process (BP) terms and KEGG pathways ranked by –log10(P value), based on the functional enrichment analysis of protein-coding genes co-upregulated by GT3-INCP and GATA3. (D) Venn diagram showing the overlap between the genes downregulated and upregulated by GT3-INCP and GATA3. (E) Ideogram showing the chromosomal location/cytoband of the BC risk genes that are co-upregulated (red) or co-downregulated (blue) by GT3-INCP and GATA3. Those that are transcriptional factors/epigenetic regulators are shown in yellow. (F) The genome-wide distribution of GT3-INCP binding sites identified from ChIP-seq data in MCF7 cells. (G) The sequence logo of the top motif (human GATA3 motif) identified by motif enrichment analysis (Supplemental Methods) from the GT3-INCP binding sites. (H) Venn diagram showing the overlap between the GT3-INCP binding sites and high-confidence common GATA3 binding sites that were shared among 3 GATA3 ChIP-seq data sets (GSE32465 and GSE128460) in MCF7 and T47D cells. Fisher’s exact test was used to assess the statistical significance of the Venn diagram overlap (D and H).
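The overlap test named for D and H can be sketched in Python as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The function name, the one-sided alternative, and the notion of a fixed "universe" of candidate genes or sites are assumptions made for this example; the legend does not specify the background set or the sidedness of the test.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """One-sided P value for observing at least `overlap` shared items.

    universe : total number of candidate items (assumed background set)
    size_a   : number of items in set A
    size_b   : number of items in set B
    overlap  : number of items observed in both A and B
    """
    max_k = min(size_a, size_b)
    total = comb(universe, size_b)
    # Sum hypergeometric probabilities for overlaps of `overlap` or more.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Toy example: two sets of 3 items drawn from 10, sharing all 3.
p = overlap_p_value(universe=10, size_a=3, size_b=3, overlap=3)
```

For genomic binding sites the background is not a simple count of items, so the choice of universe in a real analysis would need to be justified separately.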
Figure 7
Figure 7. GT3-INCP and GATA3 upregulate MYB and PDZK1 expression.
(A) Workflow for identifying the protein-coding genes co-upregulated by GT3-INCP and GATA3 and upregulated in luminal A BC compared with normal breast tissue. (B) Workflow for identifying the key targets that were potentially important for mediating the tumor-promoting function of the GT3-INCP/GATA3 axis in ER+ luminal BC. Venn diagram showing the overlap between the protein-coding genes that were co-upregulated by GT3-INCP/GATA3 and upregulated in luminal BC tumors, and the genes that harbored common GT3-INCP/GATA3 binding site(s). qRT-PCR analysis showing MYB and PDZK1 expression changes in MCF7 cells following (C) GATA3 knockdown or (D) GT3-INCP knockout. (E) Upon LINC00992 knockdown, the rescue effect of ectopic expression of the wild-type or mutant GT3-INCP (Del-M8 or AGG mutation in start codon), with respect to the empty vector control (EV), on MYB and PDZK1 mRNA expression was assessed by qRT-PCR in MCF7 cells. Fisher’s exact test was used to assess the statistical significance of the Venn diagram overlap (B). Data in C–E are shown as mean ± SD (n = 3). **P < 0.01 by 1-way ANOVA with Dunnett’s multiple-comparison test. NS, not significant (P > 0.05).
Figure 8
Figure 8. GT3-INCP facilitates the binding of GATA3 to the common cis regulatory elements of MYB and PDZK1.
(A) The GT3-INCP and GATA3 ChIP-seq signal and peaks around MYB and PDZK1 in MCF7 and T47D cells. ChIP-qPCR validation of (B) GT3-INCP and (C) GATA3 binding to the ChIP-seq peaks around MYB and PDZK1 with the indicated antibodies in MCF7 cells stably expressing the FLAG-tagged GT3-INCP. (D) ChIP-qPCR analysis for assessing the effect of GT3-INCP knockout on GATA3 occupancy on its binding sites around MYB and PDZK1 in MCF7 cells. (E) Upon LINC00992 knockdown, ChIP-qPCR analysis was performed to assess the rescue effect of ectopic expression of wild-type GT3-INCP or mutant GT3-INCP (Del-M8 or AGG), with respect to the EV control, on the GATA3 occupancy on its binding sites in MCF7 cells. Data in B–E are shown as mean ± SD (n = 3). **P < 0.01 by 2-tailed, unpaired Student’s t test (B and C) or 1-way ANOVA with Dunnett’s multiple-comparison test (D and E). NS, not significant (P > 0.05).
Figure 9
Figure 9. GT3-INCP is upregulated by estrogen/ER and is important for estrogen-dependent cell growth/gene expression.
(A) qRT-PCR analysis of LINC00992 RNA expression and (B) Western blot analysis of GT3-INCP expression in MCF7 and T47D cells upon β-estradiol (E2) treatment (30 nM) for the indicated time intervals. Western blot analysis of GT3-INCP expression (C) in MCF7/T47D cells transfected with ESR1-targeting endoribonuclease-prepared siRNA (esiRNA) or GFP-targeting esiRNA EGFP (esiEGFP), or (D) in the cells treated with 15 μM 4-hydroxytamoxifen (4-OHT; Sigma-Aldrich, SML1666) or vehicle (ethanol, ETOH) control. (E) qRT-PCR analysis of LINC00992 RNA expression in MCF7 cells that were transfected with the negative control siRNA (siNC) or LINC00992-targeting siRNAs (siLINC00992), after E2 (30 nM) or ETOH vehicle treatment. (F) After E2/ETOH treatment, the numbers of MCF7 cells treated with transfection reagent (control) or transfected with siNC or GATA3-targeting siRNAs (siGATA3) or siLINC00992 were counted every 24 hours for 72 hours. (G) After E2/ETOH treatment, the number of MCF7 cells that were treated with transfection reagent (control) or the MCF7 cells that were transduced with the empty vector (EV) or the indicated ORFs and transfected with siNC/siLINC00992 (siL) was monitored for 72 hours. (H) qRT-PCR analysis of MYB and PDZK1 RNA expression in MCF7 cells that were transfected with siNC, siGATA3, or siLINC00992, after E2/ETOH treatment. qRT-PCR analysis of (I) MYB and (J) PDZK1 RNA expression in the MCF7 cells that were treated with transfection reagent (control) or the MCF7 cells that were transduced with EV or the indicated ORFs and transfected with siNC/LINC00992-targeting siRNA (siL), after E2/ETOH treatment. Data in A and E–J are shown as mean ± SD (n = 3). **P < 0.01 by 1-way ANOVA with Dunnett’s multiple-comparison test. NS, not significant (P > 0.05). Data in B–D are representative of 3 independent experiments.

Comment in

  • Cryptic lncRNA-encoded ORFs: A hidden source of regulatory proteins doi: 10.1172/JCI167271
