Review
Curr Genomics. 2020 Nov;21(7):531-535.
doi: 10.2174/1389202921999200611155418.

Hypothetical Proteins as Predecessors of Long Non-coding RNAs

Girik Malik et al. Curr Genomics. 2020 Nov.

Abstract

Hypothetical proteins [HPs] are transcripts predicted to be expressed in an organism, but for which no evidence exists in gene banks. Long non-coding RNAs [lncRNAs], on the other hand, are transcripts longer than 200 bases that may lie in the 5' UTR or intergenic regions of genes. With known-unknown [KU] regions of genomes rapidly accumulating in gene banks, there is a need to understand the role of open reading frames in the context of annotation. In this commentary, we emphasize that HPs could indeed be the predecessors of lncRNAs.

Keywords: Hypothetical proteins; annotation; aptamers; functional genomics; lncRNA; transcripts.

Figures

Fig. (1)
The difference between a known-known and a known-unknown protein. Specifically, characteristic domains such as domains of unknown function (DUFs), unrelated ORFs, or KIAA domains are associated with hypothetical proteins and are usually present in the C-terminal region of the protein. We show a classic example of how CAC92745, an HP, could be annotated as a lncRNA, viz. LINC00208. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
Fig. (2)
Osprey visualization showing 314 nodes [genes] and 224 edges [experimental system], with each node's colour indicating its GO process, viz. protein transport [pink], cell cycle [green], stress response [radical red], metabolism [royal blue], transport [Egyptian blue], signal transduction [grey], DNA metabolism [tortilla], transcription [azure], biogenesis [purple], RNA processing [lime], protein degradation [white], protein amino acid phosphorylation [brown], DNA damage response [yellow], autophagy [red] and function unknown [black]. (A higher resolution / colour version of this figure is available in the electronic copy of the article).
