Review
Cell Mol Life Sci. 2025 Jun 14;82(1):239.
doi: 10.1007/s00018-025-05770-1.

'Intelligent' proteins

Timir Tripathi et al. Cell Mol Life Sci. 2025.

Abstract

We present a view of protein molecules that challenges their traditional description as simple molecular machines and suggests instead that they exhibit a basic form of "intelligence". The idea draws on Integrated Information Theory (IIT), network theory, and allostery to explore how proteins process information, adapt to their environment, and even show memory-like behaviors. We define protein intelligence using IIT, focusing on how proteins integrate information (quantified by the IIT parameter Φ) and balance their core (stable, ordered regions) against their periphery (flexible, disordered regions). This balance allows proteins to remain stable while adapting to changes and operating in a critical state where order and disorder coexist. We summarize recent findings on conformational memory, allosteric regulation, protein intrinsic disorder, liquid-liquid phase separation, and critical transitions, and compare protein behavior to that of other complex systems such as ecosystems and neural networks. While our perspective offers a unified framework for understanding proteins, it also raises questions about applying intelligence concepts to molecular systems. We discuss how this understanding could advance protein engineering, drug design, and synthetic biology, while acknowledging the challenges of creating adaptive, "intelligent" proteins. This concept bridges the gap between mechanistic and systems-level views of proteins and offers a comprehensive understanding of their dynamic and adaptive nature. We have tried to redefine the traditionally metaphorical concept of "intelligence" in biochemistry as a measurable property, while establishing the material foundation of protein intelligence by identifying fundamental elements such as memory and learning in molecular systems.

Keywords: Allostery; Conformational memory; Core-periphery dynamics; Critical states; Integrated information theory; Intrinsically disordered proteins; Liquid-liquid phase separation; Post-translational modifications; Protein intelligence.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Fig. 1
Illustrative examples of proteins with varying levels of intrinsic disorder. Each panel displays an ensemble of structural models derived from solution NMR spectroscopy, capturing the conformational heterogeneity (or “fuzziness”) present in different regions of the protein. This structural fuzziness reflects the dynamic nature of protein conformations and is typically more pronounced in the periphery than in the core. In all cases, the core regions exhibit comparatively lower flexibility (have lower fuzziness), reinforcing the concept of a dynamic core-periphery organization. Notably, even highly disordered proteins, such as the chitin-binding domain from the beak of the jumbo squid Dosidicus gigas (PDB ID: 7BWO), retain a partially ordered core, illustrating that complete structural disorder is rare. The figure includes examples spanning a broad structural continuum, from mostly ordered to predominantly disordered proteins. These include the hemoglobin receptor HbpA from Corynebacterium diphtheriae (PDB ID: 9BCH; [69]), the TonB C-terminal domain from Helicobacter pylori (residues 179–285; PDB ID: 6SLY; [70]), the Williams-Beuren syndrome-associated methyltransferase WBSCR27 (PDB ID: 7QCC; [71]), the outer membrane protein AlkL (PDB ID: 6QAM; [72]), the N-terminal cytoplasmic domain of the membrane antisigma factor DdvA (PDB ID: 8RLZ; [73]), the MAX47 effector from Pyricularia oryzae (PDB ID: 7ZKD; [74]), the tRNA 2′-phosphotransferase from Runella slithyformis (PDB ID: 7KW8; [75]), the chitin-active lytic polysaccharide monooxygenase BlLPMO10A (PDB ID: 6TWE; [76]), the barnacle cement protein MrCP20 (PDB ID: 6LEK; [77]), Gaussia luciferase (PDB ID: 7D2O; [78]), the antimicrobial peptide LaIT2 (PDB ID: 7WKF; [79]), and the aforementioned chitin-binding domain (residues 163–223) from D. gigas (PDB ID: 7BWO; [80]).
Together, these structures illustrate the wide spectrum of disorder in proteins and highlight the dynamic interplay between ordered cores and flexible peripheral regions.
Fig. 2
P vs. z plot for (a) a single protein and (b) 1420 proteins. Notably, the plots exhibit a striking similarity across all proteins, highlighting a conserved pattern. The dataset used in this study was obtained from the protein-culling server PISCES. The authors selected a subset of protein structures that share less than 20% sequence identity and have a resolution better than 2.0 Å. Only monomeric entries (single chains) were included. An initial set of 1757 structures was downloaded and subsequently filtered to exclude entries with missing residues, resulting in a final dataset of 1420 high-quality structures. The module detection algorithm was applied to this curated set, and the results are accessible via a dedicated web server (http://gandivaweb.iab.keio.ac.jp). A complete list of the PDB IDs included in the analysis is provided in the Supporting Information of the paper [89]. Figure adapted with permission from Krishnan et al. [89].
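The curation steps described in the Fig. 2 caption can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Entry` record, its field names, and the toy identifiers are hypothetical, and the culling to less than 20% sequence identity is assumed to have been done upstream by PISCES.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical record for one candidate structure; the field names are
# illustrative and do not correspond to any real PISCES or PDB schema.
@dataclass(frozen=True)
class Entry:
    entry_id: str               # toy identifier, not a real PDB ID
    resolution: float           # resolution in angstroms
    n_chains: int               # number of chains in the entry
    has_missing_residues: bool  # True if any residue is unresolved

# "Better than 2.0 Å" is read here as strictly below 2.0 (an assumption).
MAX_RESOLUTION = 2.0

def passes_filters(entry: Entry) -> bool:
    """Apply the caption's criteria: resolution, monomer, no missing residues."""
    return (
        entry.resolution < MAX_RESOLUTION
        and entry.n_chains == 1
        and not entry.has_missing_residues
    )

def curate(entries: Iterable[Entry]) -> List[Entry]:
    """Return only the entries that satisfy every criterion."""
    return [e for e in entries if passes_filters(e)]

# Toy data (entirely made up) exercising each rejection rule once.
toy_entries = [
    Entry("TOY1", 1.5, 1, False),  # kept
    Entry("TOY2", 2.0, 1, False),  # rejected: resolution not better than 2.0
    Entry("TOY3", 1.8, 2, False),  # rejected: not monomeric
    Entry("TOY4", 1.2, 1, True),   # rejected: missing residues
]

print([e.entry_id for e in curate(toy_entries)])  # prints ['TOY1']
```

In the study itself, the analogous filtering reduced the initial 1757 downloaded structures to the final set of 1420.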
Fig. 3
Structure-function continuum of proteins. A single protein can adopt multiple combinations of ordered and disordered states, categorized as: (1) Mosaic architecture, comprising folded (foldons) and non-folded (non-foldons) regions; (2) Global semi-folded state, containing semi-structured regions (semi-foldons); (3) Inducible foldons, folding upon binding; (4) Morphing inducible foldons, adopting different folds with different partners; and (5) Unfoldons, requiring unfolding for activation. These dynamic states create a structural and functional continuum and enable proteins to perform diverse roles over time.
Fig. 4
Structural model of a kinesin protein. The ribbon diagram depicts the 3D structural model of a kinesin motor protein (containing 1815 amino acid residues) and its different functional regions. The model was generated using AlphaFold 2.0. The structure contains the motor domain (responsible for ATP hydrolysis and microtubule binding), the neck linker (involved in force generation and directional movement), and the coiled-coil stalk (mediating dimerization and cargo binding) [58]. Notably, there are no clear boundaries (and no macroscopic differences in structure) between these functional regions, which reflects the integrated nature of protein function. This lack of discrete partitioning contrasts sharply with the modular design of synthetic machines and underscores the unique principles of biological engineering.
Fig. 5
Schematic of a Ducati 250 GT engine. The diagram illustrates the internal structure of a 1966 Ducati 250 GT race motorbike engine, a classic example of synthetic mechanical engineering. Key components, such as the pistons, crankshaft, and valves, are clearly delineated to highlight their distinct functionalities. Unlike the kinesin molecule (Fig. 4), the engine exhibits a modular and compartmentalized design, with well-defined boundaries between functional parts. The elements of the engine interact while maintaining their independent and unique forms; the structure of the engine remains invariant during operation. It is not by chance that each component has a distinct label, as there is no ambiguity regarding the borders between different parts. This contrast emphasizes the fundamental differences between biological and synthetic systems, even when both are designed to achieve controlled and regular motion. The rigid, predefined architecture of the engine stands in stark contrast to the dynamic and integrated nature of protein-based molecular machines. Figure adapted with permission from De Paola et al. [81].

References

    1. Alberts B, Heald R, Johnson A, Morgan D, Raff M, Roberts K, Walter P (2022) Molecular biology of the cell: seventh international student edition with registration card. W.W. Norton & Company
    2. Zhang X, Zhang J, Wang Y, Wang M, Tang M, Lin Y, Liu Q (2022) Epigenetic modifications and neurodegenerative disorders: A biochemical perspective. ACS Chem Neurosci 13(2):177–184. 10.1021/acschemneuro.1c00701
    3. Schmaier AA, Zou Z, Kazlauskas A, Emert-Sedlak L, Fong KP, Neeves KB, Maloney SF, Diamond SL, Kunapuli SP, Ware J, Brass LF, Smithgall TE, Saksela K, Kahn ML (2009) Molecular priming of Lyn by GPVI enables an immune receptor to adopt a hemostatic role. Proc Natl Acad Sci U S A 106(50):21167–21172. 10.1073/pnas.0906436106
    4. Xu D, Phillips JC, Schulten K (1996) Protein response to external electric fields: relaxation, hysteresis, and echo. J Phys Chem 100(29). 10.1021/jp960076a
    5. Balduzzi D, Tononi G (2009) Qualia: The geometry of integrated information. PLoS Comput Biol 5(8):e1000462. 10.1371/journal.pcbi.1000462
